It matters more than you might think

Generative AI is capturing headlines and generating buzz from seemingly all corners, thanks to tools like ChatGPT, which have the potential to transform vast swaths of white-collar work across a variety of knowledge industries.

Not every organization is equally well positioned to capitalize on these developments, however: Companies with on-prem tech products might be left out of the generative AI revolution.

It’s increasingly clear that if companies hope to leverage this powerful new technology, they will need to be on cloud platforms. From technical constraints to financial considerations to the sheer pace of innovation, the evidence overwhelmingly points in one direction: the cloud.

Taming technical complexity

One of the biggest reasons generative AI lends itself to the cloud is the complexity of the AI software itself and the supporting software stack.

The transformer architecture that underpins generative large language models such as ChatGPT is fairly “young” – its development traces back to a research paper Google released in 2017. Because the technology is relatively new, it is still in a state of flux – unlike, say, database technology, which has been around since the 1960s and is fairly “solid” at this point.

It’s not just the science behind the large language models themselves that is complex – so is the stack required to run them. Generative AI is built on technologies like Python, Rust, Linux, and Docker containers – none of which are defaults in most corporate IT environments. Add it all together, and you have a very young, complex software stack with lots of moving parts, running on an ecosystem that is unfamiliar to large parts of the corporate world.

If this were the only reason for running generative AI in the cloud rather than on prem, it would likely be reason enough all on its own. But aside from IT complexity, there’s also the issue of cost…

What would a cost benefit analysis reveal?

Typically, the use of AI is a “burst” scenario. Most of the time, the service is lying idle, but then a request comes in – for example, a user initiates a ChatGPT-type back-and-forth interaction – and at that point, significant compute is required.

If a company were to run this generative AI on prem, it would have compute power sitting idle for large stretches of time. From a cost perspective, the cloud model of paying for compute only when it’s needed makes more sense.
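The burst argument above comes down to simple arithmetic: an on-prem GPU costs the same whether it is busy or idle, while cloud compute is billed only for busy hours. A minimal sketch of that comparison, using entirely hypothetical placeholder prices (not any vendor’s actual rates):

```python
# Back-of-envelope comparison of on-prem vs. cloud GPU costs for
# "bursty" AI workloads. All figures are hypothetical placeholders,
# not real vendor pricing.

ONPREM_MONTHLY_COST = 3000.0  # hypothetical: amortized hardware + power + ops, per GPU
CLOUD_HOURLY_RATE = 5.0       # hypothetical: on-demand GPU instance, per hour
HOURS_PER_MONTH = 730

def monthly_cloud_cost(busy_hours: float) -> float:
    """Cloud cost: you pay only for the hours the GPU is actually busy."""
    return busy_hours * CLOUD_HOURLY_RATE

def breakeven_utilization() -> float:
    """Fraction of the month a GPU must be busy before on-prem becomes cheaper."""
    return ONPREM_MONTHLY_COST / (CLOUD_HOURLY_RATE * HOURS_PER_MONTH)

# A bursty workload that keeps the GPU busy 10% of the time:
busy = 0.10 * HOURS_PER_MONTH  # 73 hours
print(f"cloud:   ${monthly_cloud_cost(busy):,.0f}/month")
print(f"on-prem: ${ONPREM_MONTHLY_COST:,.0f}/month")
print(f"break-even utilization: {breakeven_utilization():.0%}")
```

Under these made-up numbers, the bursty workload costs a few hundred dollars a month in the cloud versus thousands on prem, and on-prem only wins once utilization climbs past roughly 80% – exactly the sustained load most AI services don’t have.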

There’s also a cost benefit analysis to be made around hardware. The latest forms of AI require specialized hardware to run – notably, GPUs (graphics processing units). Although they originally came from the world of gaming, GPUs are so good at performing massive parallel matrix computations that they’re also used for applications like generative AI.

These GPUs aren’t just expensive, they’re also very specialized in their functionality – there’s not much a firm can use them for aside from maybe gaming or crypto mining, neither of which is likely part of most organizations’ core mission. While it is of course possible to purchase this specialized hardware and install it in servers on prem, it’s a lot more sensible to pay for the minutes or hours they’re actually needed via the cloud.

On the subject of costs, it’s also hard to avoid the costs of generative AI on the environment. As alluded to earlier, AI is very compute-intensive, and all those GPUs come with significant energy consumption and cooling requirements.

Cloud is a way to mitigate this cost. To be clear, AI will still consume a lot of energy whether it’s run on prem or in the cloud. But cloud providers like Microsoft Azure have committed to 100% renewable energy by 2025 and to being water positive by 2030 – and Google and Amazon have similar ambitions – which is another finger on the scale for running generative AI in the cloud.

Eliminating friction, easing concerns

In addition to removing the technical and financial burdens of trying to support generative AI on prem, running it in the cloud leads to a faster pace of innovation. Essentially, all a company has to do is plug in their credit card details and start using it. Rather than spending 90% of their time on the infrastructure to keep it up and running, and 10% of their time figuring out how to benefit from AI for competitive advantage, they can flip the model – and start innovating.

Other hurdles that have traditionally gotten in the way of people moving to the cloud include security concerns about the safety of putting data in the cloud, or unease around data jurisdiction requirements or data leaving a specific geographic area.

These concerns have all been addressed by the major cloud providers. On the security front, it’s well understood at this point that putting your data in the cloud is incredibly safe, due to the tremendous amount of dedicated resources that the major cloud providers are able to put towards security. They also provide fine-grained control over where data resides, even when a service like AI processing is taking place, further eliminating concerns that have historically prevented companies from embracing cloud.

Got a spare $4.5 million lying around?

It’s estimated that a single training run of the GPT-3 model – where the large language model is fed vast amounts of data in order to build the worldview it relies on to generate answers – cost OpenAI around $4.5 million, putting the undertaking out of reach for all but the largest or best-resourced enterprises. (Note that this eye-watering figure is just the compute cost of training; it doesn’t include the cost of the engineers who develop the model.)

It’s true that once you have those models, they can be distilled into smaller models that are cheaper to train and require a lot less compute power – but that initial training run is a huge hurdle that few will be able to clear.

For that reason, as well as the myriad others detailed above, all signs indicate that for the near to mid term, the cloud will be the best way for companies to consume generative AI and capture the benefits it offers.
