A quick-service restaurant chain is running its AI models on machines inside its stores to localize delivery logistics. At the same time, a global pharma company is training its machine learning models on premises, using servers it manages itself.
Cloud computing isn’t going anywhere, but some companies that use machine learning models and the tech vendors supplying the platforms to manage them say machine learning is having an on-premises moment. For many years, cloud providers have argued that the computing infrastructure machine learning requires would be far too expensive and cumbersome for companies to stand up on their own, but the field is maturing.
“We still have a ton of customers who want to go on a cloud migration, but we're definitely now seeing, at least in the past year or so, a lot more customers who want to repatriate workloads back onto on-premise because of cost,” said Thomas Robinson, vice president of strategic partnerships and corporate development at MLOps platform company Domino Data Lab. Cost is a big driver, Robinson said, noting the hefty price of running computationally intensive deep-learning models on cloud servers, such as GPT-3 or other large transformer-based language models that businesses today use in their conversational AI tools and chatbots.
The on-prem trend is growing among big box and grocery retailers that need to feed product, distribution and store-specific data into large machine learning models for inventory predictions, said Vijay Raghavendra, chief technology officer at SymphonyAI, which works with grocery chain Albertsons. Raghavendra left Walmart in 2020 after seven years with the company in senior engineering and merchant technology roles.
“This happened after my time at Walmart. They went from having everything on-prem, to everything in the cloud when I was there. And now I think there's more of an equilibrium where they are now investing again in their hybrid infrastructure — on-prem infrastructure combined with the cloud,” Raghavendra told Protocol. “If you have the capability, it may make sense to stand up your own [co-location data center] and run those workloads in your own colo, because the costs of running it in the cloud does get quite expensive at certain scale.”
Some companies are considering on-prem setups in the model building phase, when ML and deep-learning models are trained before they are released to operate in the wild. That process requires compute-heavy tuning and testing of large numbers of parameters or combinations of different model types and inputs using terabytes or petabytes of data.
“The high cost of training is giving people some challenges,” said Danny Lange, vice president of AI and machine learning at gaming and automotive AI company Unity Technologies. The cost of training can run into millions of dollars, Lange said.
“It’s a cost that a lot of companies are now looking at saying, can I bring my training in-house so that I have more control on the cost of training, because if you let engineers train on a bank of GPUs in a public cloud service, it can get very expensive, very quickly.”
Companies shifting compute and data to their own physical servers located inside owned or leased co-located data centers tend to be on the cutting edge of AI or deep-learning use, Robinson said. “[They] are now saying, ‘Maybe I need to have a strategy where I can burst to the cloud for appropriate stuff. I can do, maybe, some initial research, but I can also attach an on-prem workload.’”
One pharmaceutical customer Domino Data Lab works with has purchased two Nvidia server clusters to manage compute-heavy image recognition models on-prem, even though it has publicized a cloud-centric strategy, Robinson said.
High cost? How about bad broadband
For some companies, a preference for running their own hardware is not just about training massive deep-learning models. Victor Thu, president at Datatron, said retailers or fast-food chains with area-specific machine learning models, used to localize delivery logistics or optimize store inventory, prefer to run ML inference workloads on their own servers inside their stores instead of passing data back and forth to run the models in the cloud.
Some customers “don’t want it in the cloud at all,” Thu told Protocol. “Retail behavior in San Francisco can be very different from Los Angeles and San Diego for example,” he said, noting that Datatron has witnessed customers moving some ML operations to their own machines, especially those retailers with poor internet connectivity in certain locations.
Model latency is a more commonly recognized reason to shift away from the cloud. Once a model is deployed, the amount of time it takes for it to pass data back and forth between cloud servers is a common factor in deciding to go in-house. Some companies also avoid the cloud to make sure models respond rapidly to fresh data when operating in a mobile device or inside a semi-autonomous vehicle.
“Often the decision to operationalize a model on-prem or in the cloud has largely been a question of latency and security dictated by where the data is being generated or where the model results are being consumed,” Robinson said.
Over the years, cloud providers have overcome early perceptions that their services were not secure enough for some customers, particularly those from highly regulated industries. As big-name companies such as Capital One have embraced the cloud, data security concerns have less currency nowadays.
Still, data privacy and security do compel some companies to use on-prem systems. AiCure uses a hybrid approach in managing data and machine learning models for its app used by patients in clinical trials, said the company’s CEO Ed Ikeguchi. AiCure keeps processes involving sensitive, personally identifiable information (PII) under its own control.
“We do much of our PII-type work locally,” Ikeguchi said. However, he said, when the company can use aggregated and anonymized data, “then all of the abstracted data will work with cloud.”
Ikeguchi added, “Some of these cloud providers do have excellent infrastructure to support private data. That said, we do take a lot of precautions on our end as well, in terms of what ends up in the cloud.”
“We have customers that are very security conscious,” said Biren Fondekar, vice president of customer experience and digital strategy at NetApp, whose customers from highly regulated financial services and health care industries run NetApp’s AI software in their own private data centers.
Big cloud responds
Even cloud giants are responding to the trend by subtly pushing their on-prem products for machine learning. AWS promoted its Outposts infrastructure for machine learning last year in a blog post, citing decreased latency and high data volume as two key reasons customers want to run ML outside the cloud.
“One of the challenges customers are facing with performing inference in the cloud is the lack of real-time inference and/or security requirements preventing user data to be sent or stored in the cloud,” wrote Josh Coen, AWS senior solutions architect, and Mani Khanuja, artificial intelligence and machine learning specialist at AWS.
In October, Google Cloud announced Google Distributed Cloud Edge to accommodate customer concerns about region-specific compliance, data sovereignty, low latency and local data processing.
Microsoft Azure has introduced products including its Azure Arc services to help customers take a hybrid approach to managing machine learning by running ML models in data centers or at the edge, and validating and debugging models on local machines, then deploying them in the cloud.
Snowflake, which is integrated with Domino Data Lab’s MLOps platform, is mulling more on-prem tools for customers, said Harsha Kapre, senior product manager at Snowflake. “I know we're thinking about it actively,” he told Protocol. Snowflake said in July that it would offer its external table data lake architecture — which can be used for machine learning data preparation — for use by customers on their own hardware.
“I think in the early days, your data had to be in Snowflake. Now, if you start to look at it, your data doesn't actually have to be technically [in Snowflake],” Kapre said. “I think it’s probably a little early” to say more, he added.
As companies integrate AI across their businesses, more and more people in an enterprise are using machine learning models, which can run up costs if they do it in the cloud, said Robinson. “Some of these models are now used by applications with so many users that the compute required skyrockets and it now becomes an economic necessity to run them on-prem,” he said.
But some say the on-prem promise has hidden costs.
“The cloud providers are really, really good at purchasing equipment and running it economically, so you are competing with people who really know how to run efficiently. If you want to bring your training in-house, it requires a lot of additional cost and expertise to do,” Lange said.
Bob Friday, chief AI officer at communications and AI network company Juniper Networks, agreed.
“It’s almost always cheaper to leave it at Google, AWS or Microsoft if you can,” Friday said, adding that if a company doesn’t have an edge use-case requiring split-second decision-making in a semi-autonomous vehicle, or handling large streaming video files, on-prem doesn’t make sense.
But cost savings are there for enterprises with large AI initiatives, Robinson said. Companies with smaller AI operations may not realize cost benefits by going in-house, but “at scale, cloud infrastructure, particularly for GPUs and other AI-optimized hardware, is much more expensive,” he said, alluding to Domino Data Lab’s pharmaceutical client that invested in Nvidia clusters “because the cost and availability of GPUs was not palatable on AWS alone.”
Robinson added, “Another thing to take into consideration is that AI-accelerated hardware is evolving very rapidly and cloud vendors have been slow in making it available to users.”
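The scale argument on both sides comes down to a break-even point: on-prem hardware carries a large upfront cost but a lower running cost per GPU-hour, so it only pays off above a certain volume of use. The Python sketch below illustrates that arithmetic. Every figure in it is a hypothetical placeholder, not a price quoted by any vendor or source in this story, and the function names are invented for illustration. It also ignores the staffing and expertise costs Lange describes.

```python
def break_even_gpu_hours(onprem_upfront, onprem_hourly, cloud_hourly):
    """GPU-hours at which cumulative on-prem cost equals cumulative cloud cost.

    Returns None when on-prem running costs are not lower than cloud rates,
    in which case the upfront purchase is never recovered.
    """
    saving_per_hour = cloud_hourly - onprem_hourly
    if saving_per_hour <= 0:
        return None
    return onprem_upfront / saving_per_hour


def cheaper_option(gpu_hours, onprem_upfront, onprem_hourly, cloud_hourly):
    """Compare total cost of the two options for a given number of GPU-hours."""
    cloud_total = cloud_hourly * gpu_hours
    onprem_total = onprem_upfront + onprem_hourly * gpu_hours
    return "on-prem" if onprem_total < cloud_total else "cloud"


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
UPFRONT = 200_000.0    # purchase price of an on-prem GPU cluster
ONPREM_HOURLY = 0.50   # power, cooling and colo space per GPU-hour
CLOUD_HOURLY = 3.00    # rented cloud GPU price per GPU-hour

if __name__ == "__main__":
    hours = break_even_gpu_hours(UPFRONT, ONPREM_HOURLY, CLOUD_HOURLY)
    print(f"Break-even at {hours:,.0f} GPU-hours")
    for usage in (10_000, 200_000):
        choice = cheaper_option(usage, UPFRONT, ONPREM_HOURLY, CLOUD_HOURLY)
        print(f"{usage:,} GPU-hours: {choice} is cheaper")
```

Under these made-up numbers, a team using 10,000 GPU-hours is better off renting, while one using 200,000 GPU-hours recovers the hardware purchase. That is the shape of the split Robinson describes between smaller AI operations and those running at scale.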
In the end, like the shift toward multiple clouds and hybrid cloud strategies, the machine learning transition to incorporate on-prem infrastructure could be a sign of sophistication among businesses that have moved beyond merely dipping their toes in AI.
“There's always been a bit of a pendulum effect going on,” Lange said. “Everybody goes to the cloud, then they sort of try to move back a bit. I think it's about finding the right balance.”