Cloud customers pay an average of three times more than they should for compute on AWS, Microsoft Azure and Google Cloud, according to Cast AI. Helping them manage those costs is turning into a business in itself.
The startup specializes in Kubernetes automation, cost optimization and cost reporting for cloud-native applications. Its platform uses artificial intelligence to identify which compute resources specific Kubernetes workloads need and automatically selects the best combinations, configuring CPUs and memory to prevent over-provisioning. It continuously adds or removes resources as needed, ensuring customers aren’t overspending without compromising workload availability or performance, according to the company.
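The kind of rightsizing the company describes can be pictured as comparing what a workload actually uses against what it requests, then recommending smaller requests with a safety margin. Here is a minimal, purely illustrative sketch of that idea; the workload names, numbers and 20% headroom figure are assumptions for the example, not Cast AI’s engine:

```python
# Illustrative only: a toy rightsizing pass that compares what a workload
# requests against its observed peak usage, then suggests smaller requests
# with a safety margin. All numbers and the 20% headroom are assumptions.

HEADROOM = 1.2  # keep 20% above observed peak usage

workloads = [
    # name, requested (CPU cores, GiB), observed peak usage (CPU cores, GiB)
    ("checkout-api", (4.0, 8.0), (1.1, 2.5)),
    ("event-ingest", (8.0, 16.0), (5.2, 9.0)),
]

for name, (req_cpu, req_mem), (use_cpu, use_mem) in workloads:
    rec_cpu = round(use_cpu * HEADROOM, 1)
    rec_mem = round(use_mem * HEADROOM, 1)
    if rec_cpu < req_cpu or rec_mem < req_mem:
        print(f"{name}: requests {req_cpu} CPU / {req_mem} GiB, "
              f"peak usage {use_cpu} CPU / {use_mem} GiB -> "
              f"recommend {rec_cpu} CPU / {rec_mem} GiB")
```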
“It's impossible to do this exercise as a human,” co-founder and Chief Product Officer Laurent Gil said. “We decomplexify capabilities. We make Kubernetes or containers serverless by saying we're going to take care of the servers, and we will make the servers cost-efficient.”
Cast AI was born out of its co-founders’ frustrations with their cloud bills while they operated a prior startup: Zenedge, a cloud-based, AI-driven cybersecurity startup acquired by Oracle in 2018.
“In the beginning of that company, I would spend about $1,000 to $2,000 a month on AWS,” Gil said. “Three years later … that became $2 million dollars — by far the highest cost of the company, and we were very, very frustrated. We had a nice ride with customers, but every time we would add a client, our AWS bill would go through the roof.”
AWS’ answer was for Zenedge to prepay for three years to cut its cloud bill by 40%, but Zenedge didn’t want to be locked in, according to Gil. With Cast AI, the co-founders built the spending-management product they wished they’d had at the time.
The overspending tax
Companies using Cast AI’s services can reduce their cloud compute spending by 65% on average, according to Gil. Those services work with Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS) and Kubernetes Operations (kOps) on AWS.
“The engine is instantly going to understand what applications you have … and how much compute and memory they currently consume, and how much they cost to run based on the machine that these applications are installed on,” Gil told Protocol. “Then we are going to give you another number, which is, ‘Hey, considering what this application does and uses, this should really be the cost.’”
While there are no big price differences among the Big Three cloud providers, Gil said, within each cloud there are cost differences when it comes to processors.
“Most … are cheaper with AMD than they are with Intel,” Gil said. “That makes our engine use more AMD sometimes for compute-intensive [workloads]. But the machine has been trained to know this, so we will always select the lowest-cost option.”
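The “always select the lowest-cost option” behavior Gil describes can be illustrated with a simple selection rule: among candidate instance types that satisfy a workload’s CPU and memory needs, pick the cheapest. The instance names and hourly prices below are invented for the sketch and don’t come from any provider’s price list:

```python
# Illustrative only: choose the cheapest instance type that fits a workload.
# Instance names and hourly prices are made up for the example.

candidates = [
    # (instance type, vCPUs, GiB memory, $/hour)
    ("intel-general-4x", 16, 64, 0.77),
    ("amd-general-4x",   16, 64, 0.69),
    ("intel-compute-4x", 16, 32, 0.68),
]

need_cpu, need_mem = 12, 48  # hypothetical workload requirements

fitting = [c for c in candidates if c[1] >= need_cpu and c[2] >= need_mem]
cheapest = min(fitting, key=lambda c: c[3])
print(f"Selected {cheapest[0]} at ${cheapest[3]}/hour")
```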
Cast AI is currently optimizing about 1,000 applications for hundreds of customers, according to Gil.
“One thing that was very surprising to us … is that the average cost-savings we provide to anybody using us … is 65%,” he said. “Sixty-five percent means you are spending three times more than you should on Amazon. So if you think of this the other way, you say, 'Well, out of $100 of your cloud bill, $66 of this is for Mr. Bezos, because it does nothing for you … and $33 is what you really use.'”
Cast AI says that, on average, its customers weren’t using 37% of the CPUs they were paying for. They could save an additional 7% by swapping one type of virtual machine (VM) for another, and another 22% by moving workloads to discounted spot instances.
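Treated as roughly additive, which is a simplification, those three components account for the headline savings figure. A quick back-of-the-envelope check on a hypothetical $100 bill:

```python
# Back-of-the-envelope math on the averages Cast AI cites, treating the three
# components as simply additive (a simplification) on a hypothetical $100 bill.
bill = 100.00
unused_cpu = 0.37 * bill   # CPUs paid for but never used
better_vms = 0.07 * bill   # switching to cheaper VM types
spot       = 0.22 * bill   # moving workloads to discounted spot instances

savings = unused_cpu + better_vms + spot
print(f"~${savings:.0f} of every ${bill:.0f} is recoverable")  # ~$66 of $100
```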
“We're not changing anything [with] our customer environment,” Gil said. “It's like … how you defragment disk drives. We defragment your application by moving the boxes around so that you can fill the machine more [by] using all the empty space.”
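The “defragmentation” Gil describes is essentially a bin-packing problem: rearranging workloads so that fewer nodes sit partly empty. A toy first-fit sketch of that idea, with made-up pod sizes and node capacity, not Cast AI’s actual scheduler:

```python
# Illustrative only: first-fit packing of pod CPU requests onto nodes, the
# rough shape of the "moving boxes around" idea. Not a real scheduler.

NODE_CPU = 8.0  # hypothetical node size, in cores

pods = [3.0, 2.5, 4.0, 1.0, 2.0, 3.5]  # hypothetical pod CPU requests

nodes = []  # each entry is the free CPU left on that node
for pod in pods:
    for i, free in enumerate(nodes):
        if free >= pod:
            nodes[i] = free - pod
            break
    else:
        nodes.append(NODE_CPU - pod)  # no room anywhere: open a new node

print(f"{len(nodes)} nodes used for {sum(pods)} cores of requests")
```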
This is the way
It’s a task that’s impossible for developers to tackle on their own, and the cloud providers don’t make it easy, according to Gil.
One of Cast AI’s customers — an adtech company with a large consumer app in India — saw 84% in compute savings after turning on its engine, according to Gil. Another publicly traded company, a SaaS business, saw its cloud compute costs reduced by 72%.
Branch, a late-stage startup specializing in deep linking, mobile analytics and attribution, is a Cast AI customer that sees about 25 billion events per day and is running all of its compute inside Kubernetes clusters.
“Our cloud hosting needs to be very efficient to be able to process all that data in real time to be able to make real-time decisions … as well as to be able to aggregate and show all of the statistics inside of the analytics,” said Mark Weiler, Branch’s head of engineering.
Branch, which uses AWS as its preferred cloud provider, started a proof of concept with Cast AI in May 2021 and deployed it across all of its clusters within two months.
“They have saved us on the order of a couple million dollars per year on our AWS cloud bill, which is one of the highest ROI cost-savings projects that we've done in the past five or six years,” Weiler said. “The promise was they would allow us to dynamically determine what sorts of optimal spot instances to use based on our workloads without incurring any negative effects on our uptime SLAs [service-level agreements] when Amazon revokes those instances. They came through.”
“Manually configuring all that, keeping that up to date, having all the fallback scenarios set up and up to date, is extremely complicated to do on your own. It's begging for an automated solution that can monitor the actual spot market and your instances and determine what the optimal reallocation would be,” Weiler said.
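The shape of the automation Weiler describes looks roughly like this: watch spot prices, pick the cheapest available pool, and fall back to on-demand when capacity is revoked. The pool names, prices and revocation signal below are hypothetical, not any provider’s real API:

```python
# Illustrative only: pick the cheapest spot pool for a workload and fall back
# to on-demand when spot capacity is revoked. Prices, pool names, and the
# revocation signal are all hypothetical; this is not a real cloud API.

spot_pools = {"pool-a": 0.21, "pool-b": 0.19, "pool-c": 0.24}  # $/hour
on_demand_price = 0.68                                          # $/hour

def choose_capacity(revoked_pools):
    """Return (pool, price): cheapest available spot pool, else on-demand."""
    available = {pool: cost for pool, cost in spot_pools.items()
                 if pool not in revoked_pools}
    if available:
        pool = min(available, key=available.get)
        return pool, available[pool]
    return "on-demand", on_demand_price

print(choose_capacity(set()))                 # ('pool-b', 0.19)
print(choose_capacity({"pool-a", "pool-b"}))  # ('pool-c', 0.24)
print(choose_capacity(set(spot_pools)))       # ('on-demand', 0.68)
```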
Cast AI is currently adding new observability and cost-reporting features, but Gil sees an opportunity to cut other areas of customers’ cloud bills even further.
“We’re just scratching the surface,” he said.