The average cloud customer could cut Kubernetes compute spending by 65%

Cast AI uses Kubernetes automation technology to optimize spending and performance for cloud-native apps by matching the right amount of computing power and memory to those apps.

Cast AI co-founder and Chief Product Officer Laurent Gil

Photo: Cast AI

Cloud customers pay, on average, about three times as much as they should for compute on AWS, Microsoft Azure and Google Cloud, according to Cast AI. Helping them manage those costs is turning into a business in itself.

The startup specializes in Kubernetes automation, cost optimization and reporting for cloud-native applications. Its platform uses artificial intelligence to identify which compute resources specific Kubernetes workloads need and automatically selects the best combinations, configuring CPUs and memory to prevent over-provisioning. It continuously adds or removes resources as needed, so customers avoid overspending without compromising workload availability or performance, according to the company.

“It's impossible to do this exercise as a human,” co-founder and Chief Product Officer Laurent Gil said. “We decomplexify capabilities. We make Kubernetes or containers serverless by saying we're going to take care of the servers, and we will make the servers cost-efficient.”

Cast AI was born out of its co-founders’ frustrations with their cloud bills while they operated a prior startup: Zenedge, a cloud-based, AI-driven cybersecurity startup acquired by Oracle in 2018.

“In the beginning of that company, I would spend about $1,000 to $2,000 a month on AWS,” Gil said. “Three years later … that became $2 million dollars — by far the highest cost of the company, and we were very, very frustrated. We had a nice ride with customers, but every time we would add a client, our AWS bill would go through the roof.”

AWS’ answer was for Zenedge to prepay for three years to cut its cloud bill by 40%, but Zenedge didn’t want to be locked in, according to Gil. With Cast AI, the co-founders built the spending-management product they wished they’d had at the time.

The overspending tax

Companies using Cast AI’s services can reduce their cloud compute spending by 65% on average, according to Gil. Those services work with Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS) and Kubernetes Operations (kOps) on AWS.

“The engine is instantly going to understand what applications you have … and how much compute and memory they currently consume, and how much they cost to run based on the machine that these applications are installed on,” Gil told Protocol. “Then we are going to give you another number, which is, ‘Hey, considering what this application does and uses, this should really be the cost.’”

While there are no big price differences among the Big Three cloud providers, Gil said, costs within each cloud vary by processor.

“Most … are cheaper with AMD than they are with Intel,” Gil said. “That makes our engine use more AMD sometimes for compute-intensive [workloads]. But the machine has been trained to know this, so we will always select the lowest-cost option.”
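The selection Gil describes can be illustrated with a minimal sketch: filter a catalog down to the machine types that satisfy a workload's CPU and memory needs, then take the cheapest. This is not Cast AI's engine. The instance names, vendors and hourly prices below are hypothetical values invented for the illustration, as is the `cheapest_fit` helper.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class InstanceType:
    name: str          # hypothetical label, not a real cloud SKU
    vendor: str        # CPU vendor
    vcpus: int
    memory_gib: int
    hourly_usd: float  # made-up price, for illustration only


def cheapest_fit(
    catalog: Iterable[InstanceType], need_vcpus: int, need_memory_gib: int
) -> Optional[InstanceType]:
    """Return the lowest-priced instance that covers the request, or None."""
    candidates = [
        inst
        for inst in catalog
        if inst.vcpus >= need_vcpus and inst.memory_gib >= need_memory_gib
    ]
    return min(candidates, key=lambda inst: inst.hourly_usd, default=None)


# Hypothetical catalog in which the AMD variants are slightly cheaper.
CATALOG = [
    InstanceType("intel-4x16", "Intel", 4, 16, 0.20),
    InstanceType("amd-4x16", "AMD", 4, 16, 0.18),
    InstanceType("intel-8x32", "Intel", 8, 32, 0.40),
    InstanceType("amd-8x32", "AMD", 8, 32, 0.36),
]

choice = cheapest_fit(CATALOG, need_vcpus=3, need_memory_gib=12)
print(choice.name)  # amd-4x16
```

A production system would also have to weigh availability, region and workload performance on each processor, none of which this sketch models.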

Cast AI savings. Image: Cast AI

Cast AI is currently optimizing about 1,000 applications for hundreds of customers, according to Gil.

“One thing that was very surprising to us … is that the average cost-savings we provide to anybody using us … is 65%,” he said. “Sixty-five percent means you are spending three times more than you should on Amazon. So if you think of this the other way, you say, 'Well, out of $100 of your cloud bill, $66 of this is for Mr. Bezos, because it does nothing for you … and $33 is what you really use.'”

Cast AI says that, on average, its customers weren’t using 37% of the CPUs that they were paying for. They could save an additional 7% by changing one type of virtual machine (VM) for another and another 22% by switching VMs to discounted spot instances.
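The arithmetic behind these figures can be checked with a short sketch. It assumes, which the article does not state outright, that the three components are percentage points of the same compute bill and can simply be added. The function name `overspend_multiple` is introduced here for illustration.

```python
def overspend_multiple(savings_fraction: float) -> float:
    """How many times the necessary spend a customer pays if
    `savings_fraction` of the current bill could be eliminated."""
    if not 0 <= savings_fraction < 1:
        raise ValueError("savings_fraction must be in [0, 1)")
    return 1 / (1 - savings_fraction)


# A 65% saving means the current bill is roughly 2.86 times the needed
# spend, which Gil rounds to "three times."
print(f"{overspend_multiple(0.65):.2f}")  # 2.86

# Components quoted by Cast AI, treated as additive percentage points
# of one bill (an assumption made for this sketch).
components = {
    "unused CPUs": 37,
    "switching VM type": 7,
    "moving to spot instances": 22,
}
total_points = sum(components.values())
print(total_points)  # 66
```

Under that additive assumption the components total 66 points, in line with the roughly 65% average saving Gil cites and with his $66-of-$100 illustration.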

“We're not changing anything [with] our customer environment,” Gil said. “It's like … how you defragment disk drives. We defragment your application by moving the boxes around so that you can fill the machine more [by] using all the empty space.”
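Gil's defragmentation analogy resembles the classic bin-packing problem. The sketch below uses the first-fit-decreasing heuristic on CPU requests only. It is a stand-in chosen for illustration, not a description of Cast AI's method, and it ignores memory, affinity rules and availability constraints that a real scheduler must respect. The pod sizes and node capacity are invented.

```python
from typing import List


def first_fit_decreasing(pod_cpus: List[int], node_capacity: int) -> List[List[int]]:
    """Pack pod CPU requests onto as few equal-sized nodes as the
    first-fit-decreasing heuristic finds."""
    nodes: List[List[int]] = []  # each node holds the CPU requests placed on it
    for cpu in sorted(pod_cpus, reverse=True):  # place the largest pods first
        if cpu > node_capacity:
            raise ValueError(f"pod needing {cpu} CPUs cannot fit on any node")
        for node in nodes:
            if sum(node) + cpu <= node_capacity:
                node.append(cpu)  # reuse empty space on an existing node
                break
        else:
            nodes.append([cpu])  # no existing node has room, so open a new one
    return nodes


# Hypothetical workload: seven pods that might otherwise sit on seven
# half-empty 4-CPU machines.
pods = [2, 1, 3, 1, 2, 1, 2]
packed = first_fit_decreasing(pods, node_capacity=4)
print(len(packed))  # 3
print(packed)       # [[3, 1], [2, 2], [2, 1, 1]]
```

In this toy case the same pods fit on three fully used nodes, which is the "fill the machine more" effect Gil describes.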

This is the way

It’s a task that’s impossible for developers to tackle on their own, and the cloud providers don’t make it easy, according to Gil.

One of Cast AI’s customers, an adtech company with a large consumer app in India, cut its compute costs by 84% after turning on Cast AI’s engine, according to Gil. Another customer, a publicly traded SaaS business, reduced its cloud compute costs by 72%.

Branch, a late-stage startup specializing in deep linking, mobile analytics and attribution, is a Cast AI customer that sees about 25 billion events per day and is running all of its compute inside Kubernetes clusters.

“Our cloud hosting needs to be very efficient to be able to process all that data in real time to be able to make real-time decisions … as well as to be able to aggregate and show all of the statistics inside of the analytics,” said Mark Weiler, Branch’s head of Engineering.

Branch, which uses AWS as its preferred cloud provider, started a proof of concept with Cast AI in May 2021 and deployed it across all of its clusters within two months.

“They have saved us on the order of a couple million dollars per year on our AWS cloud bill, which is one of the highest ROI cost-savings projects that we've done in the past five or six years,” Weiler said. “The promise was they would allow us to dynamically determine what sorts of optimal spot instances to use based on our workloads without incurring any negative effects on our uptime SLAs [service-level agreements] when Amazon revokes those instances. They came through.”

“Manually configuring all that, keeping that up to date, having all the fallback scenarios set up and up to date, is extremely complicated to do on your own. It's begging for an automated solution that can monitor the actual spot market and your instances and determine what the optimal reallocation would be,” Weiler said.

Cast AI is currently adding new features for observability and cost reporting, and Gil sees an opportunity to reduce other areas of customers’ cloud bills as well.

“We’re just scratching the surface,” he said.

