Enterprise

Chiplets helped save AMD. They might also help save Moore’s law and head off an energy crisis.

To make chips faster, designers used to make them bigger — which is getting harder. To make better chips, the industry is turning to “chiplets.”


Chiplet-making is likely to become a dominant form of chip design in the coming years.

Illustration: Christopher T. Fong/Protocol

In 2015, CEO Lisa Su had only been the top boss at boom-and-bust chip company AMD for a few months. The business was trying to turn around its fortunes after its painful decision in 2009 to exit the manufacturing business and had embarked on an ambitious plan to re-enter the server chip market, which had been dominated by Intel for years.

But executives at AMD concluded that the company didn't have the resources to replicate Intel's wide range of server chip designs and compete head-to-head across all those categories. It would be too expensive and difficult for the much smaller rival. And if it copied Intel, nothing about the new line of server chips would stand out either.

“We had one bullet to shoot for chip design,” AMD SVP Samuel Naffziger said about the company’s plans at the time.

So engineers at AMD looked to the past. Instead of trying to pack a larger number of features onto a single big piece of silicon, known as a “die,” they opted to break up their flagship chip into four separate parts and stitch them together.

This approach is called “chiplets,” and it’s likely to become a dominant form of chip design in the coming years.

“These small die were a huge enabler for us,” Naffziger said. “I view this as one of the greatest engineering achievements in the industry and in recent memory because it solves so many problems at once.”

AMD turned to chiplets out of necessity, but by breaking up a chip into smaller pieces, it reduced its manufacturing costs by 40%. That had two consequences: First, it let AMD make a full suite of server chips where it could add and remove chiplets as necessary, to create several performance options and target different server chip price buckets. Second, by moving to chiplets, AMD could reuse two of the server chiplets to design something less costly that worked for desktops too, the company's most profitable segment at the time.

The plan helped save AMD — revenue grew to $16.4 billion last year from $4 billion in 2015 — and it might help save Moore’s law.

What AMD accomplished years ago is now on its way to becoming the industry norm. Intel's plans include products with chiplets, and others in the industry are coalescing around a standard that will one day allow chipmakers to mix and match silicon from different vendors inside a single package.

The new chiplet-based designs are a nice-to-have at the moment, but they will quickly become a necessity, experts told Protocol.

The world produces and crunches data at a rapidly rising rate, and without the tech that underpins chiplets, it will become too expensive and difficult for traditional processor designs to keep delivering the jump in computing horsepower that software developers expect every year. And in the longer run, those older designs will consume too much power to be economically viable.

“We're going to be locked into a situation where you're buying the same boxes that have the same performance, same power consumption,” TechInsights' chip economist Dan Hutcheson said. “And that means to scale them you either slow down the growth of the internet and the data or you have to build more data centers and more power plants to feed them.”

Moored in the past

One of the fascinating aspects of the chiplet concept is that it dates back to the seminal paper Gordon Moore wrote in 1965 that loosely set the ground rules of the industry for the next half-century. Those observations, known as Moore’s law, predicted that chips would get faster and cheaper every two years, as the number of transistors chip designers could fit on a chip doubled at the same pace.

But in that same paper, Moore described a world in which the economics of breaking up a single die into smaller pieces would someday make sense. Mixing and matching components would give system designers more flexibility and potentially boost performance, among other benefits.

“It may prove to be more economical to build large systems out of smaller functions, which are separately packaged and interconnected,” Moore wrote. “The availability of large functions, combined with functional design and construction, should allow the manufacturer of large systems to design and construct a considerable variety of equipment both rapidly and economically.”

It makes sense that Moore would suggest that: IBM was already building systems that included the chiplet concept as early as 1964 — at the time, it was the only way to achieve the necessary amount of computing horsepower. Companies such as IBM continued down that course for decades, and have applied the loose idea of chiplets to the most complex and expensive systems, such as supercomputers and mainframes.

But the chiplets of the past were complex and expensive, which led semiconductor companies to squeeze more discrete features such as graphics or memory onto a single piece of silicon: the system-on-chip (SoC) found in smartphones, some server processors and Apple’s latest designs for its laptops and desktops.

“In other words, when we mean chiplet, we mean taking up an SoC and splitting it up into its component functions,” IBM’s hybrid cloud technologist Rama Divakaruni said. “Now that we are going back to using chiplets, we are a lot smarter — with all the innovation we had with the history of 50 years of silicon, we will bring that into the packaging space. So that’s the excitement.”

Big dies, big problems

In the past, when chip designers added more components onto a single monolithic piece of silicon called a die — the term comes from “dicing” a silicon wafer into chip-sized pieces — that meant that chips had to get larger. It’s intuitive: Larger surfaces can theoretically fit more features, especially since the features themselves shrink every time manufacturers introduce better tech.

Bigger dies therefore translated to more computing horsepower. For server chips, it’s especially noticeable, since they tend to run five times the size of a chip found in a typical PC, according to research from Jefferies.

“Now things are getting so fast, the performance is so high, that you're being forced to move [more chips] into the package,” Hutcheson said. Several technical and economic aspects of chipmaking have conspired to push the industry toward chiplets.

But big die sizes create big problems. One fundamental issue is that it’s currently impossible to print a chip larger than the blueprint used in the photolithography stage of chip manufacturing, called a photomask. Because of technical limits, the beam of light shining through the photomask to reproduce the blueprint onto the silicon wafer cannot print chips larger than about 850 square millimeters.

Large dies are also much more prone to defects, which in turn reduces the number of good chips that can be cut from each wafer and makes each working chip cost more. At the same time, there are concerns that transistors are getting more expensive as they shrink — coupled with the fact that certain key features on modern chips don't shrink well — which means it doesn't make sense to use the most advanced process nodes for wireless communications chips, for example.
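
The economics here can be made concrete with a toy yield model. The sketch below uses a simple Poisson defect-yield approximation; the defect density and die areas are illustrative assumptions, not AMD's actual figures, and packaging and test costs are ignored.

```python
import math

# Illustrative assumptions only -- not AMD's actual numbers.
DEFECT_DENSITY = 0.001                 # defects per mm^2 of silicon
MONOLITHIC_AREA = 700.0                # one big server die, mm^2
CHIPLETS_PER_PACKAGE = 4
CHIPLET_AREA = MONOLITHIC_AREA / CHIPLETS_PER_PACKAGE

def poisson_yield(area_mm2: float, d0: float = DEFECT_DENSITY) -> float:
    """Fraction of dies expected to come out defect-free (Poisson yield model)."""
    return math.exp(-d0 * area_mm2)

def silicon_per_good_part(area_mm2: float) -> float:
    """Wafer area consumed per good die, once defective dies are discarded."""
    return area_mm2 / poisson_yield(area_mm2)

monolithic = silicon_per_good_part(MONOLITHIC_AREA)
chiplets = CHIPLETS_PER_PACKAGE * silicon_per_good_part(CHIPLET_AREA)

print(f"Monolithic die yield: {poisson_yield(MONOLITHIC_AREA):.0%}")          # ~50%
print(f"Single chiplet yield: {poisson_yield(CHIPLET_AREA):.0%}")             # ~84%
print(f"Silicon saved by going to chiplets: {1 - chiplets / monolithic:.0%}")  # ~41%
```

Because a defect now ruins only a small chiplet rather than an entire large die, far less good silicon gets thrown away per flaw, and under these assumed numbers the saving lands in the same neighborhood as the 40% cost reduction AMD describes.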

“When AMD tried to take the [2017] Naples design, and shrink it from 14 nanometer to seven, just pure lithographic scaling, they found it wasn't gonna work,” Columbia Threadneedle analyst Dave Egan told Protocol. “At the first pass design, they were only able to basically shrink about a half of it.”

No chiplets from Nvidia

Nvidia ran up against the photomask issue, also known as the reticle limit, over five years ago, according to Nvidia Vice President Ian Buck. But the company hasn’t opted for the chiplet approach as of yet.

Part of the reason is that the graphics chips Nvidia is known for operate fundamentally differently than the CPUs from Intel and AMD. Nvidia’s chips use thousands of computing cores to perform lots of relatively simple calculations at once, which makes them well-suited for graphics or for AI-accelerated computing in data centers.

“The GPU is a very different beast,” Buck said. “In the graphics space, it’s not individual cores when presented to a developer; they’re given a scene description and they have to distribute the work and render it.”

To confront the fundamental limit of the size of the photomask without adopting the chiplet approach, Nvidia has focused its efforts on building what it calls super chips. The company has developed its own interconnect technology called NVLink to attach multiple graphics chips and servers together. To Buck, the ultimate expression of that strategy up until this point is the company's forthcoming Grace Hopper product, which fuses an Arm-based CPU to one of Nvidia's server GPUs.

Nvidia does make smaller chips for enterprise applications such as AI inference and production. But, for the flagship chips designed for AI training, the company has found that its customers require the maximum amount of compute possible and value the largest processors the company makes.

“This growth greatly simplifies the programming model, but also, for AI, allows you to treat the CPU’s memory as an extension of the GPU’s memory,” Buck said. “They're basically two super chips put together.”

Mix and match

AMD may have been the first major chipmaker to mass produce and sell processors built around chiplets, but Nvidia and a handful of others aside, the rest of the industry is moving in the same direction. Several of the largest chipmakers, including AMD, Intel and Samsung, along with cloud service providers, support a new standard for connecting chiplets made by different companies. Called Universal Chiplet Interconnect Express, or UCIe, the approach could reshape new semiconductor designs.

“Because of the new UCI Express, the whole industry is centering around the term chiplet,” Hutcheson said. “The real significance between today versus what we did before is that before a company had to do it all themselves — it’s not like you could buy this chip and this chip, and make my own electronic device.”

In an ideal world, the UCIe standard would let chipmakers mix and match chiplets that use different manufacturing process technologies and are made by different companies into products built inside a single package. That means taking memory made by Micron, a CPU core produced by AMD and a wireless modem made by Qualcomm and fitting them together — which could greatly improve performance, while saving an enormous amount of power.

“To allow for a heterogeneous system to be constructed on a package, you want on-package memory because of higher memory bandwidth,” Intel senior fellow Debendra Das Sharma said. “There are certain acceleration functions that can benefit by being on the same package, and also having a much lower latency and low power way of accessing all the components in the system, including memory.”

Mixing and matching chiplets would also enable AMD and Intel to create custom products for large customers with specific needs. Accelerated computing, which is commonly deployed to tackle AI compute tasks, is low-hanging fruit to Das Sharma. Should one customer need a chip for a specific type of AI, Intel could swap out a general-purpose accelerator for something more specialized.

Universally interconnecting chiplets isn't a reality yet. According to several industry watchers, it's unlikely to materialize for several years as the standard gets hammered out. The second version of the standard, which could arrive around 2025, is more likely to herald the type of hot swapping that Das Sharma discussed.

But whether the industry comes together in 2025 or 2026, chiplets are the future of processors — at least for the moment. Data centers consume a massive amount of the world’s energy, and that consumption will only increase as Mark Zuckerberg attempts to manifest his version of the metaverse, and, in the nearer term, more aspects of our lives turn digital.

“When you move these electrons down this pipe — simply going off chip, the power needed to do it is about 10,000X difference,” Hutcheson said. “To move a signal from one chip to another chip in another package, it’s like a 100,000X difference.”
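
A rough back-of-the-envelope calculation shows what those multipliers mean for a data center-scale workload. The sketch below is a hypothetical illustration: the on-die figure of 0.1 picojoules per bit is an assumed ballpark, not a number from the article, while the 10,000X and 100,000X ratios come from Hutcheson's quote above.

```python
# Back-of-the-envelope energy comparison, using Hutcheson's quoted ratios.
ON_DIE_PJ_PER_BIT = 0.1                # assumption: ~0.1 pJ to move one bit on-die
OFF_CHIP_MULTIPLIER = 10_000           # "simply going off chip"
OTHER_PACKAGE_MULTIPLIER = 100_000     # "to another chip in another package"

BITS_PER_TERABYTE = 8e12

for label, multiplier in [
    ("on-die", 1),
    ("off the chip", OFF_CHIP_MULTIPLIER),
    ("to a chip in another package", OTHER_PACKAGE_MULTIPLIER),
]:
    joules = ON_DIE_PJ_PER_BIT * multiplier * 1e-12 * BITS_PER_TERABYTE
    print(f"Moving 1 TB {label:<30s}: {joules:>10,.1f} J")
```

Under those assumptions, shuttling a terabyte between separately packaged chips costs on the order of a hundred thousand times the energy of moving it on-die, which is the gap that keeping chiplets together inside one package is meant to narrow.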
