Researchers push to make bulky AI work in your phone and personal assistant

Chipmakers like Nvidia and researchers from Notre Dame want to make huge transformers like large natural-language-process models speedier, more nimble and more energy efficient.

Phones connected on a gray background.

"We want it smaller and smaller, and it has to be more energy efficient.”

Illustration: Christopher T. Fong/Protocol

Transformer networks, colloquially known to deep-learning practitioners and computer engineers as “transformers,” are all the rage in AI. Over the last few years, these models, known for their massive size, large amount of data inputs, big scale of parameters — and, by extension, high carbon footprint and cost — have grown in favor over other types of neural network architectures.

Some transformers, particularly some open-source, large natural-language-processing transformer models, even have names that are recognizable to people outside AI, such as GPT-3 and BERT. They’re used across audio-, video- and computer-vision-related tasks, drug discovery and more.

Now chipmakers and researchers want to make them speedier and more nimble.

“It’s interesting how fast technology for neural networks changes. Four years ago, everybody was using these recurrent neural networks for these language models and then the attention paper was introduced, and all of a sudden, everybody is using transformers,” said Bill Dally, chief scientist at Nvidia during an AI conference last week held by Stanford’s HAI. Dally was referring to an influential 2017 Google research paper presenting an innovative architecture forming the backbone of transformer networks that is reliant on “attention mechanisms” or “self-attention,” a new way to process the data inputs and outputs of models.

“The world pivoted in a matter of a few months and everything changed,” Dally said. To meet the growing interest in transformer use, in March the AI chip giant introduced its Hopper h100 transformer engine to streamline transformer model workloads.

Designing transformer tech for the edge

But some researchers are pushing for even more. There’s talk not only of making compute- and energy-hungry transformers more efficient, but of eventually upgrading their design so they can process fresh data in edge devices without having to make the round trip to process the data in the cloud.

A group of researchers from Notre Dame and China’s Zhejiang University presented a way to reduce memory-processing bottlenecks and computational and energy consumption requirements in an April paper. The “iMTransformer” approach is a transformer accelerator, which works to decrease memory transfer needs by computing in-memory, and reduces the number of operations required by caching reusable model parameters.

Right now the trend is to bulk up transformers so the models get large enough to take on increasingly complex tasks, said Ana Franchesca Laguna, a computer science and engineering PhD at Notre Dame. When it comes to large natural-language-processing models, she said, “It’s the difference between a sentence or a paragraph and a book.” But, she added, “The bigger the transformers are, your energy footprint also increases.”

Using an accelerator like the iMTransformer could help to pare down that footprint, and, in the future, create transformer models that could ingest, process and learn from new data in edge devices. “Having the model closer to you would be really helpful. You could have it in your phone, for example, so it would be more accessible for edge devices,” she said.

That means IoT devices such as Amazon’s Alexa, Google Home or factory equipment maintenance sensors could process voice or other data in the device rather than having to send it to the cloud, which takes more time and more compute power, and could expose the data to possible privacy breaches, Laguna said.

IBM also introduced an AI accelerator called RAPID last year. “Scaling the performance of AI accelerators across generations is pivotal to their success in commercial deployments,” wrote the company’s researchers in a paper. “The intrinsic error-resilient nature of AI workloads present a unique opportunity for performance/energy improvement through precision scaling.”

Farah Papaioannou, co-founder and president at Edgeworx, said she thinks of the edge as anything outside the cloud. “What we’re seeing of our customers, they’re deploying these AI models you want to train and update on a regular basis, so having the ability to manage that capability and update that on a much faster basis [is definitely important],” she said during a 2020 Protocol event about computing at the edge.

Wanted: custom chips

Laguna uses a work-from-home analogy when thinking of the benefits of processing data for AI models at the edge.

“[Instead of] commuting from your home to the office, you actually work from home. It’s all in the same place, so it saves a lot of energy,” she said. She said she hopes research like hers will enable people to build and use transformers in a more cost- and energy-efficient way. “We want it on our edge devices. We want it smaller and smaller, and it has to be more energy efficient.”

Laguna and the other researchers she worked with tested their accelerator approach using smaller chips, and then extrapolated their findings to estimate how the process would work at a larger scale. However, Laguna said that turning the small-scale project into a reality at a larger scale will require customized, larger chips.

Ultimately, she hopes it spurs investment. A goal of the project, she said, “is to convince people that this is worthy of investing in so we can create chips so we can create these types of networks.”

That investor interest might just be there. AI is spurring increases in investments in chips for specific use cases. According to data from PitchBook, global sales of AI chips rose 60% last year to $35.9 billion compared to 2020. Around half of that total came from specialized AI chips in mobile phones.

Systems designed to operate at the edge with less memory rather than in the cloud could facilitate AI-based applications that can respond to new information in real time, said Jarno Kartela, global head of AI Advisory at consultancy Thoughtworks.

“What if you can build systems that by themselves learn in real time and learn by interaction?” he said. “Those systems, you don’t need to run them on cloud environments only with massive infrastructure — you can run them virtually anywhere.”


To clear the FTC, Microsoft’s Activision deal might require compromise

The FTC is in the process of reviewing the biggest-ever gaming acquisition. Here’s how it could change the Xbox business.

Will the Microsoft acquisition of Activision get through the FTC?

Image: Microsoft; Protocol

Microsoft’s planned acquisition of Activision Blizzard is the largest-ever deal in the video game market by a mile. With a sale price of $68.7 billion, the deal is nearly 450% larger than Grand Theft Auto publisher Take-Two Interactive’s acquisition of Zynga in January, the next-largest game acquisition ever recorded.

The eye-popping price underlines the scale and scope of Microsoft’s ambitions for its gaming business: If the deal is approved, Microsoft would own — alongside its current major properties, such as Halo and Minecraft — Warcraft, Overwatch and Call of Duty, to name just a few. In turn, the deal has invited a rare level of scrutiny and attention from lawmakers and policy professionals now turning their sights on an industry that’s flown under the regulatory radar for the last several decades of its existence.

Keep Reading Show less
Nick Statt

Nick Statt is Protocol's video game reporter. Prior to joining Protocol, he was news editor at The Verge covering the gaming industry, mobile apps and antitrust out of San Francisco, in addition to managing coverage of Silicon Valley tech giants and startups. He now resides in Rochester, New York, home of the garbage plate and, completely coincidentally, the World Video Game Hall of Fame. He can be reached at nstatt@protocol.com.

Sponsored Content

Why the digital transformation of industries is creating a more sustainable future

Qualcomm’s chief sustainability officer Angela Baker on how companies can view going “digital” as a way not only toward growth, as laid out in a recent report, but also toward establishing and meeting environmental, social and governance goals.

Three letters dominate business practice at present: ESG, or environmental, social and governance goals. The number of mentions of the environment in financial earnings has doubled in the last five years, according to GlobalData: 600,000 companies mentioned the term in their annual or quarterly results last year.

But meeting those ESG goals can be a challenge — one that businesses can’t and shouldn’t take lightly. Ahead of an exclusive fireside chat at Davos, Angela Baker, chief sustainability officer at Qualcomm, sat down with Protocol to speak about how best to achieve those targets and how Qualcomm thinks about its own sustainability strategy, net zero commitment, other ESG targets and more.

Keep Reading Show less
Chris Stokel-Walker

Chris Stokel-Walker is a freelance technology and culture journalist and author of "YouTubers: How YouTube Shook Up TV and Created a New Generation of Stars." His work has been published in The New York Times, The Guardian and Wired.


Okta CEO: 'We should have done a better job' with the Lapsus$ breach

In an interview with Protocol, Okta CEO Todd McKinnon said the cybersecurity firm could’ve done a lot of things better after the Lapsus$ breach of a third-party support provider earlier this year.

From talking to hundreds of customers, “I've had a good sense of the sentiment and the frustrations,” McKinnon said.

Photo: David Paul Morris via Getty Images

Okta co-founder and CEO Todd McKinnon agrees with you: Disclosing a breach that impacts customer data should not take months.

“If that happens in January, customers can't be finding out about it in March,” McKinnon said in an interview with Protocol.

Keep Reading Show less
Kyle Alspach

Kyle Alspach ( @KyleAlspach) is a senior reporter at Protocol, focused on cybersecurity. He has covered the tech industry since 2010 for outlets including VentureBeat, CRN and the Boston Globe. He lives in Portland, Oregon, and can be reached at kalspach@procotol.com.


Ethereum's co-founder thinks the blockchain can fix social media

But before the blockchain can fix social media, someone has to fix the blockchain. Frank McCourt, who’s put serious money behind his vision of a decentralized social media future, thinks Gavin Wood may be the key.

Gavin Wood, co-founder of Ethereum and creator of Polkadot, is helping Frank McCourt's decentralized social media initiative.

Photo: Jason Crowley

Frank McCourt, the billionaire mogul who is donating $100 million to help build decentralized alternatives to the social media giants, has picked a partner to make the blockchain work at Facebook scale: Ethereum co-founder Gavin Wood.

McCourt’s Project Liberty will work with the Web3 Foundation’s Polkadot project, it said Tuesday. Wood launched Polkadot in 2020 after leaving Ethereum. Project Liberty has a technical proposal to allow users to retain their data on a blockchain as they move among future social media services. Wood’s involvement is to give the idea a shot at actually working at the size and speed of a popular social network.

Keep Reading Show less
Ben Brody

Ben Brody (@ BenBrodyDC) is a senior reporter at Protocol focusing on how Congress, courts and agencies affect the online world we live in. He formerly covered tech policy and lobbying (including antitrust, Section 230 and privacy) at Bloomberg News, where he previously reported on the influence industry, government ethics and the 2016 presidential election. Before that, Ben covered business news at CNNMoney and AdAge, and all manner of stories in and around New York. He still loves appearing on the New York news radio he grew up with.


Gensler: Bitcoin may be a commodity

The SEC has been vague about crypto. But Gensler said bitcoin is a commodity, “maybe.” It’s the clearest glimpse of his views on digital assets yet.

“Bitcoin — maybe that’s a commodity token. That has a big market value, but that goes over there,” Gensler said, referring to another regulator, the CFTC.

Photoillustration: Al Drago/Bloomberg via Getty Images; Protocol

SEC Chair Gary Gensler has long argued that many cryptocurrencies are subject to regulation as securities.

But he recently clarified that this view wouldn’t apply to the best-known cryptocurrency, bitcoin.

Keep Reading Show less
Benjamin Pimentel

Benjamin Pimentel ( @benpimentel) covers crypto and fintech from San Francisco. He has reported on many of the biggest tech stories over the past 20 years for the San Francisco Chronicle, Dow Jones MarketWatch and Business Insider, from the dot-com crash, the rise of cloud computing, social networking and AI to the impact of the Great Recession and the COVID crisis on Silicon Valley and beyond. He can be reached at bpimentel@protocol.com or via Google Voice at (925) 307-9342.

Latest Stories