Microsoft just built one of the world’s most powerful supercomputers — in Azure
The device will crunch huge AI problems exclusively for OpenAI and is the latest sign of supercomputing facilities moving out of labs and onto the cloud.
Microsoft has constructed a 285,000-processor supercomputer designed for machine learning applications as part of its Azure cloud infrastructure service. It should be one of the most powerful computers on the planet — and is the latest sign that supercomputing may slowly migrate to the cloud.
The supercomputer was developed for OpenAI, the organization working to build "safe artificial general intelligence," assuming such a thing is possible. It will be exclusively available to OpenAI as a cloud service for running the "massive distributed AI models" that it says will be needed to achieve artificial general intelligence, Microsoft announced Tuesday.
OpenAI has made some of the more notable AI advances in recent years, including building algorithms capable of powerful language processing and beating the world's best players at the video game Dota 2. But while mastering Dota 2 is difficult, it pales compared to building algorithms that understand the complexities of the real world, and OpenAI's plan is to use ever-greater data sets and computational power to make steps toward building an AGI.
OpenAI received a $1 billion investment from Microsoft last year, the bulk of which will be spent on computing power to achieve its goal. The terms of that deal require Microsoft to eventually become OpenAI's only source of computing.
The new machine, which Chief Technology Officer Kevin Scott called Microsoft's "first large-scale AI supercomputer," is the highlight of several announcements the company plans to make Tuesday during Microsoft Build, its annual developer conference. Held the last few years in downtown Seattle, Build, like many events, is being conducted entirely online this year and features a keynote address from CEO Satya Nadella on Tuesday morning.
The new system lives entirely within Azure, which is still subject to capacity constraints brought on by a surge of cloud computing activity caused by global stay-at-home orders during the COVID-19 pandemic. The new system for OpenAI was brought online in six months according to Scott — fast, given its claimed power, and curious, given that Microsoft could have really used that computing power to service existing customers.
Supercomputer rankings are taken very seriously among researchers and system builders, and Microsoft acknowledged that it has yet to submit the new Azure system to the Top500 organization, the official source of record on such matters. Still, with almost 300,000 CPUs, 10,000 GPUs and 400 Gbps of network connectivity to those GPUs, Microsoft estimates that its new system should rank fifth on the current list.
Artificial intelligence tools have been a key part of every cloud vendor's strategy over the last few years, in large part because artificial intelligence services require a lot of expensive computing power. This is a big step forward, but it's not clear whether Microsoft will allow other customers to access the supercomputer over time or whether it will be indefinitely reserved for OpenAI.
Given the size and unique requirements of supercomputers, most remain on-premises, managed by their owners. Cloud-based supercomputing is gaining steam, however; last year AWS devoted a significant portion of the opening keynote at its re:Invent conference to explain how it is building supercomputing capacity in its cloud, and Google has also talked up how high-performance computing customers can access supercomputing capacity in its cloud.
The new machine will be used to conduct "self-supervised learning," Scott said, taking very large datasets and training models for speech and image recognition that have the potential to produce outsized benefits compared to smaller datasets. "The bigger these models get, the more they can do," he said.
Other noteworthy announcements on the Build agenda this week include:
- Azure Synapse Link, a new database service that lets customers combine transactional processing and analytical processing in Azure.
- Microsoft Cloud for Healthcare, the company's "first industry-specific cloud offering" designed to help doctors and health care facilities provide better service and use advanced data analysis techniques.
- A more customizable Microsoft Teams, which allows customers to tweak their company's Teams setup with a default set of applications, and also build and deploy Microsoft Teams applications from Visual Studio and Visual Studio Code, Microsoft's popular developer environments.