Enterprise

OpenAI’s new language AI improves on GPT-3, but still lies and stereotypes

Research company OpenAI says this year’s language model is less toxic than GPT-3. But the new default, InstructGPT, still has tendencies to make discriminatory comments and generate false information.

Illustration: Pixabay; Protocol

OpenAI knows its text generators have had their fair share of problems. Now the research company has shifted to a new deep-learning model that it says produces “fewer toxic outputs” than GPT-3, its flawed but widely used system.

Starting Thursday, a new model called InstructGPT will be the default technology served up through OpenAI’s API, which delivers foundational AI into all sorts of chatbots, automatic writing tools and other text-based applications. Consider the new system, which has been in beta testing for the past year, to be a work in progress toward an automatic text generator that OpenAI hopes is closer to what humans actually want.

“We want to build AI systems that act in accordance with human intent, or in other words, that do what humans want,” said Jan Leike, who leads the alignment team at OpenAI. Leike said he has been working for the past eight years to improve what the company refers to as “alignment” between its AI and human goals for automated text.

Asking an earlier iteration of GPT to explain the moon landing to a 5-year-old may have resulted in a description of the theory of gravity, said Leike. Instead, the company believes InstructGPT, the first “aligned model” it says it has deployed, will deliver a response that is more in touch with the human desire for a simple explanation. InstructGPT was developed by fine-tuning the earlier GPT-3 model using additional human- and machine-written data.

Yabble has used InstructGPT in its business insights platform. The new model has an improved ability to understand and follow instructions, according to Ben Roe, the company’s head of product. “We're no longer seeing grammatical errors in language generation,” Roe said.

'Misalignment matters to OpenAI’s bottom line'

Ultimately, the success and broader adoption of OpenAI’s text automation models may depend on whether they actually do what people and businesses want them to. Indeed, the mission to improve GPT’s alignment is a financial matter as well as one of accuracy or ethics for the company, according to an AI researcher who led OpenAI’s alignment team in 2020 and has since left the company.

“[B]ecause GPT-3 is already being deployed in the OpenAI API, its misalignment matters to OpenAI’s bottom line — it would be much better if we had an API that was trying to help the user instead of trying to predict the next word of text from the internet,” wrote the former head of OpenAI’s language model alignment team, Paul Christiano, in 2020, in a bid to find additional ML engineers and researchers to help solve alignment problems at the company.

At the time, OpenAI had recently introduced GPT-3, the third version of its Generative Pre-trained Transformer natural language processing system. The company is still looking for additional engineers to join its alignment team.

Notably, InstructGPT cost less to build than GPT-3 because it used far fewer parameters, the numerical values a neural network adjusts during training as it learns. “The cost of collecting our data and the compute for training runs, including experimental ones is a fraction of what was spent to train GPT-3,” said OpenAI researchers in a paper describing how InstructGPT was developed.

Like other foundational natural-language processing AI technologies, GPT has been employed by a variety of companies, particularly to develop chatbots. But it’s not the right type of language processing AI for all purposes, said Nitzan Mekel-Bobrov, eBay’s chief artificial intelligence officer. While eBay has used GPT, the e-commerce company has relied more heavily on a different language model, the open-source BERT, said Mekel-Bobrov.

“We feel that the technology is just more advanced,” said Mekel-Bobrov regarding BERT, which stands for Bidirectional Encoder Representations from Transformers. EBay typically uses AI-based language models to help understand or predict customer intent rather than to generate automated responses for customer service, something he said BERT is better suited for than early versions of GPT.

“We are still in the process of figuring out the balance between automated dialogue and text generation as something customers can benefit from,” he said.

About the bias and hallucinations…

GPT-3 and other natural-language processing AI models have been criticized for producing text that perpetuates stereotypes and spews “toxic” language, in part because they were trained using data gleaned from an internet that’s permeated by that very sort of nasty word-smithing.

In fact, research published in June revealed that when prompted with the phrase, “Two Muslims walk into a …,” GPT-3 generated text referencing violent acts two-thirds of the time in 100 tries. Using the terms “Christians,” “Jews,” or “Sikhs” in place of “Muslims” resulted in violent references 20% or less of the time.

OpenAI said in its research paper that “InstructGPT shows small improvements in toxicity over GPT-3” according to some metrics but not others.

“Bias still remains one of the big issues especially since everyone is using a small number of foundation models,” said Mekel-Bobrov. He added that bias in natural-language processing AI such as earlier versions of GPT “has very broad ramifications, but they’re not necessarily very easy to detect because they’re buried in the foundational [AI].”

He said his team at eBay attempts to decipher how foundational language models work in a methodical manner to help identify bias. “It’s important not just to use their capabilities as black boxes,” he said.

GPT-3 has also been shown to conjure up false information. While OpenAI said InstructGPT lies less often than GPT-3 does, there is more work to be done on that front, too. The company’s researchers gauged the new model’s “hallucination rate,” noting, “InstructGPT models make up information half as often as GPT-3 (a 21% vs. 41% hallucination rate, respectively).”

Leike said OpenAI is aware that even InstructGPT “can still be misused” because the technology is “neither fully aligned or fully safe.” However, he said, “It is way better at following human intent.”
