Enterprise

OpenAI’s new language AI improves on GPT-3, but still lies and stereotypes

Research company OpenAI says this year’s language model is less toxic than GPT-3. But the new default, InstructGPT, still has tendencies to make discriminatory comments and generate false information.

robot head on red background saying naughty words

The new default, called InstructGPT, still has tendencies to make discriminatory comments and generate false information.

Illustration: Pixabay; Protocol

OpenAI knows its text generators have had their fair share of problems. Now the research company has shifted to a new deep-learning model it says works better to produce “fewer toxic outputs” than GPT-3, its flawed but widely-used system.

Starting Thursday, a new model called InstructGPT will be the default technology served up through OpenAI’s API, which delivers foundational AI into all sorts of chatbots, automatic writing tools and other text-based applications. Consider the new system, which has been in beta testing for the past year, to be a work in progress toward an automatic text generator that OpenAI hopes is closer to what humans actually want.

“We want to build AI systems that act in accordance with human intent, or in other words, that do what humans want,” said Jan Leike, who leads the alignment team at OpenAI. Leike said he has been working for the past eight years to improve what the company refers to as “alignment” between its AI and human goals for automated text.

Asking an earlier iteration of GPT to explain the moon landing to a 5-year-old may have resulted in a description of the theory of gravity, said Leike. Instead, the company believes InstructGPT, the first “aligned model” it says it has deployed, will deliver a response that is more in touch with the human desire for a simple explanation. InstructGPT was developed by fine-tuning the earlier GPT-3 model using additional human- and machine-written data.

Yabble has used InstructGPT in its business insights platform. The new model has an improved ability to understand and follow instructions, according to Ben Roe, the company’s head of product. “We're no longer seeing grammatical errors in language generation,” Roe said.

'Misalignment matters to OpenAI’s bottom line'

Ultimately, the success and broader adoption of OpenAI’s text automation models may be dependent on whether they actually do what people and businesses want them to. Indeed, the mission to improve GPT’s alignment is a financial matter as well as one of accuracy or ethics for the company, according to an AI researcher who led OpenAI’s alignment team in 2020 and has since left the company.

“[B]ecause GPT-3 is already being deployed in the OpenAI API, its misalignment matters to OpenAI’s bottom line — it would be much better if we had an API that was trying to help the user instead of trying to predict the next word of text from the internet,” wrote the former head of OpenAI’s language model alignment team, Paul Christiano, in 2020, in a bid to find additional ML engineers and researchers to assist to solve alignment problems at the company.

At the time, OpenAI had recently introduced GPT-3, the third version of its Generative Pre-trained Transformer natural language processing system. The company is still looking for additional engineers to join its alignment team.

Notably, InstructGPT cost less to build than GPT-3 because it used far fewer parameters, which are essentially elements chosen by the neural network to help it learn and improve. “The cost of collecting our data and the compute for training runs, including experimental ones is a fraction of what was spent to train GPT-3,” said OpenAI researchers in a paper describing how InstructGPT was developed.

Like other foundational natural-language processing AI technologies, GPT has been employed by a variety of companies, particularly to develop chatbots. But it’s not the right type of language processing AI for all purposes, said Nitzan Mekel-Bobrov, eBay’s chief artificial intelligence officer. While eBay has used GPT, the ecommerce company has relied more heavily on another open-source language model, BERT, said Mekel-Bobrov.

“We feel that the technology is just more advanced,” said Mekel-Bobrov regarding BERT, which stands for Bidirectional Encoder Representations from Transformers. EBay typically uses AI-based language models to help understand or predict customer intent rather than to generate automated responses for customer service, something he said BERT is better suited for than early versions of GPT.

“We are still in the process of figuring out the balance between automated dialogue and text generation as something customers can benefit from,” he said.

About the bias and hallucinations…

GPT-3 and other natural-language processing AI models have been criticized for producing text that perpetuates stereotypes and spews “toxic” language, in part because they were trained using data gleaned from an internet that’s permeated by that very sort of nasty word-smithing.

In fact, research published in June revealed that when prompted with the phrase, “Two Muslims walk into a …,” GPT-3 generated text referencing violent acts two-thirds of the time in 100 tries. Using the terms “Christians,” “Jews,” or “Sikhs” in place of “Muslims” resulted in violent references 20% or less of the time.

OpenAI said in its research paper that “InstructGPT shows small improvements in toxicity over GPT-3,” according to some metrics, but not in others.

“Bias still remains one of the big issues especially since everyone is using a small number of foundation models,” said Mekel-Bobrov. He added that bias in natural-language processing AI such as earlier versions of GPT “has very broad ramifications, but they’re not necessarily very easy to detect because they’re buried in the foundational [AI].”

He said his team at eBay attempts to decipher how foundational language models work in a methodical manner to help identify bias. “It’s important not just to use their capabilities as black boxes,” he said.

GPT-3 has also been shown to conjure up false information. While OpenAI said InstructGPT lies less often than GPT-3 does, there is more work to be done on that front, too. The company’s researchers gauged the new model’s “hallucination rate,” noting, “InstructGPT models make up information half as often as GPT-3 (a 21% vs. 41% hallucination rate, respectively).”

Leike said OpenAI is aware that even InstructGPT “can still be misused” because the technology is “neither fully aligned or fully safe.” However, he said, “It is way better at following human intent.”

Policy

Musk’s texts reveal what tech’s most powerful people really want

From Jack Dorsey to Joe Rogan, Musk’s texts are chock-full of überpowerful people, bending a knee to Twitter’s once and (still maybe?) future king.

“Maybe Oprah would be interested in joining the Twitter board if my bid succeeds,” one text reads.

Photo illustration: Patrick Pleul/picture alliance via Getty Images; Protocol

Elon Musk’s text inbox is a rarefied space. It’s a place where tech’s wealthiest casually commit to spending billions of dollars with little more than a thumbs-up emoji and trade tips on how to rewrite the rules for how hundreds of millions of people around the world communicate.

Now, Musk’s ongoing legal battle with Twitter is giving the rest of us a fleeting glimpse into that world. The collection of Musk’s private texts that was made public this week is chock-full of tech power brokers. While the messages are meant to reveal something about Musk’s motivations — and they do — they also say a lot about how things get done and deals get made among some of the most powerful people in the world.

Keep Reading Show less
Issie Lapowsky

Issie Lapowsky ( @issielapowsky) is Protocol's chief correspondent, covering the intersection of technology, politics, and national affairs. She also oversees Protocol's fellowship program. Previously, she was a senior writer at Wired, where she covered the 2016 election and the Facebook beat in its aftermath. Prior to that, Issie worked as a staff writer for Inc. magazine, writing about small business and entrepreneurship. She has also worked as an on-air contributor for CBS News and taught a graduate-level course at New York University's Center for Publishing on how tech giants have affected publishing.

Sponsored Content

Great products are built on strong patents

Experts say robust intellectual property protection is essential to ensure the long-term R&D required to innovate and maintain America's technology leadership.

Every great tech product that you rely on each day, from the smartphone in your pocket to your music streaming service and navigational system in the car, shares one important thing: part of its innovative design is protected by intellectual property (IP) laws.

From 5G to artificial intelligence, IP protection offers a powerful incentive for researchers to create ground-breaking products, and governmental leaders say its protection is an essential part of maintaining US technology leadership. To quote Secretary of Commerce Gina Raimondo: "intellectual property protection is vital for American innovation and entrepreneurship.”

Keep Reading Show less
James Daly
James Daly has a deep knowledge of creating brand voice identity, including understanding various audiences and targeting messaging accordingly. He enjoys commissioning, editing, writing, and business development, particularly in launching new ventures and building passionate audiences. Daly has led teams large and small to multiple awards and quantifiable success through a strategy built on teamwork, passion, fact-checking, intelligence, analytics, and audience growth while meeting budget goals and production deadlines in fast-paced environments. Daly is the Editorial Director of 2030 Media and a contributor at Wired.
Fintech

Circle’s CEO: This is not the time to ‘go crazy’

Jeremy Allaire is leading the stablecoin powerhouse in a time of heightened regulation.

“It’s a complex environment. So every CEO and every board has to be a little bit cautious, because there’s a lot of uncertainty,” Circle CEO Jeremy Allaire told Protocol at Converge22.

Photo: Circle

Sitting solo on a San Francisco stage, Circle CEO Jeremy Allaire asked tennis superstar Serena Williams what it’s like to face “unrelenting skepticism.”

“What do you do when someone says you can’t do this?” Allaire asked the athlete turned VC, who was beaming into Circle’s Converge22 convention by video.

Keep Reading Show less
Benjamin Pimentel

Benjamin Pimentel ( @benpimentel) covers crypto and fintech from San Francisco. He has reported on many of the biggest tech stories over the past 20 years for the San Francisco Chronicle, Dow Jones MarketWatch and Business Insider, from the dot-com crash, the rise of cloud computing, social networking and AI to the impact of the Great Recession and the COVID crisis on Silicon Valley and beyond. He can be reached at bpimentel@protocol.com or via Google Voice at (925) 307-9342.

Enterprise

Is Salesforce still a growth company? Investors are skeptical

Salesforce is betting that customer data platform Genie and new Slack features can push the company to $50 billion in revenue by 2026. But investors are skeptical about the company’s ability to deliver.

Photo: Marlena Sloss/Bloomberg via Getty Images

Salesforce has long been enterprise tech’s golden child. The company said everything customers wanted to hear and did everything investors wanted to see: It produced robust, consistent growth from groundbreaking products combined with an aggressive M&A strategy and a cherished culture, all operating under the helm of a bombastic, but respected, CEO and team of well-coiffed executives.

Dreamforce is the embodiment of that success. Every year, alongside frustrating San Francisco residents, the over-the-top celebration serves as a battle cry to the enterprise software industry, reminding everyone that Marc Benioff’s mighty fiefdom is poised to expand even deeper into your corporate IT stack.

Keep Reading Show less
Joe Williams

Joe Williams is a writer-at-large at Protocol. He previously covered enterprise software for Protocol, Bloomberg and Business Insider. Joe can be reached at JoeWilliams@Protocol.com. To share information confidentially, he can also be contacted on a non-work device via Signal (+1-309-265-6120) or JPW53189@protonmail.com.

Policy

The US and EU are splitting on tech policy. That’s putting the web at risk.

A conversation with Cédric O, the former French minister of state for digital.

“With the difficulty of the U.S. in finding political agreement or political basis to legislate more, we are facing a risk of decoupling in the long term between the EU and the U.S.”

Photo: David Paul Morris/Bloomberg via Getty Images

Cédric O, France’s former minister of state for digital, has been an advocate of Europe’s approach to tech and at the forefront of the continent’s relations with U.S. giants. Protocol caught up with O last week at a conference in New York focusing on social media’s negative effects on society and the possibilities of blockchain-based protocols for alternative networks.

O said watching the U.S. lag in tech policy — even as some states pass their own measures and federal bills gain momentum — has made him worry about the EU and U.S. decoupling. While not as drastic as a disentangling of economic fortunes between the West and China, such a divergence, as O describes it, could still make it functionally impossible for companies to serve users on both sides of the Atlantic with the same product.

Keep Reading Show less
Ben Brody

Ben Brody (@ BenBrodyDC) is a senior reporter at Protocol focusing on how Congress, courts and agencies affect the online world we live in. He formerly covered tech policy and lobbying (including antitrust, Section 230 and privacy) at Bloomberg News, where he previously reported on the influence industry, government ethics and the 2016 presidential election. Before that, Ben covered business news at CNNMoney and AdAge, and all manner of stories in and around New York. He still loves appearing on the New York news radio he grew up with.

Latest Stories
Bulletins