Amid debate, Microsoft and Google continue to use emotion-detection AI, with limits

Microsoft said accessibility goals overrode problems with emotion recognition, and Google offers off-the-shelf emotion recognition technology amid growing concern over the controversial AI.

“Seeing AI becomes my assistant and provides me with support that I need so that I can focus on taking care of my daughter,” says Akiko Ishii, with her daughter, Ami, in Tokyo. Photo by Microsoft.

Microsoft said last month it would no longer provide general use of an AI-based cloud software feature used to infer people’s emotions. However, despite its own admission that emotion recognition technology creates “risks,” it turns out the company will retain its emotion recognition capability in an app used by people with vision loss.

In fact, amid growing concerns over development and use of controversial emotion recognition in everyday software, both Microsoft and Google continue to incorporate the AI-based features in their products.

“The Seeing AI person channel enables you to recognize people and to get a description of them, including an estimate of their age and also their emotion,” said Saqib Shaikh, a software engineering manager and project lead for Seeing AI at Microsoft who helped build the app, in a 2017 Microsoft video tutorial about the product.

After Shaikh used Seeing AI to snap a photo of his friend, the app’s automated voice described the friend as a “36-year-old male wearing glasses, looking happy.” Shaikh added, “That’s really cool because at that moment in time you can find out what someone’s facial expression was.”

Microsoft said on June 21 that it will “retire” its facial analysis capabilities that attempt to detect people’s emotional states, gender, age and other attributes. The company pointed to privacy concerns, “the lack of consensus on a definition of ‘emotions’” and the “inability to generalize the linkage between facial expression and emotional state across use cases, regions, and demographics.”

But accessibility goals overrode those problems when it came to Seeing AI. “We worked alongside people from the blind and low vision community who provided key feedback that the emotion recognition feature is important to them, in order to close the equity gap between them and [the] experience of sighted individuals,” said Microsoft in a statement sent to Protocol. The company declined a request for an interview.

“I really do appreciate Microsoft’s nuance here,” said Margaret Mitchell, chief ethics scientist and researcher at Hugging Face, who holds a Ph.D. in computer science and helped develop Seeing AI in 2014 while working at Microsoft. She left the company in 2016. “When you talk to people who are blind you will see there is absolutely an appreciation for description of visual scenes,” Mitchell said.

Saqib Shaikh, a Microsoft software engineering manager and project lead for Seeing AI (left), with Microsoft CEO Satya Nadella. Photo: Justin Sullivan/Getty Images

Emotion recognition is a well-established field of computer vision research; however, AI-based technologies used in an attempt to assess people’s emotional states have moved beyond the research phase. They have been integrated into everyday tech products like virtual meeting platforms, online classroom platforms and software in vehicles used to detect driver distraction or road rage.

Off-the-shelf emotion detection from Google

Google has also grappled with decisions about incorporating computer vision-based AI that attempts to gauge the likelihood that a person is expressing certain emotions or facial characteristics.

The company’s Cloud Vision API includes “pre-trained Vision API models to detect emotion, understand text, and more,” according to a company description. The system rates the likelihood that a face in an image is expressing anger, joy, sorrow and surprise on a scale from “unknown” or “very unlikely” to “very likely.”
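To make that scale concrete, here is a minimal Python sketch of how an application might read such ratings. It assumes the likelihood labels and per-emotion field names that appear in Google's public REST documentation for face annotations. The `perceived_emotions` helper, the threshold choice and the `sample_face` values are hypothetical and are not real API output. The sketch makes no network calls.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Likelihood buckets, ordered from least to most likely."""
    UNKNOWN = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


# The four emotions the article says the API rates, mapped to the
# field names assumed for a face annotation in the JSON response.
EMOTION_FIELDS = {
    "anger": "angerLikelihood",
    "joy": "joyLikelihood",
    "sorrow": "sorrowLikelihood",
    "surprise": "surpriseLikelihood",
}


def perceived_emotions(face, threshold=Likelihood.LIKELY):
    """Return the emotions rated at or above `threshold`, sorted by name.

    `face` is a dict shaped like one face annotation. Missing fields
    are treated as UNKNOWN. This helper is illustrative only.
    """
    rated = []
    for emotion, field in EMOTION_FIELDS.items():
        rating = Likelihood[face.get(field, "UNKNOWN")]
        if rating >= threshold:
            rated.append(emotion)
    return sorted(rated)


# Hypothetical annotation for a single face (made-up values).
sample_face = {
    "angerLikelihood": "VERY_UNLIKELY",
    "joyLikelihood": "VERY_LIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
    "surpriseLikelihood": "POSSIBLE",
}

print(perceived_emotions(sample_face))
print(perceived_emotions(sample_face, threshold=Likelihood.POSSIBLE))
```

Lowering the threshold widens the set of emotions reported for the same face. The output is a set of likelihood buckets, not a single measured emotional state.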

Google also includes a feature in its ML Kit tool for mobile apps that detects facial “landmarks” and classifies facial characteristics to show whether or not someone has their eyes open or is smiling.

Even though its own documentation sometimes claims its software “detects emotion,” a Google spokesperson played down that idea, noting that its Vision API does not detect expressed emotions; rather it predicts the perception of facially expressed emotions.

The validity of emotion AI has been heavily scrutinized and often raises ethical concerns. Advocacy groups including the AI Now Institute and Brookings Institution have called for bans on the technology for certain use cases.

After Protocol reported that virtual meeting platform Zoom was interested in using emotion recognition, more than 25 human and digital rights organizations including the American Civil Liberties Union, Electronic Privacy Information Center and Fight for the Future demanded that the company end any plans to use it.

Google declined an interview for this story, instead pointing to a 2021 Reuters report. According to that report, following an internal ethics review, the company decided against adding capabilities to its Cloud Vision API tool to detect the likelihood of emotions other than anger, joy, sorrow and surprise. The review group “determined that inferring emotions could be insensitive because facial cues are associated differently with feelings across cultures, among other reasons.”

Mitchell told Protocol that during her time working for Google, she was part of the group that helped convince the company not to expand Cloud Vision API’s features to infer emotional states beyond the original four.

Mitchell, who co-led Google’s ethical AI team, was fired from the company in February 2021 following a company investigation into violations of security policies for moving company files. Her departure followed another high-profile firing of her AI ethics team co-lead Timnit Gebru. Gebru was fired in part as a result of conflict over a research paper questioning the environmental, financial and societal costs of large-language machine-learning models.

It is unclear whether Google’s decision to limit the tool to four emotions avoids inaccuracies. For instance, when one researcher tested Google’s Cloud Vision API in 2019, he used it to assess the faces of a group of children in a photo. At the moment the snapshot was taken, every child but one boy was smiling. The system appeared to default to a reading that may have been incorrect, determining that the boy’s face was expressing “sorrow with a confidence of 78%.”

Reductive accessibility

Researchers are pushing to advance emotion AI. At the international Computer Vision and Pattern Recognition conference held in New Orleans in June, for example, accepted research papers included work on facial expression recognition and facial landmark detection.

“We’re just breaking the surface, and everywhere I turn there’s more and more [emotion AI] developing,” said Nirit Pisano, chief psychology officer at Cognovi Labs, which provides emotion AI technology to advertisers and pharmaceutical-makers that use it to determine responses to marketing messages and to understand how people feel about certain drugs.

“I definitely see its uses, and I also envision many of its misuses. I think the mission of a company is really critical,” Pisano said.

Microsoft said its decision to continue use of emotion recognition in Seeing AI will help advance its accessibility mission. “Microsoft remains committed to supporting technology for people with disabilities and will continue to use these capabilities in support of this goal by integrating them into applications such as Seeing AI,” wrote Sarah Bird, principal group product manager at Microsoft’s Azure AI, in a company blog post last month.

Gebru, who has a Ph.D. in computer vision, is critical of emotion recognition technology, which uses computer vision to detect facial data. She told Protocol that although “there are many times where access is used as a reason” for emotion recognition — such as to improve accessibility for people with vision impairment — whether it can be beneficial “all depends on what people in that community have said.”

Seeing AI is available only to people with Apple devices, even though users have asked the company to create an Android version of the app. “Currently on the Android side, there are alternatives such as Speak, [Supersense], Envision AI, Kibo etc, but that’s no excuse for not having Seeing AI,” wrote a Reddit user in the r/Blind community two years ago.

The Seeing AI app gets positive reviews online; however, rather than using it to detect emotion, some people seem more interested in using it to accomplish tasks such as deciphering the denomination of paper money, helping to read mail and determining whether food in the fridge has expired.

Cristian Sainz uses Seeing AI at home to scan the bar code of a jar of peaches from his fridge. Photo: Microsoft

Still, tools such as Seeing AI can help someone who cannot see navigate a conversation by picking up on cues they miss when people only nod or make facial expressions rather than audibly communicating.

“Deafblind people face higher levels of depression in part because ableist barriers often exclude us from conversations,” wrote Haben Girma, a deafblind human rights lawyer, in a June tweet noting the benefits of providing details about imagery that can help people “receive the emotional message through words.”

Even if only used to assist people with vision loss, Mitchell said there could be better ways to build emotion recognition AI. Labeling facial expressions to indicate emotions that are then spoken by an app’s computerized voice may not be the most helpful approach, she said, suggesting that things like electronic pulses or tones could be used instead to convey the visible facial expressions in a way that could be more clearly understood.

“It doesn’t actually need to be the case for blind people that they need to have this point of discrete categorization,” Mitchell said. “It seems to be an undesirable bottleneck and a reductive form of signal processing that doesn’t actually need to be there for someone who is blind.”


Kate Kaye

Kate Kaye is an award-winning multimedia reporter digging deep and telling print, digital and audio stories. She covers AI and data for Protocol. Her reporting on AI and tech ethics issues has been published in OneZero, Fast Company, MIT Technology Review, CityLab, Ad Age and Digiday and heard on NPR. Kate is the creator of RedTailMedia.org and is the author of "Campaign '08: A Turning Point for Digital Media," a book about how the 2008 presidential campaigns used digital media and data.
