Why AI fairness tools might actually cause more problems

Important nuances were lost in translation when a rule commonly used to measure disparate impacts on protected groups in hiring was codified for easy-to-use tools promising AI fairness and bias removal.


Salesforce uses it. So do H2O.ai and other AI tool makers. But instead of detecting the discriminatory impact of AI used for employment and recruitment, the “80% rule” — also known as the 4/5 rule — could be introducing new problems.

In fact, AI ethics researchers say harms that disparately affect some groups could be exacerbated as the rule is baked into tools used by machine-learning developers hoping to reduce discriminatory effects of the models they build.

“The field has amplified the potential for harm in codifying the 4/5 rule into popular AI fairness software toolkits,” wrote researchers Jiahao Chen, Michael McKenna and Elizabeth Anne Watkins in an academic paper published earlier this year. “The harmful erasure of legal nuances is a wake-up call for computer scientists to self-critically re-evaluate the abstractions they create and use, particularly in the interdisciplinary field of AI ethics.”

The rule has been used by federal agencies, including the Departments of Justice and Labor, the Equal Employment Opportunity Commission and others, as a way to compare the hiring rate of protected groups and white people and determine whether hiring practices have led to discriminatory impacts.

The goal of the rule is to encourage companies to hire protected groups at a rate that is at least 80% of the rate for white men. For example, if the hiring rate for white men is 60% but only 45% for Black people, the ratio of the two rates would be 45:60, or 75%, which falls short of the rule’s 80% threshold. Federal guidance on using the rule for employment purposes has been updated over the years to incorporate other factors.
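The arithmetic the rule describes is simple enough to sketch in a few lines. This is a minimal illustration using the example numbers above; the function name is ours, not from any fairness toolkit:

```python
def four_fifths_ratio(group_rate, reference_rate):
    """Ratio of a protected group's selection rate to the reference group's."""
    return group_rate / reference_rate

# The example above: a 45% hiring rate for Black applicants
# versus a 60% hiring rate for white men.
ratio = four_fifths_ratio(0.45, 0.60)
print(round(ratio, 2))   # 0.75
print(ratio >= 0.80)     # False: fails the 80% threshold
```

That single division is essentially all the codified rule checks, which is why, as the researchers argue, so much of the surrounding legal nuance falls away.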

The use of the rule in fairness tools emerged when computer engineers sought a way to abstract the technique used by social scientists as a foundational approach to measuring disparate impact into numbers and code, said Watkins, a social scientist and postdoctoral research associate at Princeton University’s Center for Information Technology Policy and the Human-Computer Interaction Group.

“In computer science, there’s a way to abstract everything. Everything can be boiled down to numbers,” Watkins told Protocol. But important nuances got lost in translation when the rule was digitized and codified for easy bias-removal tools.

In real-life scenarios, the rule typically serves as the first step in a longer process intended to understand why disparate impact has occurred and how to fix it. However, engineers often use fairness tools at the end of the development process, as a last box to check before a product or machine-learning model ships.

“It’s actually become the reverse, where it’s at the end of a process,” said Watkins, who studies how computer scientists and engineers do their AI work. “It’s being completely inverted from what it was actually supposed to do … The human element of the decision-making gets lost.”

The simplistic application of the rule also misses other important factors weighed in traditional assessments. For instance, researchers typically decide which subsections of the applicant pool should be measured separately under the rule.
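To see why that slicing matters, here is a hedged illustration with made-up hiring counts: the aggregate ratio clears the 80% bar even though one slice of the applicant pool fails it badly.

```python
# Hypothetical hiring counts, sliced by role level: (hired, applicants).
# The groups, levels and numbers are invented for illustration.
counts = {
    ("X", "senior"): (40, 50),
    ("X", "junior"): (10, 50),
    ("Y", "senior"): (2, 10),
    ("Y", "junior"): (40, 90),
}

def rate(hired, total):
    return hired / total

# Aggregate selection rates collapse the level slices into one number per group.
totals = {}
for (group, _level), (hired, total) in counts.items():
    h, t = totals.get(group, (0, 0))
    totals[group] = (h + hired, t + total)
agg = {g: rate(h, t) for g, (h, t) in totals.items()}

print(round(agg["Y"] / agg["X"], 2))  # 0.84 -- passes the aggregate 80% check

# But measured only among senior applicants, the same data fails badly.
senior = rate(*counts[("Y", "senior")]) / rate(*counts[("X", "senior")])
print(round(senior, 2))               # 0.25 -- far below the threshold
```

A tool that only computes the aggregate number reports "fair" here; a traditional assessment that asks which subsections to measure would not.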

Other researchers have also examined AI ethics toolkits to understand how well they map to ethics work in practice.

The rule used on its own is a blunt instrument and not sophisticated enough to meet today’s standards, said Danny Shayman, AI and machine-learning product manager at InRule, a company that sells automated intelligence software to employment, insurance and financial services customers.

“To have 19% disparate impact and say that’s legally safe when you can confidently measure disparate impact at 1% or 2% is deeply unethical,” said Shayman, who added that AI-based systems can confidently measure impact in a far more nuanced way.

Model drifting into another lane

But the rule is making its way into tools AI developers use in the hopes of removing disparate impacts against vulnerable groups and detecting bias.

“The 80% threshold is the widely used standard for detecting disparate impact,” notes Salesforce in its description of its bias detection methodology, which incorporates the rule to flag data for possible bias problems. “Einstein Discovery raises this data alert when, for a sensitive variable, the selection data for one group is less than 80% of the group with the highest selection rate.”
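The alert condition described in that quote can be paraphrased in a few lines. This is a hedged sketch of that logic, not Einstein Discovery's actual implementation; the group names and rates are invented:

```python
def bias_alerts(selection_rates, threshold=0.80):
    """Return the groups whose selection rate falls below `threshold` times
    the rate of the group with the highest selection rate."""
    reference = max(selection_rates.values())
    return sorted(g for g, r in selection_rates.items() if r / reference < threshold)

print(bias_alerts({"men": 0.60, "women": 0.45, "nonbinary": 0.55}))  # ['women']
```

Note that the check compares each group against the highest-rate group, and that it only raises a flag; deciding what the flag means is left to the developer.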

H2O.ai also refers to the rule in documentation about how disparate impact analysis and mitigation works in its software.

Neither Salesforce nor H2O.ai responded to requests for comment for this story.

The researchers also argued that translating a rule used in federal employment law into AI fairness tools could divert it into terrain outside the normal context of hiring decisions, such as banking and housing. They said this amounts to epistemic trespassing, or the practice of making judgments in arenas outside one’s area of expertise.

“In reality, no evidence exists for its adoption into other domains,” they wrote regarding the rule. “In contrast, many toolkits [encourage] this epistemic trespassing, creating a self-fulfilling prophecy of relevance spillover, not just into other U.S. regulatory contexts, but even into non-U.S. jurisdictions!”

Watkins’ research collaborators work for Parity, an algorithmic audit company that may benefit from deterring use of off-the-shelf fairness tools. Chen, Parity’s chief technology officer, and McKenna, the company’s data science director, are currently involved in a legal dispute with Parity’s CEO.

Although application of the rule in AI fairness tools can create unintended problems, Watkins said she did not want to demonize computer engineers for using it.

“The reason this metric is being implemented is developers want to do better,” she said. “They are not incentivized in [software] development cycles to do that slow, deeper work. They need to collaborate with people trained to abstract and trained to understand those spaces that are being abstracted.”

