Coronavirus is AI moderation’s big test. Don’t expect flying colors.
The pandemic forced some social networks to send home content moderators and instead rely on AI. What happens now?
Tech CEOs have said for years that artificial intelligence is the future of content moderation. They never expected that the future would come so soon.
As the coronavirus outbreak sweeps the planet, Facebook, YouTube and Twitter have sent their legions of global content moderators home. That leaves the task of identifying and removing hate speech, child sexual abuse imagery, terrorist propaganda, violent imagery and many other types of offensive content largely to the algorithms these companies have been steadily building over the last few years, even as they built their content moderation armies.
What this AI-moderated future will look like is unclear. But for Facebook, moderation will be "a little less effective in the near-term" as a result of the changes, Mark Zuckerberg said Thursday on a call with reporters. If the world learns anything from this unavoidable test of automated moderation, it will be that artificially intelligent systems still have a lot to learn.
Facebook has committed to sending all of its contract workers home, but will continue to pay them for the foreseeable future. That's not a decision the company would make lightly: Over the last few years, Facebook has invested massively in contracting content moderators around the world. But watching beheading videos and suicide attempts on a screen all day is dangerous enough for moderators' mental health when they're working in an office; imagine doing it at home, potentially with family around. "Working from home on those types of things, that will be very challenging to enforce that people are getting the mental health support that they needed," Zuckerberg said. And that's to say nothing of the privacy concerns of doing the work from home.
The decision is no less challenging for YouTube, which is sending thousands of global moderators home as well and leaning into algorithmic moderation. "We will temporarily start relying more on technology to help with some of the work normally done by reviewers," the company announced Monday in a blog post. "This means automated systems will start removing some content without human review, so we can continue to act quickly to remove violative content."
Twitter, meanwhile, will institute a "triage system" to prioritize the most harmful content as its moderation team goes home. Twitter hasn't yet shared additional details on how, exactly, that will work.
For as much time and money as these companies have spent on developing AI moderation systems, they know — as Zuckerberg conceded — that they're still a long way from perfect.
"We want to be clear: While we work to ensure our systems are consistent, they can sometimes lack the context that our teams bring," echoed Twitter employees Vijaya Gadde and Matt Derella in a blog post.
YouTube's past efforts to automatically crack down on banned content have led to massive takedowns of videos that researchers and educators relied on to document, for example, the war in Syria.
Tech giants' transparency reports show that these machine learning systems already have varying levels of success in identifying different kinds of problematic content. For instance, Facebook is fairly successful at spotting terrorist content. In the second and third quarters of last year, it removed 98% of terrorist propaganda automatically, before a single user reported it. It's less successful when it comes to content that's more open to interpretation. During the same period, Facebook proactively removed just 80% of hate speech-related content before any users reported it.
"One of the things that's really, really hard — and has always been hard — is when people post bad content that's removable, but they post it in protest or to raise awareness," said Kate Klonick, an assistant professor of law at St. John's University. "Generally, the biggest threat is going to be over-censorship rather than under-censorship."
This period will undoubtedly see some overzealous removal of content and longer wait times for users appealing the platforms' decisions. For that reason, the tech giants are also tinkering with the rules they've clung to in the past.
YouTube has said that during this period, it won't issue strikes on content except for cases in which the platform has "high confidence that it's violative." Typically, a user receives one warning, then up to three strikes before a channel is permanently removed from YouTube. This switch, of course, opens YouTube up to the type of accusations of bias that its strike system has sought to avoid in the past. The company will also take extra precautions with livestreams, preventing some unreviewed content from being searchable or available via the homepage or recommendations feature.
Twitter, meanwhile, has expanded its definition of "harm" to include content related to COVID-19 "that goes directly against guidance from authoritative sources of global and local public health information." And, like YouTube, it's also adjusting its rules so that it will no longer suspend accounts permanently based on automated enforcement.
Despite the inevitable challenges, there are those, like Marco Iansiti, professor at Harvard Business School and author of "Competing in the Age of AI," who believe this episode will "drive even more innovation on the AI front," referencing the advances in manufacturing and air travel that followed World War II.
Some of those advancements were already underway. Last fall, for instance, Facebook explained a new capability it's deploying known as Whole Post Integrity Embeddings, which allows its AI systems to simultaneously analyze the entirety of a post (the images and the text) for signs of a violation. This can be especially helpful for posts where context is key, like, say, illegal drug sales. That particular innovation seems to be making a difference: Facebook reported deleting about 4.4 million pieces of drug sale content in the third quarter of 2019, 97.6% of which was proactively detected. That's compared with just 841,000 pieces of the same type of content in the first quarter of that year, of which 84% was flagged by automation.
These developments will be critical to Facebook and other tech giants as they try to combat a globe's worth of problematic content without their usual first line of defense. Yet, there are some categories of forbidden content that are simply too risky to leave up to technology, meaning all of these companies will continue to have at least some human beings screening the most harmful posts.
Twitter said it's working on providing wellness resources to moderators so that they can continue to work from home. Going forward, Facebook will task some of its full-time employees with moderating the most imminently dangerous content, like posts related to suicide and self-injury, while it doesn't have its army of contractors on hand. "We basically shifted our work, so that it's more full-time work force, and we're surging some of that, which inherently is going to create a trade-off against some other types of content that may not have as imminent physical risks for people," Zuckerberg said.
That may mean that some of Facebook's full-time employees will continue to work on these issues in person, Zuckerberg said, "in the same way that first responders and health organizations or the police need to act to work on different threats." Facebook says that these full-time employees will all have the option to work from home, and those who do will have weekly one-on-one check-ins with licensed therapists. They'll also have weekly virtual group therapy sessions.
Even as tech companies try to ensure the safety of their workers, then, they need to balance the safety of their billions of users, too. And machine learning alone isn't yet up to that task.