Policy

This Big Tech group tried to redefine violent extremism. It got messy.

The Global Internet Forum to Counter Terrorism announced a series of narrow steps it's taking that underscore just how fraught the job of classifying terror online really is.

Erin Saltman is GIFCT's director of programming.

Photo: Paul Morigi/Flickr

A little over a month after the Jan. 6 riot, the tech industry's leading anti-terrorism alliance — a group founded by Facebook, YouTube, Microsoft and Twitter — announced it was seeking ideas for how it could expand its definition of terrorism, which had for years been more or less synonymous with Islamic terrorism. The group, called the Global Internet Forum to Counter Terrorism or GIFCT, had been considering such a shift for at least a year, but the rising threat of domestic extremism, punctuated by the Capitol uprising, made it all the more clear something needed to change.

But after months of interviewing member companies, months of considering academic proposals and months spent mulling the impact of tech platforms on this and other violent events around the world, the group's policies have barely budged. On Monday, in a 177-page report, GIFCT released the first details of its plan, and, well, a radical rethinking of online extremism it is not. Instead, the report lays out a series of narrow steps that underscore just how fraught the job of classifying terror online really is.

Since it was founded in 2017, GIFCT has operated a database that includes known terrorist images and videos, hashed in such a way that member companies can automatically prohibit their users from sharing them. But that database has almost exclusively included content related to terrorist organizations like ISIS and al-Qaeda that have been formally designated as such by the United Nations, creating a massive blind spot for almost all non-Islamic extremism.

Now, GIFCT says it is expanding that database to also include a small, albeit broader, subset of content. That includes hashed PDFs of violent extremist and terrorist manifestos, hashed PDFs of branded terrorist publications and hashed URLs related to terrorist content, which are already being collected by the group Tech Against Terrorism.
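To make the mechanics concrete, here is a minimal sketch of how a shared list of hashed URLs could work as a review signal. It is an illustration only, not GIFCT's actual system: the use of SHA-256, the normalization rules, the example URL and the function names `normalize_url`, `hash_url` and `flag_for_review` are all assumptions made for this example.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase the scheme and host and drop the fragment (assumed rules)."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def hash_url(url: str) -> str:
    """Return a SHA-256 hex digest of the normalized URL (assumed hash choice)."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()


# Hypothetical shared list: members exchange only the hashes, not the URLs.
shared_hashes = {hash_url("https://example.org/known-bad-page")}


def flag_for_review(url: str, known_hashes: set) -> bool:
    """True if the URL's hash is on the shared list, so a human can review it."""
    return hash_url(url) in known_hashes


print(flag_for_review("HTTPS://Example.org/known-bad-page#top", shared_hashes))
print(flag_for_review("https://example.org/some-other-page", shared_hashes))
```

In this sketch a match does not remove anything automatically; it only surfaces a link for closer review, which each platform would handle under its own policies.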

Far from a total rewrite of the rules, these changes are admittedly limited, said Erin Saltman, GIFCT's director of programming. "This is incremental, so it can continue to expand from here," Saltman said. "But we also need to expand in ways that we can be transparent, and define it in a way that tech companies can apply it."

Before joining GIFCT, Saltman worked as Facebook's head of counterterrorism and dangerous organizations policy for Europe, the Middle East and Africa. She left the company in January, the same week as the Capitol riot. One month later, GIFCT announced this expansion effort.

Saltman spoke with Protocol about how GIFCT members responded to the group's call to action, what these modest changes can accomplish, and what they can't.

This interview has been lightly edited and condensed.

When I heard that GIFCT was expanding its hashed database, particularly so soon after the Jan. 6 riot, I sort of expected you all to come back with a big list of new additions, including a lot of the extremist organizations and ideologies that have already been banned by big platforms but still aren't recognized by GIFCT. I'm thinking groups like the Oath Keepers or the Boogaloo Bois. Was that naive on my part, or did you also expect there would be a broader set of reforms at the end of this period?

I think that we went in pretty open-minded to ways we could approach expansion, but it had to be first and foremost based off the feedback from our tech company members, because you could say, "I'm going to expand to this," and then if no members actually utilize it, all that hard work, tech and taxonomy that you put towards it means nothing.

Secondly, we really needed multi-stakeholder feedback, because there are polarizing voices in this space. On the one hand, it is very easy to look at the current taxonomy and say, "Having a list-based approach focused on the U.N. has government bias, and it has Islamist extremist terrorist bias." On the other end of the spectrum, you have human rights and civil society activists saying, "Whoa, there is no agreed-upon definition of other forms of terrorism and violent extremism. We do not trust tech to define it, nor do we trust them not to over-censor in this space." There is what lots of people call "lawful, but awful" content out there.

This is not the goal of GIFCT, so we had to navigate those pillars quite heavily. We could do a lot, but it needs to be of utility. You can take two different approaches: You can either lean into list-based approaches — the U.N. list or, to your point, GIFCT could be a list master in and of itself — or you lean into behavior-based approaches. We wanted to expand, seeing where it could be more list-based and where it could be more behavior-based.

You mentioned Oath Keepers and Boogaloo, and gosh, Boogaloo, where do you put it on a spectrum? You could say it's been tied to real-world violence, but some of the Boogaloo movement is ideologically aligned with Black Lives Matter and some of the Boogaloo movement aligns very much with white supremacy and white-power groups. So without that hate-based ideological core, similar to QAnon, some of these groups are very hard to define, and have very loose membership and affiliation structures.

But how important is it to define their ideology as long as you are defining the threat of violence?

Violence and incitement is very broad. GIFCT also wants to ensure we're not going into too much scope creep territory. There are groups where maybe they are extreme and the fringe part of those groups are violent, and it's a big question as to whether or not the violence is core to the group. There are a lot of groups that have violence attached. So there's a lot of concern of over-censorship, especially when there are such close ties to politics. America has the highest free-speech values of any country I can think of.

But if there's an attack, and there's a manifesto tied to any one of those groups, that would go in [the database]. If they have branded content that has violence and incitement associated, and we see xenophobic tropes attached, that's all going in. But it would be hard to say Boogaloo as a whole is a terrorist group or a violent extremist group. You'd get a lot of pushback from that.

You mentioned the need to get buy-in from these tech companies because utilizing GIFCT's database is voluntary, and if you don't have buy-in, then nobody's going to implement whatever you guys come up with. When you were talking to these companies, how much appetite was there among them to have GIFCT broaden its approach and come up with a new set of rules, or are they sort of content to write their own rules?

There's no homogeneous internet, so when we say, "the internet" or "platforms," there's a huge amount of diversity. We still want companies to maintain their own terms of service, their own policies and practices. Their visibility into what bad actors look like and feel like on their platforms is going to be very different. It looks very different if I talk to some of our end-to-end encrypted platforms. They have metadata signals. They have some content, if you look at profile pictures or group pictures, but they do not have the content of chats. So that's one side of the spectrum, versus some of our bigger social media platforms that are meant to be public, meant to be loudspeakers, if you will.

Also, lots of our platforms increasingly aren't about user-generated content. So, MailChimp emails are not explicitly user-generated content. Airbnb is not focused on user-generated content, so what this looks like to them will look different.

This is part of the reason we went towards hashing URLs, because you might not post content, but people share content that's hosted on third-party platforms quite often. You don't have that source of content, so if you can surface [the URL] as a known signal, it leads you to review and say: "Hey, maybe this is something I need to look more closely at."

But we want to really maintain that this is not a cartel. Everyone has their independent policies and practices, some have more human resources than others. And so what we need to do is constantly have frameworks where they can plug and play in a way that works best for them.

But was there any overt opposition to expanding?

There was no opposition to the concept of expansion, but there's definitely a lot of concern for tech companies that if we expand, they want to make sure they can actually apply what we're expanding to.

In your paper, you said GIFCT needs to consider the size, the manpower, the expertise of its member companies. My understanding of GIFCT has been that it is this entity that combines the expertise of the industry and hashes the content so that then you make that sophistication available to all of the companies in the group. So why would the smallness of any of your members stand in the way of you giving them more to work with?

Giving them more to work with means a couple different things. Giving them more could mean more types of content that we hash, or giving them more could mean different types of technology for them to integrate with. That can be overwhelming, especially the latter. Part of this is about taking hashing as a concept, which is traditionally just focused on images and videos, and realizing that when we look at how the threat is manifesting, we need to go beyond just our lens of image and video. So we've asked about some training for the companies, so that when we give them this bright shiny new thing, they know what to do with it and they feel comfortable with it.

We are going to have to think through: What's the next phase? Are we looking at things like logo detection? Are we looking at better ways to share language translation and language processing? Especially for smaller companies, a lot of the content is not necessarily in English or the one or two languages they do have covered.

I could understand how the technical integration might be challenging, but behind that tool is the definition of the content that you're going to hash. So I guess I'm not totally grasping why the definitional part of it is tougher for a small company.

I think a lot of tech platforms, especially when they are smaller, lean in on what is illegal content. Nation-states everywhere in the world have similar or slightly varying definitions, and they also hold lists. So a company knows: When I take this down, I have the legal backing from a government entity for why this is terrorism and this should be removed.

As soon as you go above and beyond that — and some of the bigger companies have — that is a big task, and a lot of smaller companies don't feel comfortable. One of the immediate things that ends up happening is a question of: Are you over-censoring? So that's why, when we're incrementally building out, we're tying it to overt real-world harm, overt ways that violent extremism manifests. That's not wishy-washy.

So it's more of a comfort-level thing than an ability thing.

We see companies criticized all the time for over-censorship and political bias, even if they say "this was pure hate speech" or "this was purely against our policies." Especially smaller companies maybe don't have the money towards legal fees if they get sued, or don't have the in-house expertise.

When I joined Facebook I was the second hire on the dangerous organizations policy team. That took a big company quite a long time. So if you're a company of 50, your 51st or 100th or even 1,000th hire is usually not a counterterrorism expert. So there's also that discomfort of, if we don't know exactly what we're talking about in-house, we are going to need a big amount of help.

It could be the case in the future that GIFCT decides to be a list-maker, in and of itself, but we would need more staff for that. There are a lot of things to think about, although that would be exciting.

So getting to the actual substance of what you're adding to the database: manifestos, hashed URLs, branded terrorist publications. I think a lot of people would probably be shocked to find out that that's not already part of the database. So, having worked at Facebook, can you give a sense of how widespread that specific type of content was, or how helpful this expansion is going to be to addressing the overall problem with violent extremism online, which obviously extends far beyond manifestos and official publications?

The URLs are a really interesting one. We have tested URL-sharing before. URLs are inherently tied to personally identifiable information. We're very wary of sharing private data. But when you hash something, it can act as a signal.

The bad content is usually not hosted on Facebook or Twitter. It is shared via a URL, and you are not psychic as a tech company. You do not inherently know what that URL links to. A lot of moderation teams are told to avoid click-throughs, because you don't know if malware is attached, and you don't have the time and scale to monitor third-party platforms. So by hashing URLs, we're giving them a wider net. That's a big deal, especially for the less-social media sites, or even things like comments. You might have benign pictures in a post, and then all of a sudden, it's the comments underneath that lead you down a rabbit hole and share URLs that lead you off-platform.

As for the more controversial or illegal content on manifestos, this is very much getting at the pointy edge of the subcultures. It is increasingly trendy for certain attackers to release lengthy manifestos just before carrying out an attack. There are huge issues around those going viral within supportive subcultures, and they are coveted by certain groups and pointed to and referenced. The Christchurch attacker, his manifesto was referenced elsewhere by individuals that then went and carried out violence. So we know manifestos are a problem. It also, a lot of times, gets us to the white supremacy and neo-Nazi groups. So if you're using that as a signal, it often leads you to, who is posting that?

And then branded content: We can start with a U.N. list, but Siege and other forms of neo-Nazi zines are very much in the subcultures online, and that's something that we can work with experts and researchers on to add and incorporate in a public way to this database.

So it might not be about the fringy "lawful, but awful" content. But it really homes in on how these core members of violent extremist groups manifest and share online.

Do you think that these changes get us any further from the bias that you guys write about toward Islamic terrorism that has dominated this field and certainly dominated the database for a while? Do these changes equalize it, or not quite?

I don't think it'll really be called equal. It gets us out of a list-based approach, in some respects, and looks at the behavior. And I think that's important. Government definitions of terrorism in theory are agnostic to any one religion or ideology. It's about the violence. It's about the target and the motive. And yet, the lists don't manifest that way. The lists are quite biased. And that's not just the U.S., that's all over the world. So this allows us to take a more holistic and behavior-based approach.

In all these different academic proposals that you guys collected, one that I thought was interesting was the idea that GIFCT should use its convening power to try to get member companies to standardize their terms of service and create some kind of unified list. The researchers basically said, "We don't propose this lightly." It would definitely cause some reputational and legal and security risks, but the other risks are greater. So I wonder what you think about that idea.

That's something for us to consider in the long term, for sure, but again, it can't be done without the right staffing and due diligence behind the scenes.

I think that there's a big difference between having standardized terms of service versus having GIFCT hold a list that goes above and beyond government lists. Standardized terms of service is like asking governments to have a formal approved definition of terrorism. One reason the United Nations does not have an agreed-upon definition of terrorism is that governments couldn't decide whether or not a government could be a terrorist entity. So even at the U.N. level, you could not have an agreed-upon definition of terrorism, although they do still have frameworks and designation lists. We're seeing companies being held to the same standard or above, being asked to go above and beyond what governments are able to do, which makes companies a little wary to be that powerful gatekeeper.

I think the list-based approach is something that we need to consider, but it has to be done in a way that is definable, scalable and explainable. We can't just say: "Oh, we're expanding this just because of trends in the U.S. or trends in a couple Western countries."

That means there's a lot of consideration behind the scenes. There are some weird groups out there. I highly recommend looking at the Mongolian Nazi Party green movement. It's a strange one. The founder owns a lingerie shop. It's weird. So when you do open that door, you need to not have geographic biases when we're meant to be a global entity.

One of the things you wrote that was interesting was that the rise of live audio platforms was going to make a lot of this harder. Facebook is just coming out with one. Twitter is, too. Obviously, there's Clubhouse. Do you think it's irresponsible for these companies to be forging ahead with this technology, knowing what they know about how other mediums have been abused? I mean, it's all well and good for Mark Zuckerberg to say, "I'd never expected this to happen when I started Facebook," but now you know what's going to happen. So is this irresponsible?

My job has always been to work with tech companies when they say, "We created this bright, shiny new thing," and my job has always been to say: "Here's 101 ways that that's going to be used horribly." And that does not mean it shouldn't be done. It really depends on what you think you're solving for and if you have a safety-by-design approach. Everything could be misappropriated or re-appropriated for bad. And if you're always solving for that low-prevalence, high-risk, we would have zero innovation.

Your last week at Facebook was the week of the Jan. 6 riot. That was a real turning point in how people in the U.S. at least were talking about and understanding the amount that had been done to thwart domestic extremism versus foreign extremism, and obviously there's been tons of attention in the court cases around what the platforms missed and what the FBI missed. So from your perspective at Facebook, where did the companies' defenses fall short? Where were the biggest blind spots, and do you think any of these changes will do anything to address those shortcomings?

There is an interesting nexus between what we call terrorism and violent extremism versus what we start calling inauthentic coordinated behavior or purely violence and incitement. So you see this overlap diagram of different harm types coming to the fold. Sometimes it's not as much about being able to clearly say, "This is a violent extremist group," as saying, "OK, we're now seeing violence and incitement in language and in certain chat threads, and that's a different type of risk."

Government [calls] it "left of boom." It's very hard for governments to accurately get at entities "left of boom" before there is real-world violence. Just as after Charlottesville, after Jan. 6 there was a huge push to do more. Before that, there's usually a huge push to do less, and criticism of over-censorship and going above and beyond [the] government.

Most governments are not good at designating domestic entities; that's not just the U.S. And so tech companies are still kind of saying: "OK, well for the individuals that carried out violence, am I focused on the violent individuals, or am I taking a group stance which goes way above and beyond what the government is willing to do?" And that still remains an issue, but there are turning points.

I think a lot changed after the Christchurch attacks. Before that, I think a lot changed with Anders Behring [Breivik] and the Norway attacks, of which we have the anniversary right now. When I was tracking the Norway attacks, before they knew who the attacker was, all the headlines globally said: "Norway's 9/11," "Al Qaeda attack in Norway," and assumed terrorism. As soon as they realized it was a white individual, not linked to Islamist extremist terrorism, all the language changed to "lone gunman," and very few outlets called it terrorism.

It was terrorism. So we also, globally, have an issue of typecasting what it is to be a terrorist. And that is not just a tech company issue. That's something that society needs to come to terms with. We're much better at labeling terrorism when it is an othering process. It's difficult to label terrorism when it is us.
