This Big Tech group tried to redefine violent extremism. It got messy.

The Global Internet Forum to Counter Terrorism announced a series of narrow steps it's taking that underscore just how fraught the job of classifying terror online really is.


Erin Saltman is GIFCT's director of programming.

Photo: Paul Morigi/Flickr

A little over a month after the Jan. 6 riot, the tech industry's leading anti-terrorism alliance — a group founded by Facebook, YouTube, Microsoft and Twitter — announced it was seeking ideas for how it could expand its definition of terrorism, which had for years been more or less synonymous with Islamic terrorism. The group, called the Global Internet Forum to Counter Terrorism or GIFCT, had been considering such a shift for at least a year, but the rising threat of domestic extremism, punctuated by the Capitol uprising, made it all the more clear something needed to change.

But after months of interviewing member companies, months of considering academic proposals and months spent mulling the impact of tech platforms on this and other violent events around the world, the group's policies have barely budged. On Monday, in a 177-page report, GIFCT released the first details of its plan, and, well, a radical rethinking of online extremism it is not. Instead, the report lays out a series of narrow steps that underscore just how fraught the job of classifying terror online really is.

Since it was founded in 2017, GIFCT has operated a database that includes known terrorist images and videos, hashed in such a way that member companies can automatically prohibit their users from sharing them. But that database has almost exclusively included content related to terrorist organizations like ISIS and al-Qaeda that have been formally designated as such by the United Nations, creating a massive blind spot for almost all non-Islamic extremism.

Now, GIFCT says it is expanding that database to also include a small, albeit broader, subset of content. That includes hashed PDFs of violent extremist and terrorist manifestos, hashed PDFs of branded terrorist publications and hashed URLs related to terrorist content, which are already being collected by the group Tech Against Terrorism.

Far from a total rewrite of the rules, these changes are admittedly limited, said Erin Saltman, GIFCT's director of programming. "This is incremental, so it can continue to expand from here," Saltman said. "But we also need to expand in ways that we can be transparent, and define it in a way that tech companies can apply it."

Before joining GIFCT, Saltman worked as Facebook's head of counterterrorism and dangerous organizations policy for Europe, the Middle East and Africa. She left the company in January, the same week as the Capitol riot. One month later, GIFCT announced this expansion effort.

Saltman spoke with Protocol about how GIFCT members responded to the group's call to action, what these modest changes can accomplish, and what they can't.

This interview has been lightly edited and condensed.

When I heard that GIFCT was expanding its hashed database, particularly so soon after the Jan. 6 riot, I sort of expected you all to come back with a big list of new additions, including a lot of the extremist organizations and ideologies that have already been banned by big platforms but still aren't recognized by GIFCT. I'm thinking groups like the Oath Keepers or the Boogaloo Bois. Was that naive on my part, or did you also expect there would be a broader set of reforms at the end of this period?

I think that we went in pretty open-minded to ways we could approach expansion, but it had to be first and foremost based off the feedback from our tech company members, because you could say, "I'm going to expand to this," and then if no members actually utilize it, all that hard work, tech and taxonomy that you put towards it means nothing.

Secondly, we really needed multi-stakeholder feedback, because there are polarizing voices in this space. On the one hand, it is very easy to look at the current taxonomy and say, "Having a list-based approach focused on the U.N. has government bias, and it has Islamist extremist terrorist bias." On the other end of the spectrum, you have human rights and civil society activists saying, "Whoa, there is no agreed-upon definition of other forms of terrorism and violent extremism. We do not trust tech to define it, nor do we trust them not to over-censor in this space." There is what lots of people call "lawful, but awful" content out there.

This is not the goal of GIFCT, so we had to navigate those pillars quite heavily. We could do a lot, but it needs to be of utility. You can take two different approaches: You can either lean into list-based approaches — the U.N. list or, to your point, GIFCT could be a list master in and of itself — or you lean into behavior-based approaches. We wanted to expand, seeing where it could be more list-based and where it could be more behavior-based.

You mentioned Oath Keepers and Boogaloo, and gosh, Boogaloo, where do you put it on a spectrum? You could say it's been tied to real-world violence, but some of the Boogaloo movement is ideologically aligned with Black Lives Matter and some of the Boogaloo movement aligns very much with white supremacy and white-power groups. So without that hate-based ideological core, similar to QAnon, some of these groups are very hard to define, and have very loose membership and affiliation structures.

But how important is it to define their ideology as long as you are defining the threat of violence?

Violence and incitement is very broad. GIFCT also wants to ensure we're not going into too much scope creep territory. There are groups where maybe they are extreme and the fringe part of those groups are violent, and it's a big question as to whether or not the violence is core to the group. There are a lot of groups that have violence attached. So there's a lot of concern of over-censorship, especially when there are such close ties to politics. America has the highest free-speech values of any country I can think of.

But if there's an attack, and there's a manifesto tied to any one of those groups, that would go in [the database]. If they have branded content that has violence and incitement associated, and we see xenophobic tropes attached, that's all going in. But it would be hard to say Boogaloo as a whole is a terrorist group or a violent extremist group. You'd get a lot of pushback from that.

You mentioned the need to get buy-in from these tech companies because utilizing GIFCT's database is voluntary, and if you don't have buy-in, then nobody's going to implement whatever you guys come up with. When you were talking to these companies, how much appetite was there among them to have GIFCT broaden its approach and come up with a new set of rules, or are they sort of content to write their own rules?

There's no homogenous internet, so when we say, "the internet" or "platforms," there's a huge amount of diversity. We still want companies to maintain their own terms of service, their own policies and practices. Their visibility of what bad actors look like and feel like on their platforms is going to be very different. It looks very different if I talked to some of our end-to-end encrypted platforms. They have metadata signals. They have some content, if you look at profile pictures or group pictures, but they do not have the content of chats. So that's one side of the spectrum, versus some of our bigger social media platforms that are meant to be public, meant to be loudspeakers, if you will.

Also, lots of our platforms increasingly aren't about user-generated content. So, emails are not explicitly MailChimp user-generated content. Airbnb is not user-generated content focused, so what this looks like to them will look different.

This is part of the reason we went towards hashing URLs, because you might not post content, but people share content that's hosted on third-party platforms quite often. You don't have that source of content, so if you can surface [the URL] as a known signal, it leads you to review and say: "Hey, maybe this is something I need to look more closely at."

But we want to really maintain that this is not a cartel. Everyone has their independent policies and practices, some have more human resources than others. And so what we need to do is constantly have frameworks where they can plug and play in a way that works best for them.

But was there any overt opposition to expanding?

There was no opposition to the concept of expansion, but there's definitely a lot of concern for tech companies that if we expand, they want to make sure they can actually apply what we're expanding to.

In your paper, you said GIFCT needs to consider the size, the manpower, the expertise of its member companies. My understanding of GIFCT has been that it is this entity that combines the expertise of the industry and hashes the content so that then you make that sophistication available to all of the companies in the group. So why would the smallness of any of your members stand in the way of you giving them more to work with?

Giving them more to work with means a couple different things. Giving them more could mean more types of content that we hash, or giving them more could mean different types of technology for them to integrate with. That can be overwhelming, especially the latter. Part of this is about taking hashing as a concept, which is traditionally just focused on images and videos, and realizing that when we look at how the threat is manifesting, we need to go beyond just our lens of image and video. So we've asked about some training for the companies, so that when we give them this bright shiny new thing, they know what to do with it and they feel comfortable with it.

We are going to have to think through: What's the next phase? Are we looking at things like logo detection? Are we looking at better ways to share language translation and language processing? Especially for smaller companies, a lot of the content is not necessarily in English or the one or two languages they do have covered.

I could understand how the technical integration might be challenging, but behind that tool is the definition of the content that you're going to hash. So I guess I'm not totally grasping why the definitional part of it is tougher for a small company.

I think a lot of tech platforms, especially when they are smaller, lean in on what is illegal content. Nation-states everywhere in the world have similar or slightly varying definitions, and they also hold lists. So a company knows: When I take this down, I have the legal backing from a government entity for why this is terrorism and this should be removed.

As soon as you go above and beyond that — and some of the bigger companies have — that is a big task, and a lot of smaller companies don't feel comfortable. One of the immediate things that ends up happening is a question of: Are you over-censoring? So that's why, when we're incrementally building out, we're tying it to overt real-world harm, overt ways that violent extremism manifests. That's not wishy-washy.

So it's more of a comfort-level thing than an ability thing.

We see companies criticized all the time for over-censorship and political bias, even if they say "this was pure hate speech" or "this was purely against our policies." Especially smaller companies maybe don't have the money towards legal fees if they get sued, or don't have the in-house expertise.

When I joined Facebook I was the second hire on the dangerous organizations policy team. That took a big company quite a long time. So if you're a company of 50, your 51st or 100th or even 1,000th hire is usually not a counterterrorism expert. So there's also that discomfort of, if we don't know exactly what we're talking about in-house, we are going to need a big amount of help.

It could be the case in the future that GIFCT decides to be a list-maker, in and of itself, but we would need more staff for that. There are a lot of things to think about, although that would be exciting.

So getting to the actual substance of what you're adding to the database: manifestos, hashed URLs, branded terrorist publications. I think a lot of people would probably be shocked to find out that that's not already part of the database. So, having worked at Facebook, can you give a sense of how widespread that specific type of content was, or how helpful this expansion is going to be to addressing the overall problem with violent extremism online, which obviously extends far beyond manifestos and official publications?

The URLs are a really interesting one. We have tested URL-sharing before. URLs are inherently tied to personally identifiable information. We're very wary of sharing private data. But when you hash something, it can act as a signal.

The bad content is usually not hosted on Facebook or Twitter. It is shared via a URL, and you are not psychic as a tech company. You do not inherently know what that URL links to. A lot of moderation teams are told to avoid click-throughs, because you don't know if malware is attached, and you don't have the time and scale to monitor third-party platforms. So by hashing URLs, we're giving them a wider net. That's a big deal, especially for the less-social media sites, or even things like comments. You might have benign pictures in a post, and then all of a sudden, it's the comments underneath it that lead you down a rabbit hole and are sharing URLs that lead you off-platform.

As for the more controversial or illegal content on manifestos, this is very much getting at the pointy edge of the subcultures. It is increasingly trendy for certain attackers to release lengthy manifestos just before carrying out an attack. There are huge issues around those going viral within supportive subcultures, and they are coveted by certain groups and pointed to and referenced. The Christchurch attacker, his manifesto was referenced elsewhere by individuals that then went and carried out violence. So we know manifestos are a problem. It also, a lot of times, gets us to the white supremacy and neo-Nazi groups. So if you're using that as a signal, it often leads you to, who is posting that?

And then branded content: We can start with a U.N. list, but Siege and other forms of neo-Nazi zines are very much in the subcultures online, and that's something that we can work with experts and researchers on to add and incorporate in a public way to this database.

So it might not be about the fringy "lawful, but awful" content. But it really homes in on how these core members of violent extremist groups manifest and share online.

Do you think that these changes get us any further from the bias that you guys write about toward Islamic terrorism that has dominated this field and certainly dominated the database for a while? Do these changes equalize it, or not quite?

I don't think it'll really be called equal. It gets us out of a list-based approach, in some respects, and looks at the behavior. And I think that's important. Government definitions of terrorism in theory are agnostic to any one religion or ideology. It's about the violence. It's about the target and the motive. And yet, the lists don't manifest that way. The lists are quite biased. And that's not just the U.S., that's all over the world. So this allows us to take a more holistic and behavior-based approach.

In all these different academic proposals that you guys collected, one that I thought was interesting was the idea that GIFCT should use its convening power to try to get member companies to standardize their terms of service and create some kind of unified list. The researchers basically said, "We don't propose this lightly." It would definitely cause some reputational and legal and security risks, but the other risks are greater. So I wonder what you think about that idea.

That's something for us to consider in the long term, for sure, but again, it can't be done without the right staffing and due diligence behind the scenes.

I think that there's a big difference between having standardized terms of service versus having GIFCT hold a list that goes above and beyond government lists. Standardized terms of service is like asking governments to have a formal approved definition of terrorism. One reason the United Nations does not have an agreed-upon definition of terrorism is because governments couldn't decide whether or not a government could be a terrorist entity. So even at the U.N. level, you could not have an agreed-upon definition of terrorism, although they do still have frameworks and designation lists. We're seeing companies being held to the same standard or above, being asked to go above and beyond what governments are able to do, which makes companies a little wary to be that powerful gatekeeper.

I think the list-based approach is something that we need to consider, but it has to be done in a way that is definable, scalable and explainable. We can't just say: "Oh, we're expanding this just because of trends in the U.S. or trends in a couple Western countries."

That means there's a lot of consideration behind the scenes. There are some weird groups out there. I highly recommend looking at the Mongolian Nazi Party green movement. It's a strange one. The founder owns a lingerie shop. It's weird. So when you do open that door, you need to not have geographic biases when we're meant to be a global entity.

One of the things you wrote that was interesting was that the rise of live audio platforms was going to make a lot of this harder. Facebook is just coming out with one. Twitter is, too. Obviously, there's Clubhouse. Do you think it's irresponsible for these companies to be forging ahead with this technology, knowing what they know about how other mediums have been abused? I mean, it's all well and good for Mark Zuckerberg to say, "I'd never expected this to happen when I started Facebook," but now you know what's going to happen. So is this irresponsible?

My job has always been to work with tech companies when they say, "We created this bright, shiny new thing," and my job has always been to say: "Here's 101 ways that that's going to be used horribly." And that does not mean it shouldn't be done. It really depends on what you think you're solving for and if you have a safety-by-design approach. Everything could be misappropriated or re-appropriated for bad. And if you're always solving for that low-prevalence, high-risk, we would have zero innovation.

Your last week at Facebook was the week of the Jan. 6 riot. That was a real turning point in how people in the U.S. at least were talking about and understanding the amount that had been done to thwart domestic extremism versus foreign extremism, and obviously there's been tons of attention in the court cases around what the platforms missed and what the FBI missed. So from your perspective at Facebook, where did the companies' defenses fall short? Where were the biggest blind spots, and do you think any of these changes will do anything to address those shortcomings?

There is an interesting nexus between what we call terrorism and violent extremism versus what we start calling inauthentic coordinated behavior or purely violence and incitement. So you see this overlap diagram of different harm types coming to the fold. Sometimes it's not as much about being able to clearly say, "This is a violent extremist group," as saying, "OK, we're now seeing violence and incitement in language and in certain chat threads, and that's a different type of risk."

Government [calls] it "left of boom." It's very hard for governments to accurately get at entities "left of boom" before there is real-world violence. Just like Charlottesville, on Jan. 6, there was a huge push to do more. Before that, there's usually a huge push to do less, and criticism of over-censorship and going above and beyond [the] government.

Most governments are not good at designating domestic entities, that's not just the U.S. And so tech companies are still kind of saying: "OK, well for the individuals that carried out violence, am I focused on the violent individuals, or am I taking a group stance which goes way above and beyond what the government is willing to do?" And that still remains an issue, but there are turning points.

I think a lot changed after the Christchurch attacks. Before that, I think a lot changed with Anders Behring [Breivik] and the Norway attacks, of which we have the anniversary right now. When I was tracking the Norway attacks, before they knew who the attacker was, all the headlines globally said: "Norway's 9/11," "Al Qaeda attack in Norway," and assumed terrorism. As soon as they realized it was a white individual, not linked to Islamist extremist terrorism, all the language changed to "lone gunman," and very few outlets called it terrorism.

It was terrorism. So we also, globally, have an issue of typecasting what it is to be a terrorist. And that is not just a tech company issue. That's something that society needs to come to terms with. We're much better at labeling terrorism when it is an othering process. It's difficult to label terrorism when it is us.

