As governments turn to tech companies to track the spread of COVID-19, Facebook will share even more anonymized data about its users with researchers in order to show how people move around, how they're connected, and where the virus is likely to spread.
In a blog post published Monday, Facebook announced a series of new mapping tools that will allow approved researchers to analyze mobility patterns and connections among people in different regions. The company also announced that, starting the same day, it will show select users in the U.S. a coronavirus survey from Carnegie Mellon University at the top of their News Feeds. Users can voluntarily complete the survey outside of Facebook to contribute to Carnegie Mellon's research. The survey asks people to report their symptoms and their ZIP codes to help researchers understand where people who might have the virus are located, even if they're not being seen by a doctor.
"The hope is with Facebook's reach and also the fact that people are on Facebook a lot, we're going to get lots of respondents and some really fine-grain geographic information about symptomatic infections," said Ryan Tibshirani, an associate professor of statistics and machine learning at Carnegie Mellon, who is leading the project. Tibshirani stressed that the partnership doesn't give him any access to Facebook users' personal data, and Facebook doesn't get access to their survey results.
Facebook has been working on disease prevention research since 2019 when it launched a series of mapping tools designed to help researchers study things like population density in outbreak areas and mobility patterns in a given region. Since the COVID-19 outbreak began, researchers have been using Facebook's anonymized, aggregated mobility data to study whether people are actually practicing social distancing. This data comes from Facebook users who have location services enabled on their phones. Last week, Google also began releasing state-level and country-level data on social distancing patterns.
But this new batch of mapping tools from Facebook will give researchers insight not just into where people are moving, but how they're connected, as well. In addition to mapping out movement trends in a region or county, Facebook is also offering researchers two other tools based on its user data: colocation maps and a social connectedness index.
Colocation maps reveal the probability that people from one place might come into contact with people from another place. That way, researchers can tell if people from an area with a large outbreak are likely to cross paths with people from another area with fewer cases. As Protocol previously reported, Facebook began piloting these maps in Hong Kong in direct response to the coronavirus outbreak there. Meanwhile, the social connectedness index shows friendship patterns across different states and countries, so researchers can better understand where the virus might spread next.
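The idea behind a colocation map can be illustrated with a toy calculation. The sketch below is not Facebook's method, which this article does not describe. It assumes a hypothetical list of anonymized visit records, each holding a person, a home region, a map tile and a time bin, and it estimates how often a person from one region shares a tile and time bin with a person from another region. All names and data are made up for illustration.

```python
def colocation_probability(visits, region_a, region_b):
    """Toy estimate: share of (person from A, person from B) pairs
    that were in the same tile during the same time bin at least once.

    visits: iterable of (person, home_region, tile, time_bin) tuples.
    This record layout is an assumption, not Facebook's data format.
    """
    places = {}  # person -> set of (tile, time_bin) they were seen in
    home = {}    # person -> home region
    for person, region, tile, time_bin in visits:
        places.setdefault(person, set()).add((tile, time_bin))
        home[person] = region

    people_a = [p for p, r in home.items() if r == region_a]
    people_b = [p for p, r in home.items() if r == region_b]
    pairs = [(x, y) for x in people_a for y in people_b if x != y]
    if not pairs:
        return 0.0

    # A pair counts as colocated if their (tile, time_bin) sets overlap.
    hits = sum(1 for x, y in pairs if places[x] & places[y])
    return hits / len(pairs)


# Hypothetical records: two people from region "A", two from region "B".
visits = [
    ("p1", "A", "tile_1", 0),
    ("p2", "A", "tile_2", 0),
    ("q1", "B", "tile_1", 0),
    ("q2", "B", "tile_3", 0),
]
prob = colocation_probability(visits, "A", "B")
```

In this made-up data, only one of the four cross-region pairs shares a tile at the same time, so the estimate is 0.25. A real system would work on aggregated data across far more people and time bins.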
These tools are already being put to use by a global coalition of researchers called the COVID-19 Mobility Data Network. This network pairs researchers with city, state and even national governments to help them analyze local mobility patterns using location data. This work has already yielded important insights on the successes and failures of social distancing, said Andrew Schroeder, vice president of research and analysis at the nonprofit Direct Relief.
An example of a social connectedness map. Image: Courtesy of Facebook
In California, for instance, the data shows that San Francisco has seen the greatest reduction in mobility and the highest rate of people staying home in the whole state. Meanwhile, areas like Riverside and San Bernardino have seen smaller decreases in mobility. Schroeder chalked that up to the simple fact that people have different types of jobs in those places. While San Francisco's tech workforce can easily work from home, Riverside and San Bernardino are logistics hubs, where many workers can't.
Schroeder said that although it's too early to tell for sure, San Francisco's early adherence to social distancing may be one reason why it's emerged as a "success story" in terms of disease transmission.
"You saw early case detections happen there, but nevertheless, the rate of increase in San Francisco has been, on average, lower than other cities of its size," he said. "Does that have to do with social distancing? Yeah, probably."
In addition to these new mapping tools that Schroeder and others are using, Facebook is also helping Carnegie Mellon's researchers with their survey design. Though no personal data is changing hands between Facebook and Carnegie Mellon, Facebook is helping the researchers ensure they're sampling a representative group by assigning each survey taker a random ID. Once someone has completed the survey, Carnegie Mellon will send Facebook that person's specific ID. Facebook will then tell the researchers how they should weight that response in order to correct for sampling bias.
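A rough sketch of how such an exchange could work is below. It is an illustration under stated assumptions, not Carnegie Mellon's or Facebook's actual code: the function names, the ID format and the weights are all hypothetical. The platform hands out a random ID that carries no profile data, the researchers collect answers keyed by that ID, and the platform later returns one weight per ID that the researchers use in a weighted average.

```python
import secrets


def issue_survey_id() -> str:
    """Platform side (hypothetical): a random ID with no profile data in it."""
    return secrets.token_hex(8)


def weighted_share(responses, weights):
    """Researcher side: weighted fraction of respondents reporting symptoms.

    responses: dict of survey ID -> True/False (reported symptoms)
    weights:   dict of survey ID -> weight returned by the platform
    """
    total = sum(weights[survey_id] for survey_id in responses)
    if total == 0:
        raise ValueError("weights sum to zero")
    symptomatic = sum(
        weights[survey_id]
        for survey_id, has_symptoms in responses.items()
        if has_symptoms
    )
    return symptomatic / total


# Hypothetical data: three respondents, one of them from a group that is
# under-represented in the sample and therefore given a larger weight.
responses = {"a1": True, "b2": False, "c3": False}
weights = {"a1": 2.0, "b2": 1.0, "c3": 1.0}

unweighted = sum(responses.values()) / len(responses)
weighted = weighted_share(responses, weights)
```

With these made-up numbers, the unweighted share of symptomatic respondents is one in three, while the weighted share is 0.5, because the symptomatic respondent counts double. The researchers never see who the respondent is, and the platform never sees the answers.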
This structure aims to address the growing tension between preserving people's privacy and ensuring governments and scientists have access to the data they need to track the virus. Countries like South Korea and China have deployed extensive, tech-enabled surveillance to monitor the movements of people who have developed COVID-19. In the United States, lawmakers have cautioned against embracing such tactics.
"I don't want a situation where we have data on individuals, and people are showing up and knocking on their doors with thermometers or testing or figuring out where they're traveling to as individuals," Rep. Ro Khanna recently told Protocol. "That's a surveillance state like China we should completely resist."
Meanwhile, last week, a group of lawmakers led by New Jersey Sen. Bob Menendez wrote to Verily, a health care company owned by Alphabet, asking for information on what will happen to all the data it's collecting on people with COVID-19 symptoms.
Facebook, of course, has not always been so careful with users' personal information. If this crisis had struck before 2014, researchers could have built Facebook apps to conduct their surveys and scraped all of the respondents' data, as well as the data of their friends. That's how a University of Cambridge researcher ended up selling data on millions of unwitting Americans to the political consulting firm Cambridge Analytica before the 2016 U.S. election.
But the resounding blowback to that scandal has forced Facebook to rethink its partnerships, even with scientists motivated to stop a global pandemic. Now, Facebook maintains that all of the data it's releasing is fully anonymized and aggregated, so that no single individual's anonymous location data can be reattached to their identity.
"Facebook and the wider technology industry can — and must — continue to find innovative ways to help health experts and authorities respond to this crisis, without trading off privacy," KX Jin, Facebook's head of health, and Laura McGorman, policy lead for Facebook's Data for Good program, wrote in the company blog post.
Carnegie Mellon is running a similar survey program with Google, but in that case, Google is surveying users itself through its Opinion Rewards app, which offers app store credit in exchange for survey answers. Tibshirani said the Facebook survey gives him more flexibility in asking health-related questions, since he's running the survey himself.
Carnegie Mellon plans to share the survey results with the University of Maryland and several other schools that Tibshirani said are awaiting Facebook's approval. Once he receives enough survey results, he said he hopes to release aggregate data on symptoms at the county level. He said that may help researchers and lawmakers who are currently blind to where people might be infected before they reach a hospital or testing center.
"I am very, very excited about the potential of this data," Tibshirani said. "It has the potential for enormous impact."
Schroeder agreed the data Facebook and other companies have provided to this effort has been critical to understanding the virus' spread. But as more companies offer up their location data to the cause, he said researchers and local governments will need to be careful about what signals they pay attention to so as not to be overwhelmed.
"Before people get flooded to the point where they can't easily make decisions on it, we need to make sure we're clear on what these signals mean," he said.