Uber has been inadvertently publishing information that would have allowed anyone to track in real time the start and end points of trips on its Jump electric bikes or scooters — and therefore, the location of the people riding them. As recently as Tuesday, this data was shared publicly on government websites in the U.S. cities where Jump operates.
Uber has since fixed the flaw, and there's no indication it was ever exploited. But the issue cuts to the heart of a growing debate about how tech companies share data with local governments, what rules should govern that data sharing, and whether it's even possible to do it in a way that protects people's privacy.
The Uber issue stems from the way the company shares data about its bikes and scooters with cities. Across the country, as government officials grapple with the proliferation of so-called micromobility companies, they've required companies like Uber, Lyft, Bird, Lime and others to share information about their traffic patterns. (For more background on that, check out David Pierce's deep dive into the battle around these requirements.)
To respond to those demands, the industry developed a standard called the General Bikeshare Feed Specification (GBFS) to help these companies share data about where and when scooters and bikes are traveling. In lots of cities, that real-time data is made public through APIs on government websites.
The problem is, Uber was sharing one data point that's not required in those specifications: the unique name of every bike and scooter in its fleet.
Here's why that's an issue: If an Uber user in, say, Baltimore, opened up the app and tapped on a nearby scooter to reserve it, Uber would show the user that scooter's name — something like "JUMP Scooter XPC664." That way, you'd know you were getting on the right scooter. But Uber was also accidentally sharing that same name through its public API, along with the real-time latitude and longitude of where that particular scooter started and ended a trip.
With a little technical know-how, a savvy stalker could, in other words, follow a neighbor or ex to the site of a Jump scooter and log the scooter's ID, either by reading it off the side of the scooter or by opening the Uber app and reading the ID of whichever scooter the person picked. It would then have been easy for the stalker to mine the API on Baltimore's department of transportation website to see where the rider hopped off.
Clearly this is not the sort of flaw that's ripe for abuse at a mass scale, but it does allow for a significant privacy invasion at the individual level.
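The flaw described above comes down to a stable, human-readable identifier being published next to coordinates. The following is a minimal sketch of one way a public feed could be scrubbed before release, not a description of Uber's actual fix. The record layout and the field names (`bike_id`, `lat`, `lon`, `is_reserved`, `is_disabled`, `name`) are assumptions modeled on a GBFS-style vehicle status feed, and `sanitize_feed` is a hypothetical helper.

```python
import secrets

# Assumed whitelist of fields safe to publish (modeled on a GBFS-style
# vehicle status record; the real feed layout is not given in the article).
PUBLIC_FIELDS = {"bike_id", "lat", "lon", "is_reserved", "is_disabled"}


def sanitize_feed(bikes):
    """Return a copy of the feed with extra fields dropped and IDs replaced.

    Hypothetical helper: anything outside PUBLIC_FIELDS (such as a vehicle
    name) is removed, and each stable ID is swapped for a random token so
    that two published snapshots cannot be joined on the ID alone.
    """
    cleaned = []
    for bike in bikes:
        record = {key: value for key, value in bike.items() if key in PUBLIC_FIELDS}
        # Fresh random token on every publication (an assumption; a feed
        # could instead rotate the ID only after each trip).
        record["bike_id"] = secrets.token_hex(8)
        cleaned.append(record)
    return cleaned


if __name__ == "__main__":
    raw = [{
        "bike_id": "abc123",
        "name": "JUMP Scooter XPC664",
        "lat": 39.29,
        "lon": -76.61,
        "is_reserved": False,
        "is_disabled": False,
    }]
    print(sanitize_feed(raw))
```

Replacing the ID only makes it harder to link records by identifier. A parked vehicle can still be matched across snapshots by its position, so this narrows the exposure rather than eliminating it.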
The privacy flaw was discovered by John Myers, co-founder and chief technology officer of Gretel.ai, a startup that's working on ways to help developers access and share large datasets without compromising people's privacy. He found the data in feeds from 17 cities and posted about the issue on GitHub on Tuesday. A fellow GitHub user who identified himself as a representative of Uber's Jump team responded moments later saying he'd fix the issue right away.
"You hit the privacy implications on the head here," the Jump team member said in his GitHub response.
Uber's head of security, privacy and engineering communications, Melanie Ensign, confirmed that by Tuesday afternoon the company had revoked public access to vehicle names, but said that the names may have been exposed since late 2019.
According to Ensign, Uber was sharing vehicle names in order to comply with a specific requirement from Miami. Miami is one of several cities that use a different type of data-sharing framework called the Mobility Data Specification (MDS), which was developed by the Los Angeles Department of Transportation. Uber has been embroiled in an ongoing battle with LA over the MDS framework. The company argues MDS is overly invasive because it requires companies to share location data about trip routes, not just where a trip starts and ends. Miami's director of innovation and technology, Mike Sarasti, told Protocol that the city intentionally opts out of collecting this in-trip data.
"We only receive data about idle, inactive scooters for enforcement purposes to make sure that they are not being dropped off in disallowed areas," Sarasti says.
According to Ensign, Jump began sharing the vehicle name with Miami as part of this workaround. (Sarasti did not specifically answer Protocol's question about this.) But after Uber acquired Jump and their internal infrastructure merged, the vehicle name began appearing in the API for every city.
"We offered them this as an alternative, but making it publicly available in all these markets was definitely not the intent," Ensign said. Now, only authorized city personnel can access vehicle names.
For Myers, who reported the flaw to Uber, this is the perfect use case for what his team is building. The company helps developers access and share data using an emerging technique known as differential privacy, in which calibrated noise is added to a dataset or to the results of queries on it so that no single data point can be matched to a real person.
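As a rough illustration of the idea, and not of Gretel's product, the sketch below applies the textbook Laplace mechanism to counts of trip endings per zone. The zone labels, the `epsilon` value and the function names are all assumptions made for the example. It protects individual trips rather than individual riders, since one rider may take many trips.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_counts(trip_end_zones, zones, epsilon, rng=None):
    """Count trip endings per zone and add Laplace noise to every count.

    `zones` is a fixed, public list of zone labels (an assumption), so
    zones with zero trips are also reported and also receive noise.
    Adding or removing one trip changes one count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    counts = dict.fromkeys(zones, 0)
    for zone in trip_end_zones:
        if zone in counts:
            counts[zone] += 1
    return {zone: n + laplace_noise(1 / epsilon, rng) for zone, n in counts.items()}


if __name__ == "__main__":
    trips = ["downtown", "downtown", "harbor"]
    print(noisy_counts(trips, ["downtown", "harbor", "uptown"], epsilon=0.5))
```

A smaller `epsilon` adds more noise and gives stronger privacy at the cost of accuracy; choosing it is a policy decision the sketch does not address.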
"All types of data can be used to violate privacy," Myers said. "Keeping data safe and private, while still allowing developers to innovate, is one of the hardest problems out there, and that's what we're working to solve at Gretel."
This incident is also a prime example of why privacy groups have publicly opposed cities' demands for this data, said Mohammad Tajsar, a staff attorney at the ACLU of Southern California.
"Uber's inadvertent disclosure of vehicle information just reveals precisely why we're so concerned about the aggregation of location information, both in the hands of the private sector, as well as the hands of cities that desperately want this information," Tajsar said. "The location data of the type that scooter companies, and now cities, collect is incredibly revealing about people's lives in ways that should really force the city leaders and the public to think carefully about why they need this granular information and what risks they're putting their residents in when amassing this sensitive information."
The Electronic Frontier Foundation has also objected to these data-sharing agreements, particularly in Los Angeles. "Unfortunately de-identification is kind of a myth especially in the context of location data," said Bennett Cyphers, a staff technologist at the EFF. "It's extremely, extremely difficult, and often impossible, to sufficiently anonymize or de-identify data such that it can't be tied back to a specific person and reveal sensitive things about that person."
Uber isn't the only company in the bike- and scooter-sharing business that's faced these types of problems. Last year, Quartz was able to trace the journeys of 129 Bird scooters in Louisville, Kentucky, using scooter ID codes shared publicly by the city. According to Quartz, that code was later stripped out of the data. And Ensign herself found that Wheels, an e-bike company, is also sharing unique vehicle identification numbers in its API. Wheels did not immediately respond to Protocol's request for comment.
Because there's no central repository of these APIs (though GitHub has a fairly lengthy list), it's unclear how many more transportation companies have the same issue. What is clear is that in their push to better inform their citizens about the tech companies taking over their streets and sidewalks, local governments may be putting those same citizens at risk.