Amazon’s Toni Reid on teaching Alexa every language
She wants the voice assistant to be ubiquitous.
Many Amazon executives, when asked what their long-term goal is for Alexa, have told me the same thing over the years: to become a "Star Trek"-like supercomputer capable of virtually anything you throw at it with a simple voice command, including on-the-fly translation in any language. As Amazon VP of Alexa Experience and Devices, Toni Reid is one of the executives charged with turning that vision into reality. No pressure.
Reid oversees the development of new features across devices. Amazon has turned Alexa into the leading voice assistant worldwide, even eclipsing Apple's much older Siri in usage for the first time in 2019, according to estimates from Forrester Research. Reid has overseen Alexa's expansion into hundreds of millions of devices, from the commonplace (speakers, smart cameras and thermostats) to the outré (luxury toilets and motorbike helmets). But her goal is to be in every country, in every language, to turn Alexa into that supercomputer that can do everything.
Protocol sat down with Reid in Seattle recently to discuss how to get there, and what stands in the way, from privacy concerns, to Alexa's language limitations, to how you get people to use Alexa's more-complex skills. Oh, and about all the smart feedback she gets from kids who love Alexa.
This interview has been condensed and edited for clarity.
You joined the Alexa and Echo group over five years ago, roughly a year before they launched. But I think some people might be surprised to learn you actually don't have a deep, technical background. You earned your undergraduate degree in anthropology from the University of North Texas.
I actually started out in speech pathology, and I did that because I had a young cousin who was born profoundly deaf, and his mom had done this program where you teach profoundly deaf children to speak. They have very, very little hearing, and I was super intrigued by it. So, I started out in speech pathology, which I think is kind of interesting that over two decades later I ended up in the speech field, which was definitely not intended.
But through that process, I was taking anthropology courses, and it just clicked. I loved it. I loved the study of people, cultures and the uniqueness of them and the need to preserve them, and I like the history piece of it. I actually really liked forensic anthropology — the medical side. And so I didn't know exactly what I was going to do with the degree, but I was just super passionate about it.
But now, I think having studied it helps, if you look at the use of technology, the why and how people will use it, and how it might evolve. I sort of try to encourage — especially when I talk to young women or young people early in their careers — that [there] are other ways to come at technology. It doesn't have to just be a computer science degree.
When you joined Amazon's speech recognition team from the Amazon Fresh team in 2014, Alexa was still very much a secret project. But you'd heard bits and pieces through the grapevine, and then you'd collaborated with the Alexa team on the Dash Wand when you were still with Fresh, right?
That project, actually, we built and shipped it within 13 months. And we had very basic voice input. It just had the ASR, automatic speech recognition. And so I found out about this confidential project that no one would tell me what it was, but I thought I should go talk to this team. I went to meet with the team, and I told them, "I hear you have a technology." There were two individuals, and one of them told me there was no way they could help me with that at the time. "We've got to ship this product. [It's] going to be a distraction."
And the person who reported to him, who directly managed the team, he said, "We should do this. This is a customer need. It's simple, we can do it, and it's going to help us." So we shipped it together, and it was great.
That's when they asked, "Would you like to come over and join this team?"
Fast-forward, and Alexa is now the most widely used voice assistant in the U.S. according to Forrester, used more than Siri or Google Assistant. Have you been surprised by Alexa's success? It's baked into everything from speakers to lawnmowers now.
I mean, we knew we loved it, and that we thought customers would love it. But it was a completely new concept. And the thing we didn't know was it's a product that you don't know you need. And so, when you first launch something like that, how do you describe why you would need a speaker in the room that you can speak to? It seems kind of obvious now.
But back then, it just didn't exist. And to be honest … we all kind of struggled. You've got ambient noise, etc. And so, there was just an uncertainty about how much customers would love it. It wouldn't really be fair of me to say I knew then it was going to be as big as it is now, but we had really strong signals early on that we were onto something.
What kind of signals?
It was probably 18 months before it started to become [more mainstream]. I think even my first year or so in the role, people would ask me, "What do you do?" "I work for Echo and Alexa." "What is that?"
The high-tech early adopters were like, "Oh my gosh. I love this. Can you get me one? I've been on the invite list." So, that was one thing. Then the sort of more general population of people I'd run into would say, "What is that?" And so, there was definitely this point — a year, a year-and-a-half in — when it turned into, "I have feedback for you. I really want Alexa to do blah."
So that's a regular occurrence for you now? Strangers just offer unsolicited feedback on Alexa?
It depends. Sometimes, if I'm at a party, I say that I just work at Amazon.
But people will actually apologize to me. "I'm sorry I shouldn't, but can I tell you …" and I'm like, "No, tell me." But I really like to get feedback from different customer segments: kids, older generations, aging population, non-tech. They're unencumbered by technology and constraints. Technologists tend to know what's capable, and now they have their own biases they bring in. I actually find the ones that are unencumbered have some of the best free-form ideas. I really try to hear what they're saying when they're asking for features, or I wish it could do "X." And some of it seems completely far-fetched, you know, and I'm thinking, "Oh, that's a good idea."
I was at a dinner party, and there was an 11- or 12-year-old boy totally into technology. This was so early on. He sat down with me and he just went through this long list of requests he had, and they were really good requests. We were already working on many of them, like multiroom music. I loved it, because I thought this is somebody who's like, "Here's how that technology should work, and I don't care that it's hard. This is just what it should do."
Then you get that little bit from the aging population who I think is a bit more intimidated by technology, and voice is not difficult. It is difficult in other ways from a recognition perspective. But once they get started, it's just easy. It's like, "Oh, I can just turn on the lights or I can play music." You're going to get them over the hump sometimes because it's actually much easier than I think they think it will be.
But things haven't always gone smoothly. We're in a time now when people are much more conscious about their online privacy and safety. And just last December, for example, Bloomberg Businessweek ran a feature about Amazon contractors early on who once transcribed bits and pieces of conversations recorded by Alexa. How do you respond to people's ongoing privacy concerns around Alexa?
Privacy for us is foundational, and so customer trust is foundational for us. We work very, very hard to ensure that we're earning that trust and keeping it. And so when we think about building products, hardware or software features, we build in the kind of privacy-related customer trust from day one. So what's important to us is around transparency and control. Those are kind of our go-to. We want to be transparent with customers, and so we make it easy for them to understand, but we also want to give them control. And a lot of that's rooted early on. So the device, you can see this one that has an LED with a light ring. There's a lot of discussion in the design that went into the mute button and the mic and having it disconnected by hardware. And when you're speaking to the device, the LED light ring is on, and it indicates it's streaming to the cloud. And so we're really thoughtful about that. That's transparency, you know?
Last year, you introduced newer privacy features at a company event that seemed like they were in direct response to people's concerns.
Yes, for us it's just, it continues to be a focus area for us and everything we build. But then we've been building features and features and features on top of that to make it even better. Voice enabled, "Delete what I just said." "Why did you do that?" And so we are also looking for ways to improve. We have teams dedicated to that.
Over the last two years, the Alexa team has doubled down on expanding globally. Now, Alexa is available in 15 countries, seven languages and seven language variants, which is up from just three countries two years ago. What do you see as the potential on the international front?
I want our experience to be ubiquitous. We started this past year to have multilingual experience. In the U.S., you can also speak Spanish. In India, you can switch it to Hindi. But we need to do more there. So it's to be able to understand any language, be able to speak back any language. If you think about like a true AI and sort of the supercomputer in the cloud, you really want to be able to expand that, and to sort of address a really large customer segment. And then for me, I've really enjoyed the international work because when we launch Alexa dedicated in the countries, we really take a lot of care to make the Alexa experience and the personality culturally relevant.
It's the same core personality, but it's very localized. It knows local fun facts and figures and you know, it takes a little adjustment around like, you know, maybe in Britain it's a little more dry sense of humor. And so I think the team's done a really good job at making them truly local experiences as well. But I think definitely, from a language perspective and understanding, it's just there's a lot of potential there.
But natural language processing, especially when you're dealing with different languages and dialects, must be difficult.
Accomplishing this isn't easy, sure. There are many things to consider when building Alexa specifically for another country. For example, the different accents and colloquialisms — even within the same country — can be very different. People in London pronounce things very differently than people in Glasgow, but Alexa in the UK can understand many accents and dialects from all around the UK.
With multilingual mode, we had to account for mixed language sentences and phrases like "Spanglish," and make sure Alexa knew how to respond. We've also worked hard to train Alexa on regional phonetics so she can accurately pronounce names of important places, people, events, and more on a local level. This attention to detail is incredibly important in order to create an authentic, local experience for customers around the world.
Not only are multiple languages spoken within a single country, but there are often multiple languages spoken within a single home (like my home where English, French and Spanish are all spoken). As we continue to build out the multilingual mode feature, Alexa will become even more natural for people around the world and from various backgrounds to interact with. Customers should be able to speak to Alexa in whatever way is most natural to them and get a meaningful response. That is the north star.
Correction: An earlier version of this story suggested Reid oversees the entire 10,000-person Amazon Echo team, when in fact she manages just the development of new experiences across devices, and is one of several executives charged with overseeing the Amazon Echo vision. Additionally, the earlier version did not accurately represent Reid's first interactions with the Alexa team. That answer has been updated for clarity. This story was updated Feb. 10, 2020.