If you've ever been in one of Amazon's Go stores, you know the strange and somewhat magical feeling of checkout-free shopping. Walk in, grab your stuff, walk out. A receipt shows up on your phone a few minutes later. End of interaction. For Amazon, Go offers a chance to bring some of the convenience of online shopping to the real world. It also gives Amazon more insight into how people shop, what they look for and how stores themselves work. That data can be invaluable.
Standard is one of Amazon's leading competitors in this space, a company offering similar technology and features to existing stores all over the world. It already operates inside a network of convenience stores, and CEO Jordan Fisher said self-checkout is only the beginning. The company's underlying tech, which it calls The Platform, could someday bring powerful computer vision to nearly any physical space. Fisher and his team are trying to figure out how to make it work, how to reckon with the privacy implications and how to make checkout-free shopping even easier going forward.
Fisher joined the Source Code podcast to talk about what Standard is working on, what it's like to compete with Amazon, how physical and online retail are merging into a single shopping experience, how Instacart and Uber Eats are changing stores, how computer vision systems can be both pervasive and privacy-preserving and much more.
You can listen to our full conversation on this episode of the Source Code podcast. Below are excerpts from our conversation, lightly edited for length and clarity.
A few years ago, everybody said, "Computer vision, it's huge, it's going to be transformative." And there's a set of people who took that and said, "OK, self-driving cars." And I think they were right about the tech and turned out to be way wrong about the timeline. And then there are other people who said, you know, we're going to do creepy facial recognition stuff. Then somebody built Pokémon Go! And that's another version of the same thing.
Trying to find a way to work on a real version of this that doesn't suck, isn't life or death and also sort of makes sense to people, seems like a really hard thing to do.
So we'll talk about checkout here in a minute. But there's this underlying technology that we're building called The Platform. You have a physical space, you put cameras up overhead and you have this platform that's running off of those images coming off the camera. And what that platform gives you is basically an API for the space. It tells you where every person is in real time — down to the centimeter, like where their hands are, and where their shoulders are and how they're oriented, how they're moving. It tracks them across all the different camera perspectives, it can give you the shape of the 3D space, the layout, the boundaries, the items in the space, etc.
These are really core, fundamental concepts about physical spaces. But you want to access that with an API. So this platform sits on top of these cameras, and just gives you access to those insights and semantics of the space. And once you start thinking about that, and you see what the platform looks like, and the information it has, then your brain starts really running.
Here's some dumb ideas: One of my spaces is my gym. And one of the things I hate the most about being in the gym is that I want to track my progress, but it's a pain in the ass to use these apps to enter all the manual information. What would be great is you have this platform, and it understands all the motions you're doing, it knows what you're picking up, which dumbbells, etc. So it can just track all the reps, track the amount of time between reps, and keep track of that for you. And you can even imagine sort of as an AR system, it could be kind of giving you guidance. It can be tracking everything and giving you that feedback. So for me, that'd be kind of a really cool application.
The technology is probably not cheap enough yet to put it into every gym, but that's just one random thing. That same principle applies to manufacturing, warehousing, back of house in general, where you want to be tracking what's happening so that you can have better tabulations of your inventory, or really just enable folks to be more effective, because you can give them better instructions that [are] going to optimize them through the space. So there's just a ton of disparate applications, but all in the physical world. How do we enable people by being the eyes and the intelligence that's understanding the space?
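To make the "API for the space" idea concrete, here is a minimal Python sketch of what one snapshot from such a platform might look like to a developer. Everything in it is an assumption for illustration: the names PersonTrack, SpaceSnapshot and reaching_toward, the field layout and the units are hypothetical, not Standard's actual interface.

```python
from dataclasses import dataclass, field
from math import dist

# Hypothetical coordinate type: (x, y, z) in meters, in the space's own frame.
Point = tuple[float, float, float]


@dataclass(frozen=True)
class PersonTrack:
    """One tracked person in one frame. All fields are illustrative."""
    track_id: int        # anonymous per-visit id, not an identity
    position: Point      # where the person is standing
    left_hand: Point
    right_hand: Point
    heading_deg: float   # which way the person is oriented


@dataclass
class SpaceSnapshot:
    """What the space looks like at one instant."""
    timestamp: float
    people: list[PersonTrack] = field(default_factory=list)

    def reaching_toward(self, target: Point, radius_m: float) -> list[PersonTrack]:
        """People with at least one hand within radius_m of target."""
        return [
            p for p in self.people
            if min(dist(p.left_hand, target), dist(p.right_hand, target)) <= radius_m
        ]


snapshot = SpaceSnapshot(timestamp=0.0, people=[
    PersonTrack(1, (1.0, 2.0, 0.0), (1.2, 2.1, 1.0), (0.8, 2.1, 1.0), 90.0),
    PersonTrack(2, (5.0, 5.0, 0.0), (5.2, 5.0, 1.0), (4.8, 5.0, 1.0), 180.0),
])
shelf = (1.2, 2.2, 1.0)
reaching = [p.track_id for p in snapshot.reaching_toward(shelf, 0.25)]
print(reaching)  # [1]
```

The same query works whether the target is a shelf, a dumbbell rack or a machine on a factory floor, which is roughly the point of exposing the space through one interface.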
That vision sounds both like amazing, cool science fiction and terrifying, creepy, dystopian stuff, right? This is what we're reckoning with in so many places right now, the balance of cool, exciting, efficient and creepy. And I would assume this is something you have to start thinking about from day one as you try to build this stuff. So what does it look like to get that right?
So our perspective here has been that we need to build this in an intentional way. And it's not, "Hey, we're going to build this in a safe way and make sure that we're keeping your data secure." Of course, you have to do all those things. But everyone gets hacked, systems are going to break, it's going to happen at some point, right? So the better solution is we just won't do facial recognition. So no matter what happens, even if the system gets penetrated, that data is not there.
Of course, there will be some sensitive PII [personally identifiable information]. Maybe you do lose your credit card, for example, and you have to change that. But you're not going to get your facial biometrics hacked. So the maximum damage is capped to what you currently expect today, which is some PII that you can switch out and get back to normal.
There's certain types of biometrics that I think can be useful that aren't as extreme as facial biometrics. But for me, the bright red line, or the semi-bright red line is, can you use that biometric to re-identify the same person across different locations? Like if you walk into a 7-Eleven, and you sign up for our system, and then you go to your gym, can that gym recognize that you're the same person just based off of the biometrics? For me, the answer needs to be no.
You get a little bit into a gray area, because there is also advertising in physical retail. So now you suddenly have this interesting question, which is, can Standard get better insights into how shoppers are behaving inside of physical stores? And the answer is yes. Because I don't just know what you're buying, which is what stores today know, we have the ability to see things that are a little bit closer to what ecommerce could do. They know what you're hovering over, they know how long you're on a given page, they know that you put something into your cart but didn't actually end up buying it, and they send you that annoying email the next day.
Some of that actually is really useful. You can do preemptive warehousing when it's like, "I'm pretty sure that Susan's going to be buying this, so I'm going to go ahead and move it from East Coast to West Coast." But nonetheless, they're tracking everything, and it's all attached to your identity. And we can bring similar insights to bear inside of a store: What you're picking up and putting back down, we'll see that. And we'll know how long you're spending in front of the shelf.
Then it kind of opens up this question of, well, are we going to productize that? And again, I think there's a line here. It's got to be opt-in, we can't just be creating all this rich semantic data and then without your consent, hoovering it over. But we can use all the anonymous data in our system. All of our computer vision systems are fully anonymous. The only time we know who you are is when you're transacting, basically, and then we'll connect to a credit card system and put all the pieces together. But actually, all these really interesting insights from the platform are fully anonymous. And then you can use that in a sort of responsible way: You can say, we're not talking to you about Susan, but I'm talking to you about the last 100 customers that came into your store. They in general have been interacting with aisle six in this way, and they tend to be drawn to this part of the shelf. And maybe you want to optimize things in a slightly different way.
So I think there's a lot of ways you can be pretty responsible about this. All these things that we're talking about, we haven't even started building, but there's certainly conversations that are happening around it.
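As a rough illustration of the aggregate reporting described in this answer ("the last 100 customers," not "Susan"), the Python sketch below summarizes shelf interactions per aisle and zone and never returns a per-visit record. It is a sketch under stated assumptions: ShelfEvent, summarize and the zone labels are hypothetical, and Fisher says none of this has been built.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ShelfEvent:
    """One anonymous shelf interaction. All fields are hypothetical."""
    visit_id: int    # anonymous per-visit counter, not tied to a person
    aisle: int
    zone: str        # e.g. "top", "eye-level", "bottom"
    dwell_s: float   # seconds spent in front of the shelf


def summarize(events: list[ShelfEvent], last_n_visits: int = 100) -> dict:
    """Aggregate the most recent visits by (aisle, zone).

    The result holds only counts and averages; visit ids are used to pick
    the window and are then discarded.
    """
    window = set(sorted({e.visit_id for e in events})[-last_n_visits:])
    counts: Counter = Counter()
    dwell: defaultdict = defaultdict(float)
    for e in events:
        if e.visit_id in window:
            key = (e.aisle, e.zone)
            counts[key] += 1
            dwell[key] += e.dwell_s
    return {
        key: {"interactions": n, "avg_dwell_s": dwell[key] / n}
        for key, n in counts.items()
    }


events = [
    ShelfEvent(1, 6, "eye-level", 4.0),
    ShelfEvent(2, 6, "eye-level", 6.0),
    ShelfEvent(2, 6, "bottom", 2.0),
]
report = summarize(events)
print(report[(6, "eye-level")])  # {'interactions': 2, 'avg_dwell_s': 5.0}
```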
So you have this big pile of things you could work on with computer vision: How do you land on checkout-free grocery stores?
Well, we're researchers, so we wanted it to be a really interesting problem. And hard and big and audacious, but we didn't want to be working on this for the next 20 years, so it needed to be something that while it was going to take a few years, it was going to be sort of eminently possible. And then of course, we wanted to have a massive Total Addressable Market. You also want to have a rarefied space; you don't want to be competing with 1,000 companies. And just by virtue of being in a hard computer-vision space, it's already rarefied air.
The technology for doing autonomous checkout is very similar to autonomous vehicles. There's a lot of similarities, but the stakes are way lower. We can make mistakes. Retailers are already comfortable with mistakes, right? People steal things, cashiers make mistakes, sometimes the POS is wrong, whatever it is, so you have this built-in margin. The AV companies are trying to get to five nines, like 99.999% sure that we're not going to kill somebody. And we're like, no, we just need to get to like 97% sure that we're going to get this item right.
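To put the gap between "five nines" and 97% in numbers, the short Python calculation below counts the expected wrong calls per 100,000 decisions at each accuracy level. It treats every decision as independent, and the two percentages are Fisher's rough figures from the conversation, not measured results; the function name expected_errors is made up for this example.

```python
from fractions import Fraction


def expected_errors(accuracy: str, decisions: int) -> Fraction:
    """Expected number of wrong calls out of `decisions` independent decisions.

    Accuracy is passed as a decimal string so the arithmetic stays exact.
    """
    return (1 - Fraction(accuracy)) * decisions


av = expected_errors("0.99999", 100_000)      # "five nines"
checkout = expected_errors("0.97", 100_000)   # the checkout target Fisher cites
print(av, checkout, checkout / av)  # 1 3000 3000
```

On these figures, a checkout system can tolerate about 3,000 times as many mistakes as the self-driving target, which is the margin Fisher says retailers already live with.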
Did you know at the time that Amazon was working on something like this and was likely to be your main competitor? Because competing with Amazon is … not a lot of fun.
We came up with the idea about a week before Amazon announced, and of course they'd been working on it for seven years or something before that. It was this massive project for them that they just were really good at keeping under wraps. We definitely had a conversation, like, should we still pursue this? One, you don't want to go against Amazon. But for us, the more important thing was just as researchers, we like working on novel things, and it was like, well, Amazon's doing it, so it's not interesting anymore.
But we had one business guy in this discussion group who said that this is actually the best thing that could happen. It validates the market, it's a technological proof point, now investors are going to know this is possible, retailers are going to know this is possible. We're not just going to be some garage startup.
And most importantly, nobody wants to buy this from Amazon. As a retailer, Amazon is your hated enemy. You can't trust them. When we talk to retailers, they very much don't want this from Amazon. But they do want the technology.
To this day, we basically do no outreach, we just have inbound coming from hundreds of retailers. But you can see the spike whenever Amazon does something. Like when Amazon bought Whole Foods — which on the surface is very different than autonomous checkout, but was part of the same story that Amazon has this technology wedge, and they're going to use it to drive into the physical world — it was an amazing day for our inbox. Tons of retailers are like, "Oh, OK, I see Amazon Go, I see the Whole Foods acquisition, there's not many more dots that I need to connect to these things."
Over the last year, to some extent, the retail world has sort of come toward you, right? Checkout was a specific safety issue in the pandemic, and getting more people in and out of stores more efficiently was important. But on the other hand, it seems like a lot of physical retail has thought of new ways to exist online and make these systems work. Omnichannel is a term I hear 100 times more often than I ever did before the pandemic. Does your sense of what your place in the world looks like over time feel different now than it did 15 months ago?
I wouldn't say it's a different thing. But I do think it's all merging together: the need to innovate, the need to meet the shopper where they are, where they want to be, it's all accelerating. Because people just have higher expectations. And COVID has been a big part of that, especially for delivery and curbside. BOPIS, I guess they call it? Buy online, pick up in-store? That's one of my favorite retail words.
Today we're doing checkout. But the platform, I think, is broadly enabling. You've got the system in your store, it knows where all your shoppers are, it knows where all your products are, it knows where all your employees are. And we can facilitate all of the other aspects of omnichannel. A picker needs to come into the store to grab some items, we can actually facilitate that. Basically, it's like Google Maps for your store. And if you have a live view into a store, that understanding when something's out of stock, that's also hugely valuable for everyone along the chain.
I think what this ultimately looks like 10 years from now, it's an all-consuming platform, which maybe is coming from multiple vendors. And what you do is you think about your brand, you think about your relationship with your customer, you think about where you want to be and everything else is really taken care of for you on the platform. Replenishment and out-of-stock detection and integrating with the Instacarts and DoorDashes of the world, autonomous checkout from Standard, all the loyalty programs, etc.
And how much of that underlying infrastructure do you have ambitions to be over time?
I think a good part of it. We're not a delivery company, so that's not where we ever want to be going. But I think in terms of what's happening inside the store, it's our cameras on the ceiling, it's our platform that's got the insight into what's happening in the store. And that should be enabling for everybody, right? We can take that API and give it to DoorDash and Instacart. We can give it back to the employees in the store, to build apps that make their ability to service the store easier. But we want to be the ones that control — that sounds draconian — we want to be the ones that are enabling physical spaces.
This seems like a weird comparison, but take Uber and Lyft. You can think about them as this multi-sided marketplace, but they've taken away the notion of transacting. So you've got Lyft on your phone, and now you get into a car and you get out of the car. And that's your experience. So if you've got Standard on your phone, and not to have Visa come after me, but it's everywhere you want to be. You go into a store, any store that you want, grab stuff, walk out. It's the thing that's enabling you to forget about paying.