Phil Libin knows his way around a platform shift. At Engine 5, he was one of the early founders betting that the internet was going to change the way people do everything. At Evernote, he led one of the first App Store success stories. Now, as the co-founder and CEO of Mmhmm, he's convinced we're at the beginning of a change that will be just as massive.
So far, most people's experience with online video starts and ends with Zoom meetings. But from fitness instructors teaching virtual classes to massive online conferences, practically everyone has had to figure out how to make their world work through a webcam. Libin thinks that's not going away, even when we're able to be in person again. Why? Because he thinks there are things that work better on video, that we might never want to do live even when we can. (He has a whole theory about not going to the doctor anymore, for one thing.) The future is a hybrid of digital and physical, live and on-demand, and it's going to create a massive industry unlike anything we've seen since the web.
Libin came on the Source Code Podcast to talk about Mmhmm, his vision for the future of video and what the future holds for Zoom, Teams and the rest. Plus, he has a few tips for how to be a better on-camera performer. Because like it or not, that's what we all are now.
Below are excerpts from our interview, edited and condensed for length and clarity.
Why start a video thing now? In a world where it feels like Zoom is this all-encompassing, totally dominant thing that has sort of won the space where we talk to each other on video, what led you to think there is room for you?
We didn't think too much about it, in those terms. It started as a joke. We were building it for ourselves, because we were all on Zoom and everything else. And it was just kind of dull, and we were goofing around trying to make people smile a little bit. So we didn't have, you know, big total addressable market or competitive analysis thoughts.
OK, so start with the theory that Zoom is sort of dull, and you could make it better. What did better look like, in an abstract way, as you were starting to build stuff?
I don't think it's so much that Zoom is dull. I think Zoom is a great company and a good product. And we're not competing with Zoom! In fact, Zoom is how most people use us; we work with all of the video services. We really just thought that when the pandemic happened, and we were all kind of forced indoors and on video, a lot of things became much less effective because I thought people just didn't know how to perform on video. It doesn't come naturally to people.
And so many of us that were decent at our jobs, in the before times, had figured out how to be entertaining and captivating, and what passes for charismatic in person. Because you kind of have to be, to be effective at any job. And then we were all on video, and no one knew how to do that. And everything just became dull and dreadful, and more important, like, ineffective.
And we just thought, "Well, not everyone sucks on video." There are whole industries of people who are quite good on camera: actors and musicians and athletes and whatnot. What do they know? What can we learn from them? And then, how can we give everybody else the ability to be as captivating on video without needing a video production team and all that stuff? So that was really the thought: We would just make you a better performer, we would make you more captivating on video, regardless of where that video was. Zoom or YouTube or WebEx or whatever. And that's what we're doing.
Is that obviously a different skillset? Is running a meeting over Zoom a fundamentally different thing than running a meeting full of people in a room together?
I think it is. And I also think that it's not even about meetings. I think what happens is whenever there's a major change in technology and in the world — and because of the pandemic and video, we've had sort of several major changes [that] were on top of each other — I think the first-generation thinking by the incumbents is people who are trying to recreate the old reality in the new technology.
So when movies were first invented, filmmakers had movie cameras, but they didn't understand anything else about cinema, and their old reality was theater. And so a lot of the early movies are just, you know, a camera filming two actors on the stage. It looks weird. And I'm sure there were all sorts of theater critics in 1915 that were like, "Oh, this newfangled cinema is a really poor replacement for theater, you'll never replace theater." And they're right. But the whole point is, movies aren't meant to replace theater. They're an entirely new thing. But that took a few decades to figure out.
And then the same thing happened with TV, and then the same thing, even in my lifetime, happened with smartphones. When we were starting with Evernote in 2007-2008, the major change was smartphones had become ubiquitous, but no one knew what smartphone software was going to be like. Microsoft would say, "Well, we know how to make Microsoft Word. And smartphones are smaller than PCs. So maybe smartphone software is just like a smaller version of PC software." It wasn't until the native generation of Evernote and Uber and Dropbox and Airbnb that people figured out that actually, smartphone software has nothing to do with PC software.
And now that's happening with video. You've got, you know, the incumbent players are sitting around saying, "Well, OK, well, the old reality is meetings. So what were meetings like? Meetings were a bunch of boring people sitting together in a room with a set agenda. But it can't be in the same room anymore, because of COVID. So how do we get as close as possible to a bunch of boring people in a boring room … but on the computer?" So that's what we have for video, for first-gen. And that totally makes sense. That's what you expect. But of course, the real answer, once we and other people invent the native-to-video experiences, there's not going to be anything like meetings. So you need a different way to be charismatic, and it's going to be an entirely different feeling anyway. It's going to be nothing like meetings in a few years.
It seems like what you're saying is video is … I hate the words platform shift. But it's a cultural moment as big as something like smartphones or TV. You think it's that big a thing?
Yeah, I think it's as big as the internet. As big as .com, let's say. It wasn't so much that the internet replaced everything, it's that it embedded itself into the fabric of just about every life transaction. And I think the same thing is happening with video. It doesn't replace in-person interactions. It's just that video is going to be an essential, everyday component to many, many, many things for close to 100% of people and companies.
This is a totally pedantic question, but when you say video, what do you mean? You could go from YouTube to Zoom to Mmhmm to me, shooting on an old camcorder from 20 years ago: That could all be video. When you think of this industry, where does it begin and end?
Yeah, I think it's basically a remixing and resampling of reality on two axes. One axis of in-person versus online, and the other axis of live or synchronous versus prerecorded or asynchronous.
If you think about nine months ago, the before times, almost every experience fit neatly into just one of those categories. Like doctor visits were live and in-person. University classes were live and in-person. YouTube videos were prerecorded and online. There weren't a whole lot of things that existed in multiple quadrants at the same time. But now that's all changing. Almost all of life is being reimagined to be a mixture of each of those four buckets, based on the most effective result.
I'm never going to go back to the way that I used to go to doctors' offices after the pandemic, because it's better this way, right? It's better to do everything online, get prescriptions in the mail, not have to wait in line, not sit in traffic. But then if I need to come in, obviously that'll happen as well. So health care is fundamentally hybrid forever, as will education be, and sales, and just about everything else. So I guess video is kind of the lazy shorthand version. I don't mean necessarily just somebody looking at a video camera, I really mean taking an experience that used to be entirely in one of those four boxes, and reimagining it so that it has components in each of the four.
What is your sense of how big this thing is going to be?
I think it's potentially massive. We're calling this whole industry "personal video presence" or "professional video presence." The idea again, is that the transformation is just like with .com. Just like with the early days of the web.
Over the next couple of years, we're going to see literally close to 100% of people and organizations embrace hybrid video for some stuff. And just like with the early days of the web, there were early adopters that could wire it together themselves. But then the tools came along to make that possible for everyone, and not only did the internet become much more accessible, it became much better. You could do a lot more with modern tools than you could do by yourself, typing HTML into Emacs. And I think we're seeing the same thing with video.
I think when we're this early into a multitrillion-dollar transformation, no one has any idea of the line between platform and applications. Is Zoom an application, is it a platform? Is Mmhmm a platform? Is there Mmhmm for dentists and for dog washers? Are we building that? Is someone else building it? No one knows. People didn't know in 2007, 2008 around mobile.
And so for the next like, year and a half, there's going to be this massive, churning cauldron of everyone trying to kind of figure this out. And all the big companies are obviously in the mix. Cisco is trying to do a lot, Zoom, obviously Microsoft, and all of these companies have creative, brilliant people. They're going to do stuff. And there's a bunch of startups. We don't know how it's going to come out. We don't know who's going to be a great partner, we don't know who's going to be our arch-nemesis competitor, we just have no idea. We just need to be around for it. We need to be there when it happens, which is why we decided to raise some significant money and hire a bunch of people. We want to, together with our partners/friends/competitors, figure out this new world, trying to make it better than it would otherwise be.
And I think the landscape will start becoming a little bit clearer in 18 months.
Before I let you go, I want to talk about Phil Libin, the on-camera, on-video performer. Tell me what you've learned about how to be good at video.
Well, I'll let you know as soon as I've learned it.
You said you're better on video than you are anywhere else! So you're the one who's good at it, and you've got, like, The Chainsmokers invested in your company, they know how to do it.
That's part of it, right? Part of it was just realizing that I sucked at it. And that most of us do.
I think a lot of people don't want to think of themselves as performers. But they are, and they just don't admit it. They want to think that, you know, they're brilliant at their job and thinking about how to be a better performer is kind of beneath them. If you think like that, you're going to suck.
And I think even all of those people who were actually effective, they had all internalized and figured out how to be a good performer in person without thinking about it, because otherwise they wouldn't have been successful. And it's just refusing to acknowledge that e-charisma, being charismatic on video, is different than in-person charisma.
But what have you learned? You have to give me something here. Be my management coach.
I guess the single most important thing, if you want to get into tactics, is lighting. I'm just sitting in front of a window. And I've got lights if it's not daylight. The single thing that almost everyone could do, that would massively improve things, is sit facing a window, if possible. Get some decent lights.
Other than that, it's get a microphone, it's think about your hand position. I tend to wave my hands around a lot, that's OK, I just kind of lean into it, but try to be aware of it. Think about where you're looking: I've got my camera set up so that it's easy to look at, so I'm not all shifty-eyed. Most of the time it's basics like that that you can learn from.
I really do think it starts with some knowledge that this needs to be intentional. It's not about necessarily high-production value. It's about intentionality. You can be very effective with fairly little hardware and software and setup.