Google is rethinking search because TikTok and podcasts are taking over the internet

Multimedia search is a core focus for Google going forward. So is understanding an increasingly multimedia internet.

Google's emphasis on visual search is about the search bar ... but also about the internet.

Photo: Google

One of Google's favorite statistics is that every day, roughly 15% of Google queries are for things that have never before been typed into the search box. And even at Google's impossible scale, the number never seems to go down. "Part of it, I have to admit, is that people find new and creative ways of misspelling words," Pandu Nayak, a Google fellow and the company's VP of search, told me earlier this month. But there are two other reasons, he said: The world changes all the time, and people's curiosity "is quite infinite in its complexity."

Google's challenge on the web is to find ever-better ways to collect and sort information. Crawling web pages is the easy part, relatively speaking. Understanding what's authoritative vaccine guidance and what's dangerous misinformation, or whether you typed "spaghetti" looking for definitions or recipes? That's all much more complicated. Nayak rattled off numbers like the 3,600 changes made to the search system last year, or the 60,000-plus experiments run internally. It's a lot of work, but Google's better at it than most.

But there's a core change happening on the internet that threatens Google in a serious, potentially existential way. An increasingly large share of the web is not pages full of text and hyperlinks. It's images, video and audio. TikTok and Instagram, podcasts and videos: Those platforms are just as much "the internet" as the Wikipedias and publisher sites Google has long relied on. And for the company that has spent two decades dedicated to organizing the world's information, that presents a problem.

At Google's Search On event on Wednesday, Google executives showed off some fancy new features, like a camera feature that can take a picture of a shirt and find socks with the same pattern. Or a way to take a photo of your broken bike chain and get search results for how to fix it. It's all part of Google Lens, the visual-first search system the company has been building for several years. Google has long talked about wanting to take search beyond the text box, to make it easier for people to input information and get answers. Context is crucial to that, too.

But just as important, and just as difficult, is understanding the information on the other side. It's technically possible to search TikTok and Instagram through Google, but the results are pretty primitive and mostly based on hashtags and video descriptions. Google is reportedly working on deals with ByteDance and Facebook to bring more content with better metadata into Google's search results, but that, too, is only half the battle.

Even on YouTube — itself the world's second-largest search engine, and obviously a Google-owned company — Google's search relies on metadata and automatically generated transcripts to figure out what's going on in a video. Introducing chapter markers made the system better, but only because creators gave Google hints about where to look. Its search crawlers don't understand what's on the screen in any meaningful way.

When he introduced Google's new Multitask Unified Model system (or MUM, as it's known) at Google I/O in May, Nayak hinted that things might be about to change. "MUM is multimodal," he wrote in a blog post, "so it understands information across text and images and, in the future, can expand to more modalities like video and audio." He echoed the sentiment in our conversation. "You can give [MUM] inputs that are both text and images, as a sequence of tokens," he said. "It only thinks about tokens … and it essentially learns the relationships between image tokens and word tokens, and I think we'll see a number of interesting examples coming out of that." He said that's not coming immediately, but "in the maybe not-too-distant future."

If Google can unlock a truly visual search engine in both directions — visual queries, visual data, visual output — it can be much better equipped to be to the future what, well, Google was to the past. More than two decades ago, the company took a disparate set of content and put it at users' fingertips. Now the content has changed, but the need hasn't.

The other upside for Google? Shopping. Practically every corner of the internet is embracing shopping as a way to make money, both for creators and for the platforms. For Google, the potential is massive: It could allow users to click on any product in any video or image anywhere on the internet, from the gadget in the foreground to the lamp in the background to the shoes on creators' feet, and be taken to a store to buy that thing. MUM could help Google build the world's biggest catalog, with Google as a happy fulfillment and payment service.

Companies around the industry, from Spotify to Pinterest to Apple to practically every other platform and service that deals in audiovisual content, are trying to figure out how to better understand and index the content in their systems. Google, as the trillion-dollar tech giant predicated on understanding and indexing all content everywhere, is in a high-stakes race to do it better.


Judge Zia Faruqui is trying to teach you crypto, one ‘SNL’ reference at a time

His decisions on major cryptocurrency cases have quoted "The Big Lebowski," "SNL," and "Dr. Strangelove." That’s because he wants you — yes, you — to read them.

The ways Zia Faruqui (right) has weighed in on cases that have come before him can give lawyers clues as to what legal frameworks will pass muster.

Photo: Carolyn Van Houten/The Washington Post via Getty Images

“Cryptocurrency and related software analytics tools are ‘The wave of the future, Dude. One hundred percent electronic.’”

That’s not a quote from "The Big Lebowski" — at least, not directly. It’s a quote from a Washington, D.C., district court memorandum opinion on the role cryptocurrency analytics tools can play in government investigations. The author is Magistrate Judge Zia Faruqui.

Veronica Irwin

Veronica Irwin (@vronirwin) is a San Francisco-based reporter at Protocol covering fintech. Previously she was at the San Francisco Examiner, covering tech from a hyper-local angle. Before that, her byline was featured in SF Weekly, The Nation, Techworker, Ms. Magazine and The Frisc.

AWS CEO: The cloud isn’t just about technology

As AWS preps for its annual re:Invent conference, Adam Selipsky talks product strategy, support for hybrid environments, and the value of the cloud in uncertain economic times.

Photo: Noah Berger/Getty Images for Amazon Web Services

AWS is gearing up for re:Invent, its annual cloud computing conference where announcements this year are expected to focus on its end-to-end data strategy and delivering new industry-specific services.

It will be the second re:Invent with CEO Adam Selipsky as leader of the industry’s largest cloud provider after his return last year to AWS from data visualization company Tableau Software.

Donna Goodison

Donna Goodison (@dgoodison) is Protocol's senior reporter focusing on enterprise infrastructure technology, from the 'Big 3' cloud computing providers to data centers. She previously covered the public cloud at CRN after 15 years as a business reporter for the Boston Herald. Based in Massachusetts, she also has worked as a Boston Globe freelancer, business reporter at the Boston Business Journal and real estate reporter at Banker & Tradesman after toiling at weekly newspapers.

Image: Protocol

We launched Protocol in February 2020 to cover the evolving power center of tech. It is with deep sadness that just under three years later, we are winding down the publication.

As of today, we will not publish any more stories. All of our newsletters, apart from our flagship, Source Code, will no longer be sent. Source Code will be published and sent for the next few weeks, but it will also close down in December.

Bennett Richardson

Bennett Richardson (@bennettrich) is the president of Protocol. Prior to joining Protocol in 2019, Bennett was executive director of global strategic partnerships at POLITICO, where he led strategic growth efforts including POLITICO's European expansion in Brussels and POLITICO's creative agency POLITICO Focus during his six years with the company. Prior to POLITICO, Bennett was co-founder and CMO of Hinge, the mobile dating company recently acquired by Match Group. Bennett began his career in digital and social brand marketing working with major brands across tech, energy, and health care at leading marketing and communications agencies including Edelman and GMMB. Bennett is originally from Portland, Maine, and received his bachelor's degree from Colgate University.


Why large enterprises struggle to find suitable platforms for MLops

As companies expand their use of AI beyond running just a few machine learning models, and as larger enterprises go from deploying hundreds of models to thousands and even millions of models, ML practitioners say that they have yet to find what they need from prepackaged MLops systems.

Photo: artpartner-images via Getty Images

On any given day, Lily AI runs hundreds of machine learning models using computer vision and natural language processing that are customized for its retail and ecommerce clients to make website product recommendations, forecast demand, and plan merchandising. But this spring when the company was in the market for a machine learning operations platform to manage its expanding model roster, it wasn’t easy to find a suitable off-the-shelf system that could handle such a large number of models in deployment while also meeting other criteria.

Some MLops platforms are not well-suited for maintaining even more than 10 machine learning models when it comes to keeping track of data, navigating their user interfaces, or reporting capabilities, Matthew Nokleby, machine learning manager for Lily AI’s product intelligence team, told Protocol earlier this year. “The duct tape starts to show,” he said.

Kate Kaye

Kate Kaye is an award-winning multimedia reporter digging deep and telling print, digital and audio stories. She covers AI and data for Protocol. Her reporting on AI and tech ethics issues has been published in OneZero, Fast Company, MIT Technology Review, CityLab, Ad Age and Digiday and heard on NPR. Kate is the creator of RedTailMedia.org and is the author of "Campaign '08: A Turning Point for Digital Media," a book about how the 2008 presidential campaigns used digital media and data.
