Why audio will never capture the hearts of social media users

Stop trying to make audio happen. At least until you nail the AI-discovery and audio clipping technology.

While the podcasting industry has grown substantially, there so far hasn’t been a similar market for short-form audio.

Illustration: Boris SV/Moment/Getty Images

Audio is the white whale of social media. A TikTok- or Twitter-like platform for audio recordings sounds like a solid bet on paper. Audio is intimate and imaginative. The stakes are lower and the costs more manageable than those of recording video. Best of all, social audio appears to be new and exciting, like it’s never been done before.

“Every couple of years, a new audio social media platform emerges, excites us through its novel approach, and briefly captures our collective attention,” said Michael Mignano, co-founder of podcast platform Anchor and a partner at VC firm Lightspeed. While the podcasting industry has grown substantially, with more than one-third of Americans listening to podcasts regularly, there so far hasn’t been a similar market for short-form audio.

No one has definitively cracked the code on how to entice people, on a large scale, to engage with audio-only content like they do videos or text, and any success has been short-lived. After capturing imaginations during the pandemic, live chat platform Clubhouse receded from the mainstream. Twitter pulled resources from its Clubhouse clone, Spaces, in June.

Meta shut down its short-form audio feature Soundbites and its podcast hub in May. Startups like Shuffle, which billed itself as the TikTok for podcasts, have also shut down. Others like Snipd, an AI-based podcast app that lets users create and scroll through podcast snippets, have just started chasing the audio dream, convinced their take on social audio might have the right formula to finally take off.

Apple and Spotify, the preeminent podcasting platforms, are perhaps best positioned to experiment with social, shareable audio; Spotify has arguably come the closest with its year-in-review Spotify Wrapped slideshows. Both declined to speak on the record but pointed Protocol to blog posts about how they segment podcast episodes. In 2021, Spotify acquired Podz, a company that generates audio clips, hinting that it may invest more in the social discovery aspect of audio. But its plans are unclear so far.

Will it ever be audio’s time to shine? It’s at a disadvantage in our short-attention-span economy filled with shiny images and the never-ending scroll, founders told Protocol. Listening is an inherently passive experience, making it more difficult to sell to investors and the average user.

“We’re doing something else when we’re listening to audio, right?” said Cliff Lampe, a University of Michigan professor specializing in digital communication. “That’s not true with either reading or video because it grabs more of our attention.”

Audio has a virality problem

“Is a hit machine for audio even possible? And is it something anyone even wants?” asks journalist Stan Alcorn at the top of his viral 2014 article, “Why Audio Never Goes Viral.”

Eight years later, the question remains. Audio is still more difficult to share on the internet, despite countless startups and platforms offering solutions to the problem. It’s not because companies haven’t tried hard enough — users just aren’t that interested.

“It competes poorly with video when it comes to user-generated enthusiasm, basically,” said Brian Lamb, co-founder of education tech company Swivl. “If you’re trying to get people to author original content, they’re more likely to want to just do video.” Swivl used to offer a user-generated audio platform called Synth that has since shut down.

Audio faces an uphill battle. For one, you can’t skim it easily. You have to listen to audio linearly, making it an inefficient mode of consumption. It also doesn’t require all your attention; many people listen to podcasts while running errands, working out, or cooking. “The reality is you can consume content so much more quickly and efficiently through your eyes than you can through your ears,” Mignano said.

Because of this inherent barrier to listening, it’s harder to convince creators to invest in stellar content. Recording professional audio is hard enough at its core. Lampe also pointed out that it’s a rare skill to speak well, “especially using the cadence and emotion, reducing your number of ‘ums’ and hesitations.”

Chris Messina, social tech expert and inventor of the Twitter hashtag, said the cognitive cost for listeners creates a very high bar for audio content. “If I want someone to listen to what I have to say, I better be fucking, like, interesting,” Messina said.

Both Mignano of Anchor and Lamb of Synth started with the idea of an “audio Twitter” in which users could record original, short-form voice content. It’s not a bad idea; Twitter itself has audio origins with defunct podcast platform Odeo. But in Anchor’s case, Mignano quickly realized that while there was demand to create audio content, it hadn’t reached a critical mass. Plus, the content quality was poor without creative tools. Even a grainy, poorly recorded video can go viral. But choppy audio is far less engaging.

Once Anchor released creator tools, the social audio began to resemble podcasts. But who wants to visit Anchor, the new kid on the block, for podcast-like audio when you could get essentially every podcast from Apple or Spotify?

“We transitioned into being a really easy to use podcasting platform,” Mignano said. “With the tap of a button, instead of just publishing to your social graph inside of the Anchor app, we published it to Spotify and to Apple podcasts and all these places.” The model worked, and the switch ultimately led to a Spotify acquisition in 2019.

Lamb ran into the same problem with Synth. While the tech was sound, it didn’t have the content users wanted. Lamb pivoted from offering user-generated content to offering both manual and automatic podcast-snipping tools. Still, it didn’t resonate with consumers. Lamb shut down the consumer side of Synth a year and a half ago and officially shut the tool down for educators a month ago.

It’s niche

Kevin Smith, CEO of AI startup Snipd, said he’s not building a social audio app at the moment. Snipd lets users manually clip podcasts, automatically segments podcasts into highlights and chapters, and has a “for you” page full of clips with transcripts as the visual. Audio information can easily get lost in the big bad podcast universe. Smith wants to help listeners harness it.

“Our goal is to build an app that unlocks the knowledge in podcasts,” Smith said. “If the social aspect helps with that, then we don’t have a problem with it.”

He’s interested in tackling the barrier to discovering podcasts and making them easier to consume. Turning podcasts into bite-sized bits helps with both, and it also makes them more social: It’s easier to share audio nuggets than the whole meal. Smith says Snipd appeals to a wide range of users, and he’s optimistic it will continue to grow. He credits the latest advancements in natural-language processing for making Snipd’s mission possible. Transcribing audio has become easier, as has segmenting it into core parts.

Snipd automatically segments podcasts into highlights and chapters, and lets users manually clip podcasts. Image: Snipd

But it’s especially struck a chord with the productivity community. For obsessive notetakers, being able to embed audio snippets into a Notion- or Obsidian-based second brain (productivity speak for a note-taking system) is a win. “We didn’t expect that a certain percentage of our users would be so enthusiastic about plugging this into your second brain,” Smith said.

Finding a niche is powerful, and it can surround a product with passionate, dedicated users. But serving only a small subset of people can also hold products back from attaining mainstream popularity, the kind of success that VCs and media generally expect from ambitious startups. Synth, for instance, had a built-in user base of educators because of its parent company. It allowed teachers to incorporate student voices in class projects.

“In education, it’s a little bit different because someone’s requiring you to do it,” Lamb said. “There are very different motivators involved in capturing and creating that content.”

But Synth didn’t succeed when it came to the general consumer market. Shuffle faced a similar issue. Ada Yeo, co-founder and CEO of the now-defunct company, said the tool had early power users from tech Twitter. But the product was split between users who wanted the tool for note-taking purposes and general users who might have used it to share podcast clips on social media.

“We just couldn’t find a way to crack other communities,” Yeo said, referencing the worlds of comedy and sports podcasts, “that would lead to a more mainstream product.”

Podcasting itself is a niche form of entertainment compared to TV or movies. Anybody can create a podcast now, but in Messina’s words, this means a lot of the audio out there is “shit and probably should never have been recorded.” Excellent audio certainly exists, but blockbusters are rare. With a crowded audio market, there are fewer listeners to go around, making it harder for audio platforms to scale to the level of a social network. Maybe committing to the niche is the answer, Messina suggested.

“We evaluate [social audio] and define its success based on the size of success we’ve seen before,” he said, “as opposed to saying, ‘Actually, this can be a very healthy ecosystem, but it’s a small suburb of social media land.’”

Audio’s saving grace: AI or Spotify

If any type of audio platform were to grow into a proper social network, experts agree it would have to focus on short-form clips. This American Life rolled out a product called Shortcut in 2016 that was meant to “make podcasts as shareable as GIFs” (RIP GIFs, by the way). But it doesn’t appear to have caught on, and six years later, Shortcut is still in beta.

Smith says Snipd’s AI features may make the process of creating clips less time-intensive, while also making it more likely users see audio they like. Snipd’s AI discovery algorithm is far from perfect, but Smith said the team is working on improving it.

Messina said we have to be less precious about the way we consume podcasts. Allowing AI to chop podcasts into shareable bits makes audio easier to consume, and a social audio platform more viable. “In the future, it may become easier to remix, reshape, snip, and share those audio moments,” Messina said.

Apple is generating transcripts to help with search results, but they’re not publicly available to listeners. Spotify has transcripts only for its original shows so far. Messina thinks Spotify might be moving toward enabling social audio, allowing users to share response clips to podcasts.

Mignano, who left his role as Spotify’s head of talk in June, declined to speak about specific plans but emphasized the company’s ambitions to foster more podcast creators. “Spotify has been pretty public around its ambitions to be a platform for tens, if not hundreds of millions of creators,” Mignano said. “The company continues to do a lot of work to make it easier for nonprofessionals to make podcasts.”

While small startups may conceive of better recommendation engines, the popularization of social, short-form audio depends largely on major platforms and whether they decide to invest in supporting it. For companies like Apple or Spotify, experimenting with short-form audio makes sense. For the major social media platforms with visuals or text as the standard, audio feels more like just one feature among many.

“It certainly feels like something that’s untapped and ripe for experimentation, but social audio has come at the wrong time at the moment,” said social media expert Matt Navarra. “Twitter’s in disarray, Meta’s placing its bets on a very small number of things it thinks is going to work out.” In other words, audio is not high on many companies’ lists of priorities.

Mignano recently wrote about podcasts increasingly becoming more visual, with podcast creators releasing video segments on TikTok. Though Mignano believes in the beauty of audio, he doesn’t think it’s ready for social media prime time.

“Audio is just unique in the world of social,” Mignano said. “It’s incredibly rich, it’s intimate, it’s immersive, but it has this disadvantage.”

