How IPFS is building a new internet from the ground up

Molly Mackinlay on Web3, web apps, content moderation and building an internet that lasts.

The IPFS team is building a new kind of internet, memes and NFTs and all. Photo: Protocol Labs

Molly Mackinlay loves the music app Audius, a decentralized tool that is trying to rethink the way artists own their music and interact with fans. She's a big believer in NFTs, and is looking forward to a world where everything from houses to cars is sold and tracked through tokens. And she's definitely excited about the metaverse, as long as it's "crazy and open and enables all sorts of creation, which doesn't come from one single company running the metaverse."

In her day job at Protocol Labs (no relation to this publication), Mackinlay spends her time building the infrastructure that will enable all of that. She oversees IPFS, the underlying protocol that could be the future of how data moves between devices, networks and even planets. It's a job that requires wrangling thousands of developers and projects, prioritizing many different ideas about how the future of the internet should work, and trying to convince everyone to jump on board with the decentralization movement.

Mackinlay joined the Source Code podcast to discuss her vision for the future of the internet, what it takes to build an internet that never breaks or crashes and the opportunities Web3 holds for companies new and old, big and small.

You can hear our full conversation on the latest episode of the Source Code podcast. Below are excerpts from our conversation, edited for length and clarity.

Before you were at Protocol Labs, if I remember right, you were at Google. And then you left Google to come do this wacky, decentralized internet thing. Why? How did you get started working on this stuff?

I started participating in the IPFS open-source community back while I was still at Google. I was on the Google Education team, and I knew some of the folks who were brainstorming up the beginnings of IPFS from college. So I was hearing about this at the same time that I was working on the Google Classroom team, building tools for teachers and students.

I was the PM designing and building mobile apps, and was thinking about things from an offline perspective. Mobile phones, more so than the web product of Classroom, have this ability where you can take them all around with you and use them offline. And so the question was, what could we do from an offline perspective? In observing classrooms and talking with teachers and students, we kept running up against, well, it'd be really great if a teacher and student who are in the same physical classroom could, say, turn in an assignment the way they currently do with paper, which makes perfect sense. But when you go to the digital world, the way we've architected the internet and all of these technologies we depend on every single day requires you to sync all of your data to some distant servers somewhere in the ether. And then, hopefully, those servers sync all of that content back to your teacher.

If your school internet is terrible, which it is almost everywhere in the world, you really struggle with that. And so students were spending like five, 10 minutes of their class time just trying to turn in assignments and sync documents and download videos, and they're all trying to do it at the same time. The exact same document, the exact same video. It's like, this is so broken, the web should work differently.

And that's when I was talking about these problems with some of the folks in the IPFS community. And they're like, "this is exactly our vision for how the internet should work. It should be local-first, it should be peer-to-peer, you should be able to collaborate with any other person, any other node in the network, directly, without having to go through an intermediary." It would enable a school with terrible bandwidth to sync a piece of content once and then distribute it to all the peers in the building and collaborate on it together. That would be a fundamentally new way for the internet to work that would just unlock a whole new set of capabilities.

Going to Mars ... you're not going to want to wait 14 minutes to fetch every document or every website or every collaborative edit of a Google Doc.

And even more futuristic, as you think about us becoming a space-traveling civilization in the next couple of decades, going to Mars, going to the moon, setting up bases there, you're not going to want to wait 14 minutes to fetch every document or every website or every collaborative edit of a Google Doc. You'd want to be able to do things local-first; you wouldn't want to have all of these intermediaries and latencies.

Wait, so IPFS standing for Interplanetary File System is not a joke or tongue-in-cheek. It's for real.

It's an homage to [J.C.R.] Licklider's Intergalactic [Computer] Network, which was an early name for the internet. But it very much is at the core of it. We're not there yet, but when you really stretch it: What would this entail? How is the human race going to evolve? How do we want this layer that we all depend on, and build our core capabilities on top of, to evolve? What primitives do we want it to have?

Some of those primitives are user agency, peer-to-peer collaboration and hyper-resiliency to all sorts of faults and issues. We can never predict what sort of natural disaster or otherwise is going to disrupt our systems. And so we need to make sure that humanity's information doesn't all get centralized in the Library of Alexandria and then burned. We want it to be hyper-distributed, replicated, resilient across many, many different places throughout all of the galaxies that humans eventually colonize.

Before we get too deep in this, just give me the basic, kindergarten-level explanation of what IPFS does.

The way I explain IPFS is that it takes the internet model of fetching content and finding data, and instead of addressing everything by its location — what entity in the network is hosting this, facebook.com hosts this picture — it tries to address things by what they are. By the content itself. So a picture, when hosted by anyone, would have the same address, and anyone would be able to fetch it. And they would know that this is the exact picture they were looking for. And they could do that from anyone in the network. They don't have to trust Facebook, who says "yes, this is the picture you were looking for." They can do it from their neighbor, or some random stranger on the internet, and that's all totally fine.

That shift in model unlocks a whole set of new capabilities. What it relies on is peer-to-peer networking and content addressing: being able to actually network computers to talk to each other directly, and then enabling them to address the content they're looking for by its cryptographic hash, the digital fingerprint of the file.
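To make that concrete, here is a minimal sketch of content addressing in Python. It is not IPFS's actual code (IPFS chunks files and wraps hashes in CIDs); it only illustrates the core idea that an address is computed from the bytes themselves, so any host serving the same bytes produces the same address:

```python
import hashlib

def content_address(data: bytes) -> str:
    # The address is derived from the bytes themselves,
    # not from which server happens to host them.
    return hashlib.sha256(data).hexdigest()

# The same picture stored by two different "hosts".
copy_on_facebook = b"...the exact picture you were looking for..."
copy_on_neighbor = b"...the exact picture you were looking for..."

# Same bytes, same address, regardless of location.
assert content_address(copy_on_facebook) == content_address(copy_on_neighbor)
print(content_address(copy_on_facebook))
```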

A great analogy that a friend of mine uses is, imagine that you're trying to find a book. And instead of telling you what the book is, and letting you identify whether or not you've already read it or have it with you or anything like that, they tell you "well, this book is located in the New York Public Library, on the third floor, fourth shelf from the right, top row, three books in." OK, now you have to go there physically in order to identify what the book actually is. You have to travel to New York, you have to wait until the library is open, you have to get a pass, you have to go in. And then you have to look at the book, and you're like, "Oh, crap, I had this in my backpack the whole time." When you use location addressing, which is kind of how the web was initially built, you don't have the capabilities that we wish it had today.

And so at the core of it is switching the web from location addressing to content addressing, which unlocks being able to fetch content from anyone. And IPFS is a kind of file system and set of tools built around making that really easy for building new applications. It has file system-like access, so you can add files and fetch files from different peers in the network, all using this technology.
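The library analogy also suggests why fetching from strangers is safe: whatever a peer hands you either hashes to the address you asked for or it doesn't. Below is a toy sketch of that verification step, with a made-up Peer class standing in for real peer-to-peer networking; this is not the IPFS API:

```python
import hashlib

def content_address(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class Peer:
    """Hypothetical stand-in for any node that can serve bytes."""
    def __init__(self, store):
        self.store = store

    def get(self, address):
        return self.store.get(address, b"")

def fetch_verified(address, peers):
    # It doesn't matter who answers: the bytes either hash to the
    # address you asked for, or they get rejected.
    for peer in peers:
        data = peer.get(address)
        if content_address(data) == address:
            return data
    raise LookupError("no peer returned matching content")

book = b"the book you were looking for"
addr = content_address(book)
stranger = Peer({addr: book})             # a random stranger on the internet
liar = Peer({addr: b"a different book"})  # a peer serving bad data

assert fetch_verified(addr, [liar, stranger]) == book  # the liar is ignored
```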

The thing that comes up the most often when people talk about IPFS is permanence: You can't lose data. You can't censor data. It lives beyond any individual person's control. And I think that's really powerful. Are there other sort of core tenets of what you're trying to do that are on that same level? Or is everything kind of a knock-on effect from that permanence?

To me, "user agency" is actually the term I use more than permanence. Permanence is really important from an addressing perspective, but when you frame it from the lens of user agency, it's really about giving you control over your own experience browsing on the web. That implies permanence, because if you want to have a piece of content on your machine, you want to host it, you want to access it, that's your node's prerogative. But it doesn't necessarily mean that you can force someone else to store content on the internet that they don't want to. And it means that if you don't want to load content that some other node happens to be storing, you don't have to.

It's really about giving you control over your own experience browsing on the web.

Trying to really put the tools and control in the hands of the individual node operator is one of the primitives I think is really important. And what that implies around permanence and censorship is that we as a society become more flexible. It leans more into things like freedom of speech, where we're trying to navigate that effectively with each other as peers, and where you as a node operator are responsible for the speech you make within the network.

One of the criticisms people have of systems like this is that you can only support them if you're a free speech purist who believes everything should be available to everyone, and nobody should be able to take anything down. How do you think through that tension in your head, between good governance and taking care of people versus this core belief in free speech?

It's a tricky subject, right? I think there's some really, really valuable work being done by groups like the Element team at the Matrix Foundation, where they're working on distributed content moderation schemes, whereby you as an individual or a collection of individuals can help filter your own content. You can subscribe to the set of filter lists you want on the internet. Maybe you hate seeing pictures with the color blue, for some reason, and so you could subscribe to a filter list that's like, "no blue images, blue images make me sad, I don't want any of those!" And you can curate the set of things that fall within that filter list, and your experience of the web is tuned for what you as a node operator want.
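As a rough illustration of the subscribe-to-filter-lists idea (not how Matrix's moderation tooling actually works), each list can be modeled as a set of content addresses its curators have flagged, and a node simply declines to display anything on a list it has opted into:

```python
# Invented example lists: sets of content addresses flagged by curators.
blue_images = {"addr_blue_photo_1", "addr_blue_photo_2"}
spam_pages = {"addr_spam_page"}

class NodeView:
    """A node's view of the network, filtered only by lists it chose."""
    def __init__(self):
        self.subscriptions = []

    def subscribe(self, filter_list):
        self.subscriptions.append(filter_list)

    def should_display(self, address):
        # Moderation is applied by the viewer, not imposed upstream:
        # the content still exists; this node just declines to load it.
        return not any(address in fl for fl in self.subscriptions)

me = NodeView()
me.subscribe(blue_images)  # "blue images make me sad"

assert not me.should_display("addr_blue_photo_1")
assert me.should_display("addr_spam_page")  # didn't subscribe to that list
```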

I think where you really don't want to go is having those things applied on top of the user, restricting their own way of accessing the internet. That's how you get into all sorts of really bad forms of content moderation that we've seen play out super poorly in the Web2 world: a centralized group that's now put in the position of having to moderate all policy on the internet across every global border. That's non-trivial, and probably shouldn't be in the hands of, say, YouTube's centralized content moderation team to make all of those decisions for everyone around the world.

I think it's going to evolve to effectively look like this: You as a node operator, as someone who's publishing content, can always take that content down. No one can force you to keep content around. And you can absolutely delete content from the network; that's totally possible. But when it comes to any central node being able to black out a highly desired and valuable and useful piece of content from all nodes across the internet, that's also probably not something we want to put in the hands of one central party. So it's a balance: how do we build up the right set of tools that enable groups to self-moderate content as it relates to their network, without giving them the reins to black out everyone's, you know, Wikipedia pages with words and names they don't want included.

You're making me realize that you basically have to totally rethink what apps and platforms mean and look like. Do you find yourself meeting with people and having to pick up pieces of their brain off the floor, 10 minutes into your conversations?

And pieces of my own brain!

I remember in my first couple months on the IPFS team, sitting down and thinking about, OK, when I first got started in computer science, one of the easy things to do was get started building little games. So a little chess app, a little Sudoku app: how would I do that in a peer-to-peer way over Web3? OK, chess is relatively easy. You just alternate between nodes being able to make moves. And our nodes would be connected to each other. But if you went offline, and you wanted to rejoin, how would you find the other node? If you wanted to have a timer, whose version of time would we be using? There's actually an entire blockchain for this now that gives you a decentralized time-stamping service.

You have to think about how you architect your applications in a different way when you move into Web3. Back then, if I wanted decentralized time-stamping, I would have had to come up with my own solution to that problem. Now there's a really easy API I can just subscribe to.
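Mackinlay doesn't name the service, so what follows is only a sketch of the underlying idea: hash-chaining the moves makes their order tamper-evident without trusting either player's clock, and a decentralized time-stamping service additionally anchors such hashes somewhere public to prove when they existed.

```python
import hashlib

def link(prev_hash: str, move: str) -> str:
    # Each entry commits to the previous one, so the sequence of
    # moves is tamper-evident without trusting anyone's wall clock.
    return hashlib.sha256((prev_hash + move).encode()).hexdigest()

GENESIS = "0" * 64
h1 = link(GENESIS, "e4")  # white's move
h2 = link(h1, "e5")       # black's move commits to white's

# Either player can verify the whole history by replaying the chain.
assert link(link(GENESIS, "e4"), "e5") == h2
```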

You have to think about how you architect your applications in a different way when you move into Web3.

But that's the layer of additional thinking about the problem and additional infrastructure that needs to be created so that it's really easy for an application developer to lift their brain from the Web2 side of the world and transfer it over to the Web3 world without scattering some pieces. It definitely takes some getting used to.

That's one of the reasons you see so many Web3-native applications growing from scratch within this new area of the world: it's difficult to transition a massive enterprise over into this new model. I think some groups have been more successful at growing new endeavors, like the Microsoft ION group, where they're growing a new Web3 identity initiative. But they're not trying to just transition an existing Web2 model and presume that all the Web3 primitives are going to map one-to-one with how they would have designed something in Web2. It definitely still takes some work, and the work we're doing every day is making it more accessible and easier for people to make that transition.

Have you spent time talking to folks at the Googles and Facebooks and Microsofts of the world about what all of this looks like? A lot of people are going to start to ask, "How do I build a $100 billion business out of this?" Which is a complicated question. And other people are going to say, "I already run a $100 billion business. What am I supposed to do with Web3?" Are the answers to that getting better on your end?

Yeah. I mean, the homegrown businesses within Web3 are absolutely scaling. OpenSea is now what, a billion dollars per month or something crazy? There are definitely businesses that are homegrown within Web3 that are quickly scaling to top some of those lists and be very interesting from a traditional Web2 scale perspective.

When the conversation comes to large, existing businesses, the feedback I give is to start understanding the space first. Don't start with, "great, we're going to transition an existing business directly on top of this network." Start with, what are the components of your business that make the most sense?

We've actually been working with a ton of super large institutions who have massive datasets that they want to be hyper-resilient against all sorts of faults, and stored cheaply around the world, to utilize things like IPFS and Filecoin for that purpose. Groups like the Internet Archive, the USC Shoah Foundation, the Max Planck Institute and others. Those are big, brand-name institutions who are coming over into the space and seeing the value it can provide to them, but they're not just taking an existing cookie-cutter application without rethinking some of the interface points, or how to make the best use of this technology for their purpose.

Personally, I think that the pathway for big businesses into Web3 will be one that charts the incremental value that they get at each point. The first step, working with someone like Netflix, is looking at their build systems, and how you offer incremental value in decentralizing a build system using content addressing. Great, that makes incremental sense for a piece of the business. And then you go from there step by step until voila, you have Web3-ized all of the core pieces that the entire business is built upon.
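As a toy illustration of that first step (Netflix's actual systems aren't described in the interview), a content-addressed build cache keys artifacts by the hash of their inputs, so identical inputs are built once and the result can be served by any node that already holds it:

```python
import hashlib

# Toy content-addressed artifact cache: outputs are keyed by the
# hash of their inputs, so any node holding the artifact can serve it.
cache = {}

def build(source: bytes) -> bytes:
    key = hashlib.sha256(source).hexdigest()
    if key in cache:
        return cache[key]          # already built somewhere: reuse it
    artifact = source.upper()      # stand-in for a real compile step
    cache[key] = artifact
    return artifact

first = build(b"fn main() {}")
second = build(b"fn main() {}")    # cache hit: nothing rebuilt
assert first is second
```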

To the point about getting more people on board, it seems like the next key phase of this is to get beyond the enthusiasts, who are willing to do the work and are interested in this on its own merits. There's this big group of people who think this stuff is interesting … because they think it's interesting. And then there's a much bigger group of people who just absolutely could not care any less. And that's ultimately the group of people you have to get, right? They just want to check their email, or listen to music or find a document. How are we doing at starting to move toward some of those people?

I would personally classify myself as someone who does not fall into the "thing, but decentralized" camp. The reason I got excited about it in the first place was the capabilities it offered. I want teachers to be able to return assignments to students in the same classroom. That's not "a thing, but decentralized," that's a new capability that should be part of the web. And we should just make it freaking work.

That's not "a thing, but decentralized," that's a new capability that should be part of the web. And we should just make it freaking work.

A lot of folks who are in the ecosystem already and coming to the ecosystem today, they do see new capabilities that are coming out of the decentralization layer that weren't there before. And they're coming for those capabilities, and for the opportunities and businesses that can be built bringing those opportunities to the masses.

A lot of the groups who are building applications in the Filecoin/IPFS space today are leaning into what is possible. It's not just "x but decentralized," it's "x but no one can intermediate the music creator from their audience," or "we can collaborate offline on our Google Doc without needing any other node," or "we can store massive amounts of video data for teachers who are teaching in India."

Those are capabilities, and people who see how to compose these new primitives into unlocking those capabilities are being successful in the Web3 space today. And I think folks who got into it in the early days are being augmented by a much larger cohort of people who are getting into it to demonstrate how that makes life better for everyone. So yeah, I think that wave has already begun. And I'm really excited to see it continue.

We're not yet at the late majority of folks who are like, "I don't want to have to do any work, and I don't want to have to invent the future, just give me the thing that already works." We're still in the builder phase. But it's not just the decentralization, it's builders who want to change the world.
