Inside a startup’s pivot to have AI fight coronavirus
Insilico Medicine repurposed its machine learning systems to dream up treatments for coronavirus. Its CEO says it's going well — but the process isn't without hiccups.
With almost half a million reported cases of coronavirus around the world now, the race is on to develop a treatment. Alex Zhavoronkov, the co-founder and CEO of Insilico Medicine, is among those who think that artificial intelligence could help expedite a market-ready drug.
His Hong Kong-based biotech company has repurposed its platforms for the fight against COVID-19. It aims to use "a huge Lego system" of machine learning techniques for drug discovery — and, in this case, Insilico's algorithms have dreamed up tens of thousands of unique molecules that could potentially handicap the protein responsible for the virus' spread.
The company's approach isn't without risk. To increase the potential drug's selectivity and reduce side effects, Insilico is banking on noncovalent-binding molecules. Unlike other treatments, these are designed to bind to only one target: a single type of COVID-19 protein. That means there's less of a chance of these molecules succeeding, but if they do meet their goal, they'll ultimately be more selective than the alternatives.
It also isn't without hype: The company has claimed that it can dramatically accelerate the drug development process using machine learning — and that it hopes to begin testing on humans within a couple of months.
Protocol spoke with Zhavoronkov about how Insilico repurposed its platform for virus research, the setbacks the company experienced after the onset of COVID-19, and why he believes machine learning could shorten the timeline for getting coronavirus drugs into clinical trials.
This interview has been edited for length and clarity.
Tell us about the before and after for your company when it comes to the novel coronavirus. What kinds of projects and technologies were you prioritizing before?
Wow, that's a very unusual question because it's an insider's view. We spent more than $10 million on a generative chemistry platform that essentially combines many machine learning techniques together — generative adversarial networks, reinforcement learning, genetic algorithms. It's essentially a huge Lego system of machine learning algorithms that can handle multiple cases.
We were focusing primarily on cancer, fibrosis and many other diseases, and we still are — so we haven't significantly pivoted in terms of our pipeline. However, we've prioritized several projects related to COVID-19 specifically. One of the three parts of our pipeline is focused on mining massive amounts of omics data to be able to find potential therapeutic targets. We've repurposed that for viruses right now.
When news of COVID-19 hit, what was the scramble like?
In early January, we learned about COVID-19. In mid-January, we recognized that there was a problem. In late January, I brought it to my board members and investors and said, "Hey guys, let's work on this." Some of them were not very happy about it because they thought it would go away, just like SARS. (If you remember what happened with SARS, everybody who started drug discovery programs lost because the funding disappeared, and it was really not prioritized as it should have been.) We ended up deciding to use our generative chemistry pipeline — which is ready and specifically designed for these kinds of cases — on some of the most validated, best targets for COVID-19. On Jan. 28, we started the program.
How would you describe that pipeline that you repurposed for this coronavirus?
You know how people make deepfakes — those kinds of imaginative videos, which are sometimes used for all kinds of malicious purposes? There's one type where you can create photographs with specific features and properties. We do the same thing but on molecules — we create molecules with specific properties. In layman's terms, it's essentially imaginative AI. You're creating billions of examples, some virtual, and making deep neural networks imagine new molecular structures that fit into a specific protein target.
Tell us more about how you're using reinforcement learning in this context. I hear you're employing it to encourage the molecules to go after specific objectives?
One approach is when you have a crystal structure of the protein target, so you know the shape and how [the protein] looks. Once we have that, we use AI to "sniff" the shape for binding pockets where a molecule can fit. (When you fit the protein with a small molecule, it disables it.) Once it identifies those pockets, it imagines new molecules — hundreds of thousands of molecules — that could fit onto it. Those molecules do not exist in nature. They are imagined.
And afterward, we have a set of reinforcement learning algorithms that essentially make this AI imagination more robust. You imagine with a specific objective in mind; you want to have very high activity, and you want to have selectivity, and you want to have safety. We generated a bunch of compounds: 100,000 unique molecules, then narrowed it down to 100 and chose seven from there.
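The narrowing Zhavoronkov describes (100,000 generated molecules, then 100, then seven) can be pictured with a toy sketch. This is not Insilico's pipeline and not reinforcement learning itself: it only illustrates the kind of multi-objective score that could serve as a reward, followed by a two-stage cut. Every name here (`Candidate`, `reward`, `funnel`), the weights, the random property scores and the second-stage "weakest property" rule are hypothetical assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical generated molecule with three predicted properties in [0, 1]."""
    mol_id: int
    activity: float
    selectivity: float
    safety: float


def reward(c, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three objectives; the weights are arbitrary assumptions."""
    w_act, w_sel, w_safe = weights
    return w_act * c.activity + w_sel * c.selectivity + w_safe * c.safety


def weakest_property(c):
    """Score a candidate by its worst objective, so one bad property sinks it."""
    return min(c.activity, c.selectivity, c.safety)


def generate_candidates(n, seed=0):
    """Stand-in for a generative model: random property scores, not real chemistry."""
    rng = random.Random(seed)
    return [Candidate(i, rng.random(), rng.random(), rng.random()) for i in range(n)]


def funnel(candidates, first_cut=100, final_cut=7):
    """Keep the top `first_cut` by composite reward, then the top `final_cut`
    of those by their weakest property."""
    shortlisted = sorted(candidates, key=reward, reverse=True)[:first_cut]
    return sorted(shortlisted, key=weakest_property, reverse=True)[:final_cut]


# Mirror the numbers from the interview: 100,000 -> 100 -> 7.
finalists = funnel(generate_candidates(100_000))
print(len(finalists))
```

In a real system the random scores would be replaced by predictive models, and the reward would feed back into the generator rather than only ranking its output.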
How did the onset and spread of COVID-19 affect your lab processes, even as you were researching the virus?
We got into a little bit of trouble because, at that time, the epidemic was very active in Asia. And the way the drug discovery industry is organized … specifically for chemical synthesis, most people do it externally.
It's kind of like Apple, right? They don't assemble their own machines. They send it to Foxconn or some other contract manufacturer, which assembles it for Apple based on its instructions. That's what we're doing at Insilico: We don't have our own lab, but we work with more than 80 other labs worldwide, from which we order experiments.
About 90% of our chemistry is synthesized in China by contract research organizations. And it turned out that when we published our structures, it was still the middle of the Chinese New Year — and right afterward, the government extended the new year. Nobody was working. Nobody could synthesize it for us.
So we decided to publish the molecules. Previously, our generative chemistry approach was sometimes criticized for not making diverse enough molecules. This time, many medicinal chemists looked at the molecules, and they liked them.
How will you continue the process?
We've synthesized one molecule so far. Later on, we'll send it for the ultimate test: to see if it works in virus-infected cells. And after that, we need to do an animal test. Only after that can we test on humans. Technically this process, even though we've accelerated it, might take nine months to get the molecule into humans — maybe sooner, depending on how quickly we can ramp it up.
We've [also] identified a couple of other really promising targets, so we're going to publish more molecules.
You've said you hope to have something ready for clinical trials as soon as April. It usually takes more than a year to ready a prospective drug for human testing. How has AI whittled it down to months?
They'll probably be ready for human testing in a couple of months. But in about two weeks, we'll have the results from the assays, or experimental systems — we'll know whether those molecules worked. If you go the standard route, after you've proven the efficacy in vitro and in mice, then you have to apply for the phase one clinical study — and even the application takes a long time. It's more of a bureaucracy than anything else. And bureaucracy is there for a very good reason, so we aren't going to break any rules.
But when our system spits out the molecules, they are usually very druglike — they require either minimal modifications or none at all. The chances are very high that we can make the claim that they are ready because they have all the properties of a good drug. That's the beauty of AI versus everything else: We don't need to have many cycles of synthesis.
So the reason the process usually takes about a year is that you typically need many cycles of synthesis, but since you're using these new techniques, you believe they'll likely be ready in just a few months?
That's absolutely correct. Also, here we took a little bit of a riskier approach. There are two ways of designing those molecules. We know other companies are working on what's called covalent inhibitors; they're inhibitors that bind to the main protease strongly as well as many other things, so they're less selective and there might be side effects. But we designed noncovalent binders. So far, there are no known noncovalent binders for the main protease.
In our case, it's really kind of like when you bet $1 million on a boxer. You're waiting for a match. My next two weeks are that: I'm waiting on those molecules to see which one works and which one performs better.