The future of farming is one giant A/B test on all the crops in the world at once
John Deere subsidiary Blue River Technology is using computer vision and machine learning to make the farmers of tomorrow more efficient.
John Deere has been making farming equipment for more than 180 years, adapting and evolving with the times. But as the world's population continues to balloon and the number of farms in the U.S. falls, farmers need to be able to do more with less to maximize the amount of crops they can harvest. For years that's meant more efficient motors, tillers and even GPS systems, but the next frontier lies in automation.
Deere acquired Blue River Technology, an agricultural AI and robotics company, back in 2017, to help it on that mission. At the time, Blue River was primarily focused on trying to make lettuce growing more efficient, but Deere shifted its focus onto some of the most lucrative crops in the country — soy and cotton. Deere's goal, at least for now, isn't to replace the farmer with autonomous robots (although it is making its machines much easier to pilot), but rather to make its equipment as effective as possible to help farmers increase their yields, according to Chris Padwick, Blue River's director of computer vision and machine learning.
On Friday, Padwick published a post about the company's new "See & Spray" technology research. Using computer vision and a carefully constructed database of plant imagery, Padwick and his team have shown that it's possible to identify weeds between crops and just spray the offending plants with herbicide. It could dramatically reduce the amount of herbicide farmers need to use while putting fewer chemicals onto the food we eat and increasing the effectiveness of each harvest. And the AI system is built upon PyTorch, the open-source, machine-learning platform spearheaded by Facebook's AI research group. If Padwick's team is able to take this research and commercialize it, it could help change the way crops are grown around the world.
Protocol spoke with Padwick about his team's research, who chooses to work in the field of agricultural AI, and the role of automation in creating the crops for the future of humanity.
This interview has been edited for length and clarity.
What's it been like working with Deere since the acquisition, and how does it fit in with the work you've been doing?
When I interviewed at Blue River, I was just floored by the talent and capability that the team had accumulated. When I joined the company, the engineering team was really very tiny. At the time, the product we were working on was something called lettuce thinning. The idea is when you're doing lettuce planting, you're actually planting too many plants so that they have a higher chance of emergence, and once they emerge, you've got a problem. These plants compete with each other, so you have to go through and weed them out. Prior to Blue River, that was a manual process, but we brought an automated farming machine that was pulled behind the tractor and figured out which ones to keep, which ones to weed out. This was a pretty successful product, and at the peak of its production, if you had eaten a piece of lettuce in the U.S. there was a 20% chance that it had been touched by one of our machines. When I joined Blue River, I was totally blown away by how a really small team of engineers had built this fascinating, fantastic product. And I basically said, OK, I have to work here.
We operated the lettuce thinning machine for a fair bit of time, and then we switched gears to row-crop weeding because we saw a huge opportunity there. If you could bring a row-crop weeding solution into cotton and soybeans, the market share that you get over lettuce [was] orders of magnitude larger. It comes with a big challenge, though: we [have to weed] in the furrow or in the row. We actually call this a "green-on-green problem" where instead of discriminating green from brown (brown being soil), now you have to actually recognize differences between plants and say, "Oh, this is a soybean plant, and this is a velvet leaf plant, and I want to spray the velvet leaf, but not spray the soybean plant."
So it becomes a much harder problem. And that's where we started to embrace deep learning. We were able to show very conclusively in 2017 that you could use deep learning to build a product that would do up to 90% herbicide savings in certain fields. We also proved that you could build the whole robotics stack, and so then when we were acquired by John Deere, this was really very interesting to them. And on the other side of things, we were not experts in manufacturing. As you can imagine, manufacturing high quality, large ag equipment at scale is an incredibly difficult thing to do.
So the prospect of growing that function without a partner to do that was extremely daunting. There was a nice synergy there. And every time I give a talk at a conference, somebody says, "Nothing runs like a Deere, right?" Everybody knows Deere stuff is high quality and lasts forever. Now we can take our technology, we can integrate it on something that already exists and provide a lot of value to the customer from that perspective.
Let's talk about this project in particular. What led you to going about trying to solve weeding this way?
So when we were working on lettuce thinning, we had this dream about plant-by-plant management. The idea goes, to use a not-perfect metaphor: If I find out that somebody at school has a cold, then one strategy would be, I could distribute cold medicine to everybody in the school, and everybody gets medicine, whether they have a cold or not. Or I could just give cold medicine to the person that has a cold. When you look at the toolset that farmers have, they don't have that option right now. With broadcast spraying, you have a large sledgehammer — precision spraying like we're talking about would be a scalpel. The scalpel gives you the ability now to go and target individual plants, and try to implement a plant-by-plant management strategy. That would nearly be impossible to implement manually because the average cotton field has many thousands of seeds in it. This lends itself really well to automation.
So the idea behind Blue River was if you can optimize the treatment for each plant, then that produces a global optimization that can really help you solve the problem you have. Our roadmap includes things like crop protection and nourishment, as well. Our first target is: kill the weeds, do a great job and prove that AI can really improve your farming operation and provide a lot of value to the customer. Then once we have that in place, start to build on top of that and move into crop protection, insecticide, fungicide, and a list of things that we want to hit.
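The sledgehammer-versus-scalpel comparison in this answer can be made concrete with a little arithmetic. The sketch below is an editorial illustration, not Blue River's implementation: the grid of cells, the labels and the `herbicide_savings` helper are all hypothetical, and it assumes each cell receives one equal dose when sprayed.

```python
# Toy comparison of broadcast spraying vs. per-plant targeting.
# The field layout and labels below are made up for illustration.

def herbicide_savings(detections):
    """Return the fraction of herbicide saved by spraying only weed cells.

    `detections` is a list of rows; each cell is "crop", "weed", or "soil".
    Broadcast spraying treats every cell; targeted spraying treats only weeds.
    """
    total_cells = sum(len(row) for row in detections)
    if total_cells == 0:
        return 0.0
    weed_cells = sum(cell == "weed" for row in detections for cell in row)
    return 1.0 - weed_cells / total_cells


field = [
    ["crop", "soil", "weed", "soil", "crop"],
    ["crop", "soil", "soil", "soil", "crop"],
    ["crop", "weed", "soil", "soil", "crop"],
    ["crop", "soil", "soil", "soil", "crop"],
]

print(f"Herbicide saved vs. broadcast: {herbicide_savings(field):.0%}")
```

In this toy field, two of 20 cells contain weeds, so targeted spraying uses a tenth of the broadcast dose. Real savings depend on weed density and nozzle resolution, neither of which is modeled here.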
And for this specific project, where did the training data primarily come from for the plants you're identifying?
There are public data sets available. There's one called DeepWeeds from Australia that was collected to recognize different weed species using deep learning. If you do a little bit of research on Kaggle, you can find RGB images of soybeans. So we did like every other AI company does: We start with what's out there. The dream of how a lot of people build AI businesses is to take a public dataset, collect a little bit of data, fine-tune it and release a product. Unfortunately that doesn't work for the problem that we're trying to solve. What we've done actually is, starting in November 2016, we started building this global image dataset, that has representation from different countries in the world, different crops, and different growth stages and different weed types. One of the things that makes this really tough to solve is that weeds that occur in the Midwest don't occur in Brazil, weeds that occur in Brazil don't occur in Australia. We're trying to build a global product, which means we really need to build this global dataset.
And that's core to our strategy for solving this problem: variety in the dataset, representations of all the kinds of conditions that you can run into in the field. It's a very complicated problem because whenever you're talking about an AI system and a camera system, which keys off the visual appearance of the plants, anything that affects that visual appearance can materially affect the model result. An example is hail-damaged cotton. Hail comes through and punches a hole in the leaf, and the area where the hole is turns yellow. So all of a sudden you've got this visually different looking cotton leaf. And as AI models are famous for being able to interpolate well but not extrapolate, if that is outside of your training dataset, then you're likely to call that cotton plant a weed and not do what the customer expects.
So we're building a global image database of millions of images, and we have made sure it has appropriate representation of hail and drought and all the different things that can affect the visual appearance of the plants.
Along with that strategy comes a pretty challenging labeling problem. If you're at home and doing a tutorial on TensorFlow or PyTorch and you're building a cat-versus-dog detector, you can get basically anybody to provide labels for you. Humans are really good at telling cats from dogs — you don't need a specialized workforce. You could spin up a Mechanical Turk job and get some reasonable labels. With weeds though, it's a lot different. Often weeds and crops look the same. An example here would be cotton and a velvet leaf or morning glory. Those are two types of weeds. Sometimes we call those "mimic weeds" where they actually look a lot like the thing we're trying to protect. So, that makes it hard because now the labeling staff needs to understand, "Oh, this is a cotton plant, and this is a morning glory plant."
So we need a specialized labeling force. We've hired a set of agronomy folks. These are people, sometimes with PhDs in the field of weed science, and they've been looking at pictures of weeds and plants for many, many years, they've all walked the fields in the summers, doing a lot of agronomic studies and research. These people really know how to tell the difference between these kinds of plants. With any AI project, if you get garbage data in, garbage out follows pretty easily. So we need to make sure that those labels are very high quality going into our product.
Blue River's AI discerning between cotton plants (green) and weeds (red). Blue River Technology/Medium
Are there any plants in particular that are harder to distinguish than others? Are there other crops where doing this work would be harder because of this distinguishing work that you'd need to do?
There definitely are. In the U.S., particularly in the Midwest, we have a very big problem with herbicide-resistant weeds. If you're a farmer and you think of the tools that you have in your toolbelt, you're missing a tool to go and attack these weeds. Pigweed is public enemy number one in a lot of places in the U.S. It's almost entirely herbicide-resistant. There are some exceptions, but that's largely true, especially if it gets above a certain size, it becomes extremely hard to kill. The other factor working against you is some species of these pigweeds can have up to a million seeds in the seed bank. There are stories of farmers going out of business because they couldn't control their herbicide needs. In terms of which weeds are more important, when [farmers] see a pigweed, they absolutely must kill 100%. You cannot miss a single one of these. Grasses, on the other hand, are potentially less of a problem depending on which part of the country they're in.
So if you were thinking about like, what's the performance on pigweed versus grass? I'm willing to actually let a little bit of grass go if I'm a farmer, because I know that that's not going to affect my yield, as long as it doesn't get out of control. But boy, one pigweed, you got to kill that. That's what our customers have told us about the problem. And so I would say there's a category of weeds where we have to have really high performance and a category where we still have to have high performance, but maybe the bar is not quite as high.
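One way to picture the two performance bars described here is as per-class decision thresholds. The sketch below is an editorial illustration only: the threshold values, the class names as dictionary keys and the `should_spray` helper are assumptions, not anything the source says Blue River uses.

```python
# Hypothetical per-class confidence thresholds: a lower threshold means the
# sprayer acts on weaker evidence, trading extra herbicide for fewer misses.
SPRAY_THRESHOLDS = {
    "pigweed": 0.10,  # must-kill weed: spray even on low confidence
    "grass": 0.60,    # lower-priority weed: tolerate a few misses
}


def should_spray(class_scores):
    """Decide whether to spray given model scores per weed class.

    `class_scores` maps a weed class name to a score in [0, 1].
    Classes without a configured threshold are never sprayed.
    """
    return any(
        score >= SPRAY_THRESHOLDS[name]
        for name, score in class_scores.items()
        if name in SPRAY_THRESHOLDS
    )


print(should_spray({"pigweed": 0.15, "grass": 0.05}))  # True
print(should_spray({"pigweed": 0.05, "grass": 0.40}))  # False
```

Setting the pigweed threshold low pushes recall up on the weed that cannot be missed, at the cost of occasionally spraying a plant that turns out not to be pigweed.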
Why did you choose to use PyTorch?
We started this project in 2017 and the first product that was available at the time was Caffe. So the first version of the product actually used Caffe to prove out the technique. And then after that, we switched to TensorFlow for a short time and worked with TensorFlow for a season. But we found that PyTorch has really emerged as the favorite of researchers recently. People feel that it's more like pure Python, so it's a little bit easier to develop and a little bit easier to debug. There are a lot of people embracing it in the community. If we go to the Papers with Code website, a lot of those submissions are already in PyTorch.
From a research point of view, that's advantageous to us because now we say, "Oh, I don't have really any ramp-up time to try this out. I can just try it out." We've also found that getting models into production is pretty straightforward with PyTorch as well. Because we're working with the Nvidia Xavier on the edge computing devices, [Nvidia's] TensorRT is going to be part of our stack. No matter what training tool we use, it's going to end up as a TensorRT output. And PyTorch we find integrates pretty well with that. We have an intermediate step where we convert to an ONNX model, and then convert down to a TensorRT model.
We found a lot of flexibility from the framework, and I think the documentation is really well done. Also, when I hire a new person and they may not have worked with PyTorch before, I find the ramp-up time is very short. It's helpful for us to bring somebody on very quickly, and get them productive.
You've said the Nvidia module you're using on tractors is comparable in power to an IBM supercomputer from 2007. Do you think in another 13 years you'll be able to do this sort of sensing work from the cloud?
That's a really interesting thought. I guess we'd have to solve the connectivity in the field, which I think is a very hard problem for us to solve. I'm looking forward to Starlink as a potential way to address the current state of wireless connectivity in the rural U.S. Any solution we're talking about today that requires a round trip to the cloud is basically out of the question.
5G is definitely going to help this, but that's going to take a long time to roll out. And the rollout to rural areas will probably be a little slower than to the urban areas. I feel that the technology for the foreseeable future is going to live on the edge, making these real-time decisions. But the huge power of our network is having a data-collection strategy where we are collecting and tagging things in real time, and then uploading only the interesting things so that those things go back to our headquarters and we can include them in training. Andrew Ng calls this "the virtuous cycle of AI," where you're constantly learning from your model, constantly improving it.
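The "upload only the interesting things" idea can be sketched as a simple on-device filter. The interview does not say how "interesting" is defined, so the example below assumes one common heuristic, keeping the frames where the model is least sure. The `select_for_upload` helper, the confidence band and the upload budget are all hypothetical.

```python
def select_for_upload(frames, low=0.35, high=0.65, budget=2):
    """Pick the frames whose model confidence is most ambiguous.

    `frames` is a list of (frame_id, weed_confidence) pairs. Frames with
    confidence inside [low, high] are treated as "interesting"; at most
    `budget` are kept, most ambiguous (closest to 0.5) first.
    """
    uncertain = [(fid, conf) for fid, conf in frames if low <= conf <= high]
    uncertain.sort(key=lambda item: abs(item[1] - 0.5))
    return [fid for fid, _ in uncertain[:budget]]


frames = [("f1", 0.98), ("f2", 0.52), ("f3", 0.03), ("f4", 0.61), ("f5", 0.45)]
print(select_for_upload(frames))  # ['f2', 'f5']
```

Confident frames ("f1", "f3") stay on the machine, which keeps bandwidth use low on a poor rural connection while still sending back the examples most likely to improve the next round of training.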
The work you're doing right now is automating an extremely complicated process, but there's still a human in the loop, driving the tractor. How long do you think it will be before this entire procedure will be autonomous?
We've got some initiatives within John Deere Blue River that I can't talk about, but they are very exciting. I do think the industry is definitely moving towards automating when it makes sense. My personal opinion is when you're talking about weeding or really talking about any AI product, the customer has to embrace that technology: they have to try it out and feel good about it. And then they can talk about automating it. Rather than trying to jump right to automating something, I feel like you have to build that trust with the customer and you have to help them to understand how this is going to help them first. That's largely been our approach on this. I see it as a rollout where it's eventually going to get there. It is going to take some time and there are going to be some concrete steps along the way.
The way I think about this problem, a see-and-spray is going to be one of those concrete steps where there's going to be an automated system on your sprayer that's making real-time decisions and helping you save money, helping you reduce herbicide. And you're still getting a really high level of weed control. Once we have proven that out, there's different things we can do — we can make the product start to protect your crop a bit more. We can detect insect damage, we can give you a map of weed species in your field, we can maybe do things like stand count. We can build some yield predictions off of that stand count. There's a tremendous amount of things that we unlock just by having this technology. So my personal opinion is that we want to bring autonomy to the farm as it makes sense. And we start with making better decisions, optimizing these decisions and showing value in that. And then we talk about automating. So I see it as a staged process.
How much variability do you think AI can remove from nature?
We've actually done some thinking on this. So let me run a scenario by you here that I personally think is really exciting. I've talked about the see-and-spray, but sometimes it's nice to think of that as a scanner. What if you could take a scanner into your field, and what if we could take that scanner in two or three times a season and then upload the results? And then what if you could do that for a thousand machines across the U.S. and Brazil and Europe and Australia? You get this tremendously valuable dataset where now you could really make some interesting introspections, right? You can actually do things like A/B testing without designing an experiment, because you can say, "This farmer in the Midwest, he had this herbicide regimen with this rate and this application frequency, and over in this other area that's very close, this person had a different herbicide, and whose yield was better?" You can start to mine these things and say, "Hey, actually we can discover and recommend the optimal farming procedures here."
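The "A/B testing without designing an experiment" idea amounts to comparing outcomes across nearby farms that happened to use different treatments. The sketch below is an editorial illustration under stated assumptions: the record fields, the `compare_regimens` helper and every number are made up, and grouping by region is only a crude stand-in for controlling soil and weather.

```python
from collections import defaultdict
from statistics import mean


def compare_regimens(records):
    """Average yield per herbicide regimen within each region.

    `records` is a list of dicts with keys "region", "regimen", "yield".
    Comparing only within a region is a crude way to hold soil and weather
    roughly constant; it does not remove other confounders.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        grouped[rec["region"]][rec["regimen"]].append(rec["yield"])
    return {
        region: {regimen: mean(ys) for regimen, ys in by_regimen.items()}
        for region, by_regimen in grouped.items()
    }


# Made-up numbers, for illustration only.
records = [
    {"region": "county_a", "regimen": "A", "yield": 50.0},
    {"region": "county_a", "regimen": "A", "yield": 54.0},
    {"region": "county_a", "regimen": "B", "yield": 58.0},
    {"region": "county_a", "regimen": "B", "yield": 60.0},
]

for region, averages in compare_regimens(records).items():
    best = max(averages, key=averages.get)
    print(region, averages, "best:", best)
```

Because nobody assigned the regimens at random, a gap like this is a lead to investigate, not proof that one regimen causes higher yield. A production analysis would need many more farms and explicit handling of confounders.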
We tend to think of a very Silicon Valley way of describing farming, which you'd probably only hear in Silicon Valley, which is the "farming stack." Right. What does the farming stack look like? You've got tillage, planting, weeding, harvest. What if you went in and did that scanning and optimization thing to every piece of that stack? What would you be able to do? Maybe you could make incremental percentage savings on the tillage or optimizations that might translate into optimizations on the planting that might actually materially affect your yield. Maybe you bumped your yield up by optimizing everything. And then if you look at the yield predictions over a large scale, maybe it becomes actually quite a big number. So that's where I see AI is really just collecting that data set and doing data analysis and insights off that tremendous data set. That's how we've been talking about how AI fits into farming in particular, and how this could bring value to the farmer.
And I think it's a pretty exciting concept because if you think about what companies could do this, Deere is one of the companies that could, because we've got such a huge installed customer base. If we could bring technology into each of these well-established practices, then I see a tremendous amount of value from that.
It inspires us a lot at Blue River, because we feel like we're working on important problems that are relevant and helpful for humanity. That actually is a little bit of my hiring pitch when I interview new engineers. People come to us because they feel a desire, a responsibility, to the rest of the planet.