Is This Virtual Worm the First Sign of the Singularity?
For all the talk of artificial intelligence and all the games of SimCity that have been played, no one in the world can actually simulate living things. Biology is so complex that nowhere on Earth is there a comprehensive model of even a single simple bacterial cell.
And yet, these are exciting times for "executable biology," an emerging field dedicated to creating models of organisms that run on a computer. Last year, Markus Covert's Stanford lab created the best ever molecular model of a very simple cell. To do so, they had to compile information from 900 scientific publications. An editorial that accompanied the study in the journal Cell was titled, "The Dawn of Virtual Cell Biology."
In January of this year, the one-billion-euro Human Brain Project received a decade's worth of backing from the European Union to simulate a human brain in a supercomputer. It joins Blue Brain, an eight-year-old collaboration between IBM and the Swiss Federal Institute of Technology in Lausanne, in this quest. In an optimistic moment in 2009, Blue Brain's director claimed such a model was possible by 2019. And last month, President Obama unveiled a $100 million BRAIN Initiative to give "scientists the tools they need to get a dynamic picture of the brain in action." An entire field, connectomics, has emerged to create wiring diagrams of the connections between neurons ("connectomes"), which is a necessary first step in building a realistic simulation of a nervous system. In short, brains are hot, especially efforts to model them in silico.
But in between the cell-on-silicon and the brain-on-silicon simulators lies a fascinating and strange new project to create a life-like simulation of Caenorhabditis elegans, a roundworm. OpenWorm isn't like these other initiatives; it's a scrappy, open-source project that began with a tweet and that's coordinated on Google Hangouts by scientists spread from San Diego to Russia. If it succeeds, it will have created a first in executable biology: a simulated animal using the principles of life to exist on a computer.
"If you're going to understand a nervous system or, more humbly, how a neural circuit works, you can look at it and stick electrodes in it and find out what kind of receptor or transmitter it has," said John White, who built the first map of C. elegans's neural anatomy, and recently started contributing to the project. "But until you can quantify and put the whole thing into a computer and simulate it and show your computer model can behave in the same way as the real one, I don't think you can say you understand it."
For example, when researchers touch a worm on the head and it responds by turning and moving backwards, what exactly is happening there? What molecular mechanisms coordinate the firing of neural networks that initiate and complete this complex behavior? This month, a paper came out in PLOS Biology describing that exact sequence as recorded in live C. elegans. But it's one of very few studies like that.
More broadly, OpenWorm raises fascinating questions about what we mean when we say something is alive. If and when this project succeeds in modeling the worm successfully, we'll be faced with a new and fascinating concept to think with: a virtual organism. Imagine downloading the worm and running it in a virtual petri dish on your computer. What, exactly, will you be looking at? Will you consider it to be alive? What would convince you?
Perhaps creations like the digital C. elegans will start to break down our binary conception of the matter in the world as either living or not living. We'll discover that we can create systems that exist in between these two spheres, or that certain aspects of life as we know it are not required to meet our definition of being alive.
"I suspect that we'll recognize that living systems are far-from-equilibrium molecular systems that are carrying out very specific sophisticated physical patterns and have some ability to sustain themselves over time," OpenWorm's organizer Stephen Larson wrote to me. "Thinking about it that way makes me go beyond a black and white notion of 'alive' to a more functional perspective -- living systems are those which self sustain. Our goal is to aggregate more of the biological processes we know that help the worm to self-sustain than have ever been aggregated before, and to measure how close our predictions of behavior match real living behavior, more than it is to shoot for some pre-conceived notion of how much 'aliveness' we need."
It's a complex, ambitious project, to say the least. White called it "bold." Yet it all began with a tweet.
In early 2010, software engineer Giovanni Idili sent a tweet to the Twitter account for The Whole Brain Catalog, a project to bring mouse brain data together into more usable formats. He said, as if on a lark, "@braincatalog new year's resolution: simulate the whole C. Elegans brain (302 neurons)!" One of the Brain Catalog's founders, Stephen Larson, was scanning the @-replies and offered his assistance, "So, do you want any help with that? How are you going to do it?"
Beginning with a 1997 proposal at the University of Oregon, there have been several attempts to simulate worms. Some focused on the body alone. Others tried to simulate the worm's behavior through machine learning, with no attempt at a biologically realistic nervous system. Idili and Larson wanted to go beyond these early efforts. When Larson was at MIT, he was influenced by Rodney Brooks, the director of the Computer Science and Artificial Intelligence Laboratory at the university (and the creator of the Roomba!). Brooks proposed the idea that if you want artificial intelligence, it should be situated within an environment. In his 1990 paper, "Elephants Don't Play Chess," he argued that "to build a system that is intelligent it is necessary to have its representations grounded in the physical world."
The great thing about C. elegans, though, is that its physical world in the laboratory is completely standardized and well known. The worms live in petri dishes with agar. If any environment can be modeled by a computer, it is a petri dish with agar. The nascent OpenWorm team could build a realistic virtual environment for a digital C. elegans.
Which meant that their little worm brain -- the target of Idili's initial suggestion -- needed a body. For that, they reached out to Christian Grove at Caltech, who donated a 3D atlas of the worm to get them started.
They had a map of the brain, a model of the body, and a pretty good idea of how to build the environment. Their artificial intelligence might not be embodied, but it would be "situated." The brain would direct the body and the body would interact with the environment, and all three pieces would be connected by the intricate feedback loops that permeate biology.
Their goal became clear: they should build, as they put it on the website, "a fully digital lifeform -- a virtual nematode -- in a completely open source manner."
Three years and 31 Google Hangouts later, OpenWorm is a going concern with Larson at the helm and a team spread across the continents. Alexander Dibert, Sergey Khayrulin, and Andrey Palyanov contribute software development from Russia, along with Matteo Cantarelli in the UK and Timothy Busbice in California. Neuroscientists Mike Vella and Padraig Gleeson are stationed at Cambridge and University College London, respectively. And of course, Idili in Ireland and Larson in San Diego. There is no central lab, nor could there be.
The OpenWorm team has broken down this immense task into five component systems. First, at the base of the project, they have a list of the 959 cells in the C. elegans body. The list includes a rough idea of what each of the cells does, thanks to decades of research on the worm. Then, they've got a life simulation engine they call Geppetto (shout out to Pinocchio!), which is the platform on which all the other software runs. Third, there is the simulated physical body. They are creating an algorithm for worm mechanics that can generate realistic muscle movements. Fourth, they have an electrical model for the muscles. What are the signals that they send and receive to move the animal? Last but not least, they must animate the connectome, the wiring diagram for the worm's nervous system.
The team has been making steady progress, but being at the leading edge means they are also the first to run into the problems that any effort to simulate a brain is going to have.
For an outsider and non-biologist, simulating the C. elegans brain seems like it should be relatively easy. You've got the map of the neurons. You know where all the cells go in the body of the worm. You know how it behaves under all these experimental conditions. What's so hard about simulating its behavior?
Basically, everything.
We don't know how to simulate every single protein and nucleic acid in a cell. And even if we could, it would be computationally staggering to try to model each and every cell in the worm down to that atomic level, figuring out each and every molecular interaction inside these densely packed cells. No experiments can output that data.
You could eschew biological realism entirely. It would be relatively trivial to create a CGI worm that *looked* realistic. Perhaps one could make it behave realistically by running machine learning on worm behavioral data in particular situations. But that wouldn't be a very interesting simulation of the processes of life. It certainly wouldn't be a model that would help biologists much.
So, between realistically simulating every atom and realistically simulating nothing, OpenWorm has had to make some tradeoffs. Larson thinks about it like this. Imagine a graph. Along the X-axis, you've got the level of biological realism baked into the simulation. Do its cells do what real cells do? Which parts of the cells do what their biological counterparts do? Do the neurons work like biological neurons? And along the Y-axis, you've got the behavioral realism. Does this thing wiggle like a real worm? Does it respond to chemicals like a real worm? Does it attempt to and succeed in reproducing?
The problem is, as Larson explains, "we don't know how far you have to go to the right on the X-axis to go [a certain amount] up on the Y-axis." They don't know what level of biological realism will get them to what level of behavioral realism.
And, buried in that question is a deeper one: When can we say, or scream, raising our twisted fingers to the sky as lightning flashes above, "It's alive!"?
For example, they are using a model of how neurons work called the Hodgkin-Huxley model, which garnered its creators a Nobel Prize. If they were to add more detailed simulations of the neurons, would that meaningfully add to the behavioral realism of the organism as a whole? Or can the principles of neuronal firing and propagation be abstracted from their biological embodiment without losing any behavioral fidelity?
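For readers curious what a Hodgkin-Huxley neuron looks like once it is written down, here is a minimal Python sketch. It assumes the classic textbook parameters fitted to the squid giant axon and a simple forward-Euler time step; none of the names or numbers below come from OpenWorm's code, and a worm neuron would presumably need different values.

```python
import math

# Textbook squid-axon parameters (an assumption; not taken from OpenWorm).
C_M = 1.0                              # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3      # maximum conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387  # reversal potentials, mV


def _ratio(x, y):
    """x / (1 - exp(-x / y)), using the limit y when x is close to zero."""
    if abs(x) < 1e-7:
        return y
    return x / (1.0 - math.exp(-x / y))


def rates(v):
    """Opening (a) and closing (b) rates, in 1/ms, for gates m, h and n."""
    a_m = 0.1 * _ratio(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _ratio(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n


def simulate(i_ext, t_max=50.0, dt=0.01):
    """Return the membrane voltage trace (mV) for a constant input current.

    i_ext is in uA/cm^2; t_max and dt are in milliseconds.
    """
    v = -65.0
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    # Start every gate at its steady-state value for the resting voltage.
    m = a_m / (a_m + b_m)
    h = a_h / (a_h + b_h)
    n = a_n / (a_n + b_n)

    trace = []
    for _ in range(int(round(t_max / dt))):
        a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
        i_na = G_NA * m**3 * h * (v - E_NA)   # sodium current
        i_k = G_K * n**4 * (v - E_K)          # potassium current
        i_l = G_L * (v - E_L)                 # leak current
        dv = (i_ext - i_na - i_k - i_l) / C_M
        # Forward-Euler update of the voltage and the three gating variables.
        v += dt * dv
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        trace.append(v)
    return trace
```

Calling `simulate(10.0)` drives the model cell with a steady current, while `simulate(0.0)` leaves it at rest. Every extra layer of biological detail would mean more equations of this kind per neuron, which is why the question of how much realism is enough matters so much.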
Making decisions about these tradeoffs forms the core of the project. All biological simulation projects to date have faced similar challenges. Take the now defunct Canadian project called (cue techno!) Project CyberCell.
Led by Michael Ellison of the University of Alberta, the team wanted to create a simple E. coli simulation. The molecules inside cells form fantastically complex structures that are constantly moving around and changing shape. Modeling all that takes enormous computational horsepower, and that's assuming you know exactly how each protein is going to fold. It was too much to attempt. So, instead, CyberCell represented each molecule as a sphere -- "Every ribosome, every lipid molecule, every metabolite," Ellison said -- of approximately the right size. Then, they simply assigned each sphere certain probabilities of reacting with other spheres. "If the right enzyme connects with the right small molecule, there was a certain probability that a chemical reaction may take place," he explained.
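The spheres-and-probabilities idea is simple enough to sketch in a few lines of Python. What follows is an illustration under stated assumptions, not CyberCell's actual code: the `Sphere` class, the `step` and `count` functions, the random-walk motion, and the box size are all hypothetical choices made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Sphere:
    kind: str      # "enzyme", "substrate" or "product" (hypothetical labels)
    x: float
    y: float
    z: float
    radius: float


def _clamp(value, low, high):
    return min(max(value, low), high)


def step(spheres, p_react, rng, box=100.0, jump=1.0):
    """Advance the toy cell by one tick.

    Each sphere takes a small random step inside a cubic box. Any substrate
    that then overlaps an enzyme turns into product with probability p_react.
    """
    for s in spheres:
        s.x = _clamp(s.x + rng.uniform(-jump, jump), 0.0, box)
        s.y = _clamp(s.y + rng.uniform(-jump, jump), 0.0, box)
        s.z = _clamp(s.z + rng.uniform(-jump, jump), 0.0, box)

    enzymes = [s for s in spheres if s.kind == "enzyme"]
    for s in spheres:
        if s.kind != "substrate":
            continue
        for e in enzymes:
            dist_sq = (s.x - e.x) ** 2 + (s.y - e.y) ** 2 + (s.z - e.z) ** 2
            touching = dist_sq <= (s.radius + e.radius) ** 2
            if touching and rng.random() < p_react:
                s.kind = "product"   # the reaction happened
                break


def count(spheres, kind):
    """Number of spheres currently labelled with the given kind."""
    return sum(1 for s in spheres if s.kind == kind)
```

A run would build a list of `Sphere` objects, create `rng = random.Random(0)` for repeatability, call `step` in a loop, and watch `count(spheres, "product")` rise. Nothing here knows anything about protein folding or real chemistry, which is exactly the tradeoff being described.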
Is that realistic? Not really. But it made it possible to start experimenting. "We still don't know enough about the living organisms," Ellison told me. "50 percent of E. coli is still a black box."
That figure might be even larger for C. elegans, but it's still the best characterized animal that researchers have got. It remains the only organism for which a complete connectome actually exists. Working in Nobel laureate Sydney Brenner's Laboratory of Molecular Biology in Cambridge during the 1970s, White and his team spent 13 years creating the wiring diagram. Electron microscopist Nichol Thomson cut the one-millimeter worms into 20,000 very thin slices, which -- because the worms are transparent -- he could then image with his microscope. "The thing that gave [Thomson] the biggest pleasure of all was to cut a long series of quality images," White told me.
Then, with White's direction, a technician named Eileen Southgate painstakingly labeled each nerve cell and connection in the micrographs. Through their work, they discovered C. elegans has 302 neurons that form approximately 10,000 connections. And Southgate traced each and every one. "I found out several years into her collaboration that as a hobby, she put huge jigsaw puzzles together," White recalled. "She has a wonderful visual memory." She began work at the lab when she was 16 years old and stayed until she retired.
The brain map was only one of several scientific feats accomplished with C. elegans. The worm was also the first multicellular organism to have its genome sequenced. And scientists precisely tracked its development from embryo to adulthood. There's even a database (WormBase) that contains more complete data about the organism's functioning at the molecular level than one could find for any other animal. Dozens of labs work with this little species.
Brenner handpicked the organism precisely for its amenability to study, calling the worm "nature's gift to science." University of Kansas worm biologist Brian Ackley likes to joke that Brenner created C. elegans in a lab "because he was tired of working on things that didn't have perfect biological criteria." The worms are tiny and transparent, reproduce quickly, have a small number of neurons, and each body is composed of exactly 959 total cells.
"Brenner planned to use the worm to discover how genes made bodies and then behavior," wrote Andrew Brown in a book on C. elegans. "And this was in 1965, before anyone had found and analysed a single gene for anything." It is only today, in 2013, that his disciples' disciples are beginning to fulfill that original vision.
In a 1974 paper quoted in the talk he gave accepting the Nobel Prize for Medicine, Brenner put it like this: "Behavior is the result of a complex ill-understood set of computations performed by nervous systems and it seems essential to decompose the question into two, one concerned with the question of the genetic specification of nervous systems and the other with the way nervous systems work to produce behaviour." In other words, how do genes build brains and how do brains direct bodies?
Now, finally, OpenWorm may be able to integrate the strains of research that began with Brenner into one simulation that, as it wiggles along in its digital petri dish, might be the first realistic virtual animal, a boon to research, and a Kurzweilian foreshadowing of the challenges humans face when we begin running life on silicon chips.
I asked several researchers whether simulating the worm was possible. "It's really a difficult thing to say whether it's possible," said Steven Cook, a graduate student at Yale who has worked on C. elegans connectomics. But, he admitted, "I'm optimistic that if we're starting with 302 neurons and 10,000 synapses we'll be able to understand its behavior from a modeling perspective." And, in any case, "If we can't model a worm, I don't know how we can model a human, monkey, or cat brain."
Ellison echoed that thought. "They stand a much better chance of success than the people working on mammalian brains," he said. White, who led the creation of the worm connectome, said OpenWorm "seemed appropriate really" as a way of integrating all the data that biologists were producing. And the Kansas worm scientist Ackley figured that even if OpenWorm didn't work, something like it would. "C. elegans is probably going to be the first or very close to the first [multicellular organism] to be simulated," he said.
David Dalrymple, an MIT graduate student who has contributed to OpenWorm and is working on a worm brain modeling project of his own, pointed out what he sees as a limitation to the effort. OpenWorm has incorporated a lot of anatomical data -- the structures of the worm's nervous system and musculature -- described by scientists like White. But these studies were carried out with dead worms. They can't tell scientists about the relative importance of connections between neurons within the worm's neural system, only that a connection exists. Very little data from living animals' cells exist in the published literature, and it may be required to develop a good simulation.
"I believe that an accurate model requires a great deal of functional data that has not yet been collected, because it requires a kind of experiment that has only become feasible in the last year or two," Dalrymple told me in an email. His own research is to build an automated experimental apparatus that can gather up that functional data, which can then be fed into these models. "We're coming at the problem from different directions," he said. "Hopefully, at some point in the future, we'll meet in the middle and save each other a couple years of extra work to complete the story."