The Convergence of Information Technology and the Life Sciences


We are at an incredible inflection point in the traditionally distinct fields of information technology and the life sciences. These broad disciplines are converging to advance fields of study like computational biology and systems biology, resulting in new forms of therapeutics and diagnostics that could have a monumentally positive impact on human health.

At DFJ Venture Capital, we are seeing innovations in multi-modal data generation, computational algorithms, and robotic automation, enabling companies to take on the audacious goal of curing illness and disease for everyone forever.

The gene is the atomic unit of biology: it encodes the information that gives us life. No kidding, it’s important. In just the last several years, there’s been an explosion of genomics data, driven by a per-genome cost of sequencing that has fallen faster than even Moore’s Law would have predicted.

The first human genome cost $3 billion to sequence; today there are companies that can sequence a whole genome for less than $1,000. The pace of genomics innovation is unprecedented, and there are no indications that it will slow down anytime soon. It’s a big data world in the human genome.

Cost of Sequencing a Human Genome Since 2001 (Source: NIH)
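To put those numbers side by side with Moore’s Law, here is a rough back-of-the-envelope sketch. The $3 billion and $1,000 figures come from the text above; the 15-year window and the two-year doubling period are assumptions chosen for illustration, not values read off the NIH chart.

```python
import math

FIRST_GENOME_COST = 3_000_000_000  # dollars, figure quoted in the article
CURRENT_GENOME_COST = 1_000        # dollars, figure quoted in the article
YEARS_ELAPSED = 15                 # assumption: illustrative time window
MOORE_DOUBLING_YEARS = 2           # assumption: one common statement of Moore's Law


def halvings(start_cost, end_cost):
    """Number of times the cost must halve to go from start_cost to end_cost."""
    return math.log2(start_cost / end_cost)


def moore_pace_cost(start_cost, years, doubling_years):
    """Cost after `years` if it only halved once every `doubling_years`."""
    return start_cost / 2 ** (years / doubling_years)


observed_halvings = halvings(FIRST_GENOME_COST, CURRENT_GENOME_COST)
moore_halvings = YEARS_ELAPSED / MOORE_DOUBLING_YEARS
moore_cost = moore_pace_cost(FIRST_GENOME_COST, YEARS_ELAPSED, MOORE_DOUBLING_YEARS)

print(f"Observed cost halvings:           {observed_halvings:.1f}")
print(f"Halvings at a Moore's Law pace:   {moore_halvings:.1f}")
print(f"Cost today at a Moore's Law pace: ${moore_cost:,.0f}")
```

Under these assumptions, the cost has halved about 21.5 times, against 7.5 halvings at a Moore’s Law pace. At that slower pace a genome would still cost roughly $16.6 million rather than under $1,000.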

And really, the genome is just the beginning. We have been able to expand the breadth of data from genome sequences to extensive transcriptomic, methylomic, and metabolomic data.

With this fusion of data inputs, we may be able to draw new correlations between genomic variation and human phenotypes, providing new insights into the drivers of human health at the cellular and molecular levels, and thereby increasing our understanding of metabolic diseases, psychiatric diseases, and cancer, to give us a better shot at living forever.

Deep Learning

All of this data is wonderful, but big data by itself is a big headache. We need to have a way to make sense of it. Not only are technologies for generating genomic data improving, but new computational paradigms developed from other application areas are now able to generate actionable insights from raw data inputs, including genome sequences.

Specifically, I’m talking about deep learning, which is a subset of machine learning that has improved the state-of-the-art in computer vision, natural language understanding, speech recognition, and genomics. It works with almost everything!

Deep learning allows computational models that are composed of multiple processing layers, each made up of many simple units (artificial neurons), to learn representations of data with many levels of abstraction. It has turned out to be very good at discovering intricate structures in high-dimensional data. What that means for non-experts is that your raw genomic data goes in and wonderful insights come out the other end without you having to learn any of the math in between. I’m being a bit idealistic here, but we’ll get there eventually.
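For readers who do want a peek at the math in between, here is a minimal sketch of what stacked processing layers look like. It assumes a toy one-hot encoding of a short DNA string and random, untrained weights; the helper names (`one_hot`, `make_layer`, `forward`) and the layer sizes are hypothetical and chosen only for illustration. A real model would learn its weights from data.

```python
import random


def one_hot(sequence):
    """Encode a DNA string as a flat list of 0/1 values, four per base."""
    table = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0], "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
    return [bit for base in sequence for bit in table[base]]


def make_layer(n_in, n_out, rng):
    """Create one layer: a row of random weights per output unit, plus zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """Weighted sum of the inputs for each unit in the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def relu(values):
    """Simple nonlinearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Pass the input through each layer in turn, keeping every intermediate output."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(relu(dense(activations[-1], weights, biases)))
    return activations


rng = random.Random(0)
x = one_hot("ACGTAC")  # 6 bases -> 24 input values
layers = [make_layer(24, 8, rng), make_layer(8, 4, rng), make_layer(4, 2, rng)]
activations = forward(x, layers)

for depth, values in enumerate(activations):
    print(f"layer {depth}: {len(values)} values")
```

Each layer turns the previous layer’s output into a smaller, more abstract representation (24 values, then 8, then 4, then 2). Training, which this sketch leaves out, is the process of adjusting the weights so that those representations become useful.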

At DFJ, we have many application-level investments in deep learning, but Nervana Systems, a San Diego-based company, took on the ambitious task of building a full-stack deep learning platform on top of their own proprietary hardware to enable artificial intelligence-driven innovation across all verticals.

Nervana’s innovation on the software stack alone showed significant performance improvement over other open source frameworks like Caffe and Torch. Their proprietary hardware would take them beyond that standard by orders of magnitude.

Given the incredible results Nervana was producing, Intel (NASDAQ: INTC) acquired Nervana for over $400 million less than three years after its founding. It was the largest and fastest exit in deep learning since Google acquired DeepMind.

There is more money to be made in deep learning. The number of applications is huge: healthcare, finance, agriculture, and automotive, to name just a few. We are just on the cusp of using deep learning to disrupt the ways we build models to drive actionable insights from data.


Data is important, computational models are important, but in biotechnology we have to leave the world of bits and venture into the scary world of atoms (i.e., the real world). Capital intensity and labor costs driven by the need for highly specialized talent have beleaguered the life sciences for decades. You would have PhD-level folks move small vials of liquid around in a lab for hours! Now we’ve developed new technologies that automate some of the mundane, repetitive tasks lab bench scientists have had to endure, and that make those scientists more efficient.

Companies like Menlo Park, CA-based Transcriptic and South San Francisco’s Emerald Therapeutics have built remote life science laboratories that make it possible to automate cell and molecular biology workflows from a simple web interface. Someone could soon start a biotechnology company from a coffee shop with just a laptop and an Internet connection! The Amazon Web Services of life sciences is already being built.

Another DFJ portfolio company, Emeryville-based Zymergen, built this type of automation system in-house and created an entirely new approach to building better microbes and understanding the makeup of the natural world. Microbes are tiny chemical factories that produce a ton of the products you use and consume on a daily basis. You could fit millions of these bugs into a space the size of a needle eye, but just one microbe could make 2,000 distinct molecules pretty easily! Your food, your beer, and your medicines have likely all had some contribution from these tiny chemical factories.

Zymergen is the epitome of this incredible convergence in IT and biology. They design microbial strains using computer-aided design tools that create instructions for thousands of strains; build the strains using automated high-throughput processes to edit genomes and produce thousands of them per week; test and analyze raw data streams from the strains and use algorithms to find the best ones; and learn by using machine learning and deep learning techniques with a library of microbial models that are optimized for various end products. In Silicon Valley speak, they are crushing it. The innovation processes they’ve developed help us automate part of, and optimize all of, the scientific discovery process. They are already leading to new types of nutrients and, in the future, could lead to more precise therapies.
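The design, build, test, and learn cycle described above can be sketched as a simple loop. This is a toy simulation under loud assumptions, not Zymergen’s actual method: a "strain" is just a list of three numbers, `build_and_test` is a hypothetical stand-in for the wet lab that returns a noisy made-up yield, and the "learn" step is plain greedy selection standing in for the machine learning models mentioned in the text.

```python
import random


def design(parent, n_variants, rng, step=0.1):
    """Design: propose new strains by randomly perturbing the current best one."""
    return [[gene + rng.gauss(0, step) for gene in parent] for _ in range(n_variants)]


def build_and_test(strain, rng):
    """Build and test: hypothetical stand-in for the lab.

    The made-up yield is highest when every parameter equals 0.5,
    and each measurement carries a little random noise.
    """
    true_yield = -sum((gene - 0.5) ** 2 for gene in strain)
    return true_yield + rng.gauss(0, 0.01)


def learn(results):
    """Learn: here, simply pick the best-scoring (strain, score) pair."""
    return max(results, key=lambda pair: pair[1])


def run_cycles(n_cycles=20, n_variants=50, seed=0):
    """Repeat the design-build-test-learn loop and record the best score so far."""
    rng = random.Random(seed)
    best_strain = [0.0, 0.0, 0.0]
    best_score = build_and_test(best_strain, rng)
    history = [best_score]
    for _ in range(n_cycles):
        candidates = design(best_strain, n_variants, rng)
        results = [(strain, build_and_test(strain, rng)) for strain in candidates]
        strain, score = learn(results)
        if score > best_score:
            best_strain, best_score = strain, score
        history.append(best_score)
    return best_strain, history


best_strain, history = run_cycles()
print(f"starting score: {history[0]:.3f}")
print(f"final score:    {history[-1]:.3f}")
```

The point of the sketch is the shape of the loop: each round’s test results decide what gets designed next, so the best measured score can only hold steady or improve from cycle to cycle.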

Systems Biology

Biology is complex, and the biological phenomena present in our bodies can no longer be characterized and analyzed by point solutions that give us only a myopic view of the body. Systems biology, a multidisciplinary field that applies computational methods to new types of data to solve complex biological problems, is exactly what the future will hold and what we see this convergence enabling.

Analyzing the body as a network of networks, from organs to cells to molecules to genes, will enable us to advance the state of precision medicine and healthcare. It’s an exciting time for us to be alive, because we are just at the beginning of what we can accomplish. With biotechnology accelerating faster than Moore’s Law, the compounding nature of technological progress holds enormous promise for curing illness and disease for everyone forever.

Mohammad Islam is a senior associate at DFJ, where he focuses on frontier technologies in machine intelligence, biotechnology, healthcare, and next-generation infrastructure. Follow @mohammadiislam