Here are updated cost and productivity curves for DNA sequencing and synthesis. Reading and writing DNA is becoming ever cheaper and easier. The Economist and others call these "Carlson Curves", a name I am ambivalent about but have come to accept if only for the good advertising. I've been meaning to post updates for a few weeks; the appearance today of an opinion piece at Wired about Moore's Law serves as a catalyst to launch them into the world. In particular, two points need some attention, the notions that Moore's Law 1) is unplanned and unpredictable, and 2) somehow represents the maximum pace of technological innovation.
DNA Sequencing Productivity is Skyrocketing
First up: the productivity curve. Readers new to these metrics might want to have a look at my first paper on the subject, "The Pace and Proliferation of Biological Technologies" (PDF) from 2003, which describes why I chose to compare the productivity enabled by commercially available sequencing and synthesis instruments to Moore's Law. (Briefly, Moore's Law is a proxy for productivity; more transistors putatively means more stuff gets done.) You have to choose some sort of metric when making comparisons across such widely different technologies, and, however much I hunt around for something better, productivity always emerges at the top.
It's been a few years since I updated this chart. The primary reason for the delay is that, with the profusion of different sequencing platforms, it became somewhat difficult to compare productivity [bases/person/day] across platforms. Fortunately, a number of papers have come out recently that either directly make that calculation or provide enough information for me to make an estimate. (I will publish a full bibliography in a paper later this year. For now, this blog post serves as the primary citation for the figure below.)
Visual inspection reveals a number of interesting things. First, the DNA synthesis productivity line stops in about 2008 because there have been no new instruments released publicly since then. New synthesis and assembly technologies are under development by at least two firms, which have announced they will run centralized foundries and not sell instruments. More on this later.
Second, it is clear that DNA sequencing platforms are improving very rapidly, now much faster than Moore's Law. This is interesting in itself, but I point it out here because of the post today at Wired by Pixar co-founder Alvy Ray Smith, "How Pixar Used Moore's Law to Predict the Future". Smith suggests that "Moore's Law reflects the top rate at which humans can innovate. If we could proceed faster, we would," and that "Hardly anyone can see across even the next crank of the Moore's Law clock."
Moore's Law is a Business Model and is All About Planning -- Theirs and Yours
As I have written previously, early on at Intel it was recognized that Moore's Law is a business model (see the Pace and Proliferation paper, my book, and a previous post, "The Origin of Moore's Law"). Moore's Law was always about economics and planning in a multi-billion dollar industry. When I started writing about all this in 2000, a new chip fab cost about $1 billion. Now, according to The Economist, Intel estimates a new chip fab costs about $10 billion. (There is probably another Law to be named here, something about exponential increases in cost of semiconductor processing as an inverse function of feature size. Update: This turns out to be Rock's Law.) Nobody spends $10 billion without a great deal of planning, and in particular nobody borrows that much from banks or other financial institutions without demonstrating a long-term plan to pay off the loan. Moreover, Intel has had to coordinate the manufacturing and delivery of very expensive, very complex semiconductor processing instruments made by other companies. Thus Intel's planning cycle explicitly extends many years into the future; the company sees not just the next crank of the Moore's Law clock, but several cranks. New technology has certainly been required to achieve these planning goals, but that is just part of the research, development, and design process for Intel. What is clear from comments by Carver Mead and others is that even if the path was unclear at times, the industry was confident that it could get to the next crank of the clock.
Moore's Law served a second purpose for Intel, one that is less well recognized but arguably more important: Moore's Law was a pace selected to enable Intel to win. That is why Andy Grove ran around Intel pushing for financial scale (see "The Origin of Moore's Law"). I have more historical work to do here, but it is pretty clear that Intel successfully organized an entire industry to move at a pace only it could survive. And only Intel did survive. Yes, there are competitors in specialty chips and in memory or GPUs, but as far as high volume, general CPUs go, Intel is the last man standing. Finally, and alas I don't have a source anywhere for this other than hearsay, Intel could in fact have gone faster than Moore's Law. Here is the hearsay: Gordon Moore told Danny Hillis who told me that Intel could have gone faster. (If anybody has a better source for that particular point, give me a yell on Twitter.) The inescapable conclusion from all this is that the management of Intel made a very careful calculation. They evaluated product roll-outs to consumers, the rate of new product adoption, the rate of semiconductor processing improvements, and the financial requirements for building the next chip fab line, and then set a pace that nobody else could match but that left Intel plenty of headroom for future products. It was all about planning.
The reason I bother to point all this out is that Pixar was able to use Moore's Law to "predict the future" precisely because Intel meticulously planned that future. (Calling Alan Kay: "The best way to predict the future is to invent it.") Which brings us back to biology. Whereas Moore's Law is all about Intel and photolithography, the reason that productivity in DNA sequencing is going through the roof is competition among not just companies but among technologies. And we are only just getting started. As Smith writes in his Wired piece, Moore's Law tells you that "Everything good about computers gets an order of magnitude better every five years." Which is great: it enabled other industries and companies to plan in the same way Pixar did. But Moore's Law doesn't tell you anything about any other technology, because Moore's Law was about building a monopoly atop an extremely narrow technology base. In contrast, there are many different DNA sequencing technologies emerging because many different entrepreneurs and companies are inventing the future.
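For readers who want to connect Smith's "order of magnitude better every five years" to the more familiar doubling-time statement of Moore's Law, here is a minimal Python sketch of the conversion. It assumes smooth exponential improvement; the function names are illustrative and not taken from any of the sources cited here.

```python
import math


def doubling_time_years(factor: float, period_years: float) -> float:
    """Years needed to double, assuming smooth exponential growth that
    multiplies the quantity by `factor` every `period_years` years."""
    return period_years * math.log(2) / math.log(factor)


def annual_growth_rate(factor: float, period_years: float) -> float:
    """Fractional improvement per year under the same assumption."""
    return factor ** (1.0 / period_years) - 1.0


# Smith's figure: 10x every 5 years.
dt = doubling_time_years(10, 5)
print(f"doubling time: {dt:.2f} years ({dt * 12:.0f} months)")
print(f"annual improvement: {annual_growth_rate(10, 5):.0%}")
```

Ten-fold improvement every five years works out to a doubling time of roughly 18 months. The same two functions can be pointed at any technology for which a factor and a period have been read off a curve.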
The first consequence of all this competition and invention is that it makes my job of predicting the future very difficult. This emphasizes the difference between Moore's Law and Carlson Curves (it still feels so weird to write my own name like that): whereas Intel and the semiconductor industry were meeting planning goals, I am simply keeping track of data. There is no real industry-wide planning in DNA synthesis or sequencing, other than a race to get to the "$1000 genome" before the next guy. (Yes, there is a vague road-mappy thing promoted by the NIH that accompanied some of its grant programs, but there is little if any coordination because there is intense competition.)
Biological Technologies are Hard to Predict in Part Because They Are Cheaper than Chips
Compared to other industries, the barrier to entry in biological technologies is pretty low. Unlike chip fabs, there is nothing in biology that costs $10 billion commercially, nor even $1 billion. (I have come to mostly disbelieve pharma industry claims that developing drugs is actually that expensive, but that is another story for another time.) The Boeing 787 reportedly cost $32 billion to develop as of 2011, and that is on top of a century of multi-billion dollar aviation projects that had to come before the 787.
There are two kinds of costs that are important to distinguish here. The first is the cost of developing and commercializing a particular product. Based on the money reportedly raised and spent by Life, Illumina, Ion Torrent (before acquisition), Pacific Biosciences, Complete Genomics (before acquisition), and others, it looks like developing and marketing second-generation sequencing technology can cost upwards of about $100 million. Even more money gets spent, and lost, in operations before anybody is in the black. My intuition says that the development costs are probably falling as sequencing starts to rely more on other technology bases, for example semiconductor processing and sensor technology, but I don't know of any real data. I would also guess that nanopore sequencing, should it actually become a commercial product this year, will have cost less to develop than other technologies, but, again, that is my intuition based on my time in clean rooms and at the wet bench. I don't think there is great information yet here, so I will suspend discussion for the time being.
The second kind of cost to keep in mind is the use of new technologies to get something done. Which brings in the cost curve. Again, the forthcoming paper will contain appropriate references.
The cost per base of DNA sequencing has clearly plummeted lately. I don't think there is much to be made of the apparent slow-down in the last couple of years. The NIH version of this plot has more fine grained data, and it also directly compares the cost of sequencing with the cost per megabyte for memory, another form of Moore's Law. Both my productivity plot above and the NIH plot show that sequencing has at times improved much faster than Moore's Law, and generally no slower.
If you ponder the various wiggles, it may be true that the fall in sequencing cost is returning to a slower pace after a period in which new technologies dramatically changed the market. Time will tell. (The wiggles certainly make prediction difficult.) One feature of the rapid fall in sequencing costs is that it makes the slow-down in synthesis look smaller; see this earlier post for different scale plots and a discussion of the evaporating maximum profit margin for long, double-stranded synthetic DNA (the difference between the orange and yellow lines above).
Whereas competition among companies and technologies is driving down sequencing costs, the lack of competition among synthesis companies has contributed to a stagnation in price decreases. I've covered this in previous posts (and in this Nature Biotech article), but it boils down to the fact that synthetic DNA has become a commodity produced using relatively old technology.
Where Are We Headed?
Now, after concluding that the structure of the industry makes it hard to prognosticate, I must of course prognosticate. In DNA sequencing, all hell is breaking loose, and that is great for the user. Whether instrument developers thrive is another matter entirely. As usual with start-ups and disruptive technologies, surviving first contact with the market is all about execution. I'll have an additional post soon on how DNA sequencing performance has changed over the years, and what the launch of nanopore sequencing might mean.
DNA synthesis may also see some change soon. The industry as it exists today is based on chemistry that is several decades old. The common implementation of that chemistry has heretofore set a floor on the cost of short and long synthetic DNA, and in particular the cost of synthetic genes. However, at least two companies are claiming to have technology that facilitates busting through that cost floor by enabling the use of smaller amounts of poorer quality, and thus less expensive, synthetic DNA to build synthetic genes and chromosomes.
Gen9 is already on the market with synthetic genes selling for something like $0.07 per base. I am not aware of published cost estimates for production, other than the CEO claiming it will soon drop by orders of magnitude. Cambrian Genomics has a related technology and its CEO suggests costs will immediately fall by 5 orders of magnitude. Of course, neither company is likely to drop prices so far at the beginning, but rather will set prices to undercut existing companies and grab market share. Assuming Gen9 and Cambrian don't collude on pricing, and assuming the technologies work as they expect, the existence of competition should lead to substantially lower prices on genes and chromosomes within the year. We will have to see how things actually work in the market. Finally, Synthetic Genomics has announced it will collaborate with IDT to sell synthetic genes, but as far as I am aware nothing new is actually shipping yet, nor have they announced pricing.
So, supposedly we are soon going to have lots more, lots cheaper DNA. But you have to ask yourself who is going to use all this DNA, and for what. The important business point here is that both Gen9 and Cambrian Genomics are working on the hypothesis that demand will increase markedly (by orders of magnitude) as the price falls. Yet nobody can design a synthetic genetic circuit with more than a handful of components at the moment, which is something of a bottleneck on demand. Another option is that customers will do less up-front predictive design and instead do more screening of variants. This is how Amyris works -- despite their other difficulties, Amyris does have a truly impressive metabolic screening operation -- and there are several start-ups planning to provide similar (or even improved) high-throughput screening services for libraries of metabolic pathways. I infer this is the strategy at Synthetic Genomics as well. This all may work out well for both customers and DNA synthesis providers. Again, I think people are working on an implicit hypothesis of radically increased demand, and it would be better to make the hypothesis explicit in part to identify the risk of getting it wrong. As Naveen Jain says, successful entrepreneurs are good at eliminating risk, and I worry a bit that the new DNA synthesis companies are not paying enough attention to this point.
There are relatively simple scaling calculations that will determine the health of the industry. Intel knew that it could grow financially in the context of exponentially falling transistor costs by shipping exponentially more transistors every quarter -- that is the business model of Moore's Law. Customers and developers could plan product capabilities, just as Pixar did, knowing that Moore's Law was likely to hold for years to come. But that was in the context of an effective pricing monopoly. The question for synthetic gene companies is whether the market will grow fast enough to provide adequate revenues when prices fall due to competition. To keep revenues up, they will then have to ship lots of bases, probably orders of magnitude more bases. If prices don't fall, then something screwy is happening. If prices do fall, they are likely to fall quickly as companies battle for market share. It seems like another inevitable race to the bottom. Probably good for the consumer; probably bad for the producer.
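The scaling argument above can be made concrete with a small Python sketch. Revenue is just price per base times bases shipped, so any fall in price has to be matched by a proportional rise in volume merely to stand still. The starting price of $0.07 per base is the Gen9 figure quoted earlier; the 100-fold price drop and the constant-elasticity demand model are assumptions chosen purely for illustration, not data or forecasts.

```python
def required_volume_multiplier(old_price: float, new_price: float,
                               revenue_multiplier: float = 1.0) -> float:
    """How many times more bases must ship to reach `revenue_multiplier`
    times the old revenue after the price per base changes."""
    return revenue_multiplier * old_price / new_price


def revenue_multiplier_with_elasticity(price_multiplier: float,
                                       elasticity: float) -> float:
    """Revenue change under an ASSUMED constant-elasticity demand curve,
    where volume scales as price ** (-elasticity)."""
    volume_multiplier = price_multiplier ** (-elasticity)
    return price_multiplier * volume_multiplier


old_price = 0.07             # dollars per base, the Gen9 figure quoted above
new_price = old_price / 100  # hypothetical two-orders-of-magnitude price drop

print("volume needed for flat revenue: "
      f"{required_volume_multiplier(old_price, new_price):.0f}x")

# Hypothetical elasticities: below 1 revenue shrinks, above 1 it grows.
for elasticity in (0.5, 1.0, 1.5):
    r = revenue_multiplier_with_elasticity(new_price / old_price, elasticity)
    print(f"elasticity {elasticity}: revenue changes by a factor of {r:.2f}")
```

Under this toy model, revenue grows only if demand is elastic enough (elasticity above 1) that volume rises faster than price falls. That is one way of writing down the implicit demand hypothesis explicitly.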
(Updated) Ultimately, for a new wave of DNA synthesis companies to be successful, they have to provide the customer something of value. I suspect there will be plenty of academic customers for cheaper genes. However, I am not so sure about commercial uptake. Here's why: DNA is always going to be a small cost of developing a product, and it isn't obvious that making that small cost even cheaper helps your average corporate lab.
In general, the R part of R&D only accounts for 1-10% of the cost of the final product. The vast majority of development costs are in polishing up the product into something customers will actually buy. If those costs are in the neighborhood of $50-100 million, then reducing the cost of synthetic DNA from $50,000 to $500 is nice, but the corporate scientist-customer is more worried about knocking a factor of two, or an order of magnitude, off the $50 million. This means that in order to make a big impact (and presumably to increase demand adequately) radically cheaper DNA must be coupled to innovations that reduce the rest of the product development costs. As suggested above, forward design of complex circuits is not going to be adequate innovation any time soon. The way out here may be high-throughput screening operations that enable testing many variant pathways simultaneously. But note that this is not just another hypothesis about how the immediate future of engineering biology will change, but another unacknowledged hypothesis. It might turn out to be wrong.
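To put numbers on the comparison above, here is a short Python sketch using only the figures from the preceding paragraph: a $50 million development budget, and synthetic DNA falling from $50,000 to $500. The function name is illustrative.

```python
def savings_fraction(total_cost: float, item_before: float,
                     item_after: float) -> float:
    """Share of the total budget saved when one line item gets cheaper."""
    return (item_before - item_after) / total_cost


development_cost = 50_000_000  # low end of the $50-100 million range above
dna_before = 50_000            # synthetic DNA cost before the price drop
dna_after = 500                # synthetic DNA cost after the price drop

dna_savings = dna_before - dna_after
share = savings_fraction(development_cost, dna_before, dna_after)
halving_savings = development_cost / 2  # "knocking a factor of two" off

print(f"DNA savings: ${dna_savings:,} ({share:.3%} of the budget)")
print(f"Halving development cost saves ${halving_savings:,.0f}, "
      f"about {halving_savings / dna_savings:.0f} times more")
```

A hundred-fold drop in DNA cost trims about a tenth of a percent from the budget, while halving the rest of development saves roughly 500 times as much, which is why cheaper DNA alone is unlikely to move the corporate customer.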
The upshot, just as I wrote in 2003, is that the market dynamics of biological technologies will remain difficult to predict precisely because of the diversity of technology and the difficulty of the tasks at hand. We can plan on prices going down; how much, I wouldn't want to predict.