The Arrival of Nanopore Sequencing

(Update 1 March: Thanks to the anonymous commenter who pointed out the throughput estimates for existing instruments were too low.)

You may have heard a little bit of noise about nanopore sequencing in recent weeks.  After many years of development, Oxford Nanopore promises that by the end of the year we will be able to read DNA sequences by threading them through the eye of a very small needle.

How It Works: Directly Reading DNA

The basic idea is not new: as a long string of DNA passes through a small hole, its components -- the bases A, T, G, and C -- plug that hole to varying degrees.  As they pass through the hole, in this case an engineered pore protein derived from one found in nature, each base has slightly different interactions with the walls of the pore.  As a result, while passing through the pore each base lets different numbers of salt ions through, which allows one to distinguish between the bases by measuring changes in electrical current.  Because this method is a direct physical interrogation of the chemical structure of each base, it is in principle much, much faster than any of the indirect sequencing technologies that have come before.

There have been a variety of hurdles to clear to get nanopore sequencing working.  First, you have to use a pore that is small enough to produce measurable changes in current.  Next, the speed of the DNA must be carefully controlled so that the signal-to-noise ratio is high enough.  The pore must also sit in an insulating membrane of some sort, surrounded by the necessary electrical circuitry, and to become a useful product the whole thing must be easily assembled in an industrial manner and be mechanically stable through shipping and use.

Oxford Nanopore claims to have solved all those problems.  They recently showed off a disposable version of their technology -- called the MinIon -- containing 512 pores built into a USB stick.  This puts to shame the Lava Amp, my own experiment with building a USB peripheral for molecular biology.  Here is one part I find extremely impressive -- so impressive it is almost hard to believe: Oxford claims they have reduced the sample handling to a single (?) pipetting step.  Clive Brown, Oxford CTO, says "Your fluidics is a Gilson."  (A "Gilson" would be a brand of pipetter.)  That would be quite something.

I've spent a good deal of my career trying to develop simple ways of putting biological samples into microfluidic doo-dads of one kind or another.  It's never trivial, it's usually a pain in the ass, and sometimes it's a showstopper.  Blood, in particular, is very hard to work with.  If Oxford has made this part of the operation simple, then they have a winning technology just based on everyday ease of use -- what sometimes goes by the labels of "user experience" or "human factors".  Compared to the complexity of many other laboratory protocols, it would be like switching from MS-DOS to OS X in one step.

How Well Does it Work?

The challenge for fast sequencing is to combine throughput (bases per hour) with read length (the number of contiguous bases read in one go).  Existing instruments have throughputs in the range of 10-55,000 megabases/day and read lengths from tens of bases to about 800 bases.  (See chart below.)  Nick Loman reports that using the MinIon, Oxford has already run DNA of 5000 to 100,000 bases (5 kb to 100 kb) at speeds of 120-1000 bases per minute per pore, though accuracy suffers above 500 bases per minute.  So a single USB stick can easily run at 150 megabases (Mb) per hour, which basically means you can sequence full-length eukaryotic chromosomes in about an hour.  Over the next year or so, Oxford will release the GridIon instrument that will have 4 and then 16 times as many pores.  Presumably that means it will be 16 times as fast.  The long read lengths mean that processing the resulting sequence data, which usually takes longer than the actual sequencing itself, will be much, much faster.
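For anyone who wants to play with the numbers, here is a minimal back-of-envelope sketch in Python.  It uses only the pore count and per-pore speeds quoted above and assumes every pore reads continuously, which is my simplification and not anything Oxford has stated.  The function name is made up for illustration.  The 150 megabase per hour figure presumably rests on inputs beyond these per-pore rates, so the sketch does not try to reproduce it.

```python
# Back-of-envelope aggregate throughput from the figures quoted in the text.
# Assumption (not from Oxford): every pore reads continuously, with no idle time.

PORES_PER_STICK = 512   # pores on one MinIon, as quoted
QUOTED_RATES = (120, 500, 1000)  # bases/min/pore: low end, accuracy limit, high end


def megabases_per_hour(pores, bases_per_min_per_pore):
    """Return aggregate throughput in megabases per hour."""
    return pores * bases_per_min_per_pore * 60 / 1e6


for rate in QUOTED_RATES:
    throughput = megabases_per_hour(PORES_PER_STICK, rate)
    print(f"{rate:>5} bases/min/pore -> {throughput:.1f} megabases/hour")
```

Multiplying the pore count by 4 or 16 scales the result linearly, which is the reasoning behind the guess about the GridIon.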

This is so far beyond existing commercial instruments that it sounds like magic.  Writing in Forbes, Matthew Herper quotes Jonathan Rothberg, of sequencing competitor Ion Torrent, as saying "With no data release how do you know this is not cold fusion? ... I don't believe it."  Oxford CTO Clive Brown responded to Rothberg in the comments to Herper's post in a very reasonable fashion -- have a look.

Of course I want to see data as much as the next fellow, and I will have to hold one of those USB sequencers in my own hands before I truly believe it.  Rothberg would probably complain that I have already put Oxford on the "performance tradeoffs" chart before they've shipped any instruments.  But given what I know about building instruments, I think immediately putting Oxford in the same bin as cold fusion is unnecessary.

Below is a performance comparison of sequencing instruments originally published by Bio-era in Genome Synthesis and Design Futures in 2007.  I've hacked it up to include the approximate performance range of 2nd generation sequencers from Life, Illumina, etc, as well as for a single MinIon.  That's one USB stick, with what we're told is a few minutes' worth of sample prep.  How many can you run at once?  Notice the scale on the x-axis, and the units on the y-axis.  If it works as promised, the MinIon is so vastly better than existing machines that the comparison is hard to make.  If I replotted that data with a log axis along the bottom then all the other technologies would be cramped up together way off to the left. (The data comes from my 2003 paper, The Pace and Proliferation of Biological Technologies (PDF), and from Service, 2006, The Race for the $1000 Genome).
 
[Figure: Carlson_sequencer_performanc_2012.png]

The Broader Impact

Later this week I will try to add the new technologies to the productivity curve published in the 2003 paper.  Here's what it will show: biological technologies are improving at an exceptional pace, leaving Moore's Law behind.  This is no surprise, because while biology is getting cheaper and faster, the density of transistors on chips is set by very long term trends in finance and by SEMATECH; designing and fabricating new semiconductors is crazy expensive and requires coordination across an entire industry. (See The Origin of Moore's Law and What it May (Not) Teach Us About Biological Technologies.)  In fact, we should expect biology to move much faster than semiconductors.

Here are a few paragraphs from the 2003 paper:

...The long term distribution and development of biological technology is likely to be largely unconstrained by economic considerations. While Moore's Law is a forecast based on understandable large capital costs and projected improvements in existing technologies, which to a great extent determined its remarkably constant behavior, current progress in biology is exemplified by successive shifts to new technologies. These technologies share the common scientific inheritance of molecular biology, but in general their implementations as tools emerge independently and have independent scientific and economic impacts. For example, the advent of gene expression chips spawned a new industrial segment with significant market value. Recombinant DNA, gel and capillary sequencing, and monoclonal antibodies have produced similar results. And while the cost of chip fabs has reached upwards of one billion dollars per facility and is expected to increase [2012 update: it's now north of $6 billion], there is good reason to expect that the cost of biological manufacturing and sequencing will only decrease. [Update 2012: See "New Cost Curves" for DNA synthesis and sequencing.]

These trends--successive shifts to new technologies and increased capability at decreased cost--are likely to continue. In the fifteen years that commercial sequencers have been available, the technology has progressed ... from labor intensive gel slab based instruments, through highly automated capillary electrophoresis based machines, to the partially enzymatic Pyrosequencing process. These techniques are based on chemical analysis of many copies of a given sequence. New technologies under development are aimed at directly reading one copy at a time by directly measuring physical properties of molecules, with a goal of rapidly reading genomes of individual cells.  While physically-based sequencing techniques have historically faced technical difficulties inherent in working with individual molecules, an expanding variety of measurement techniques applied to biological systems will likely yield methods capable of rapid direct sequencing.

Cue nanopore sequencing. 

A few months ago I tweeted that I had seen single strand DNA sequence data generated using a nanopore -- it wasn't from Oxford. (Drat, can't find the tweet now.)  I am certain there are other labs out there making similar progress.  On the commercial front, Illumina is an investor in Oxford, and Life has invested in Genia.  As best I can tell, once you get past the original pore sequencing IP, which appears to be licensed broadly, there are many measurement approaches, many pores, and many membranes that could be integrated into a device.  In other words, money and time will be the primary barriers to entry.

(For the instrumentation geeks out there, because the pore is larger than a single base, the instrument actually measures the current as three bases pass through the pore.  Thus you need to be able to distinguish 4^3=64 levels of current, which Oxford claims they can do.  The pore set-up I saw in person worked the same way, so I certainly believe this is feasible.  Better pores and better electronics might reduce the physical sampling to 1 or 2 bases eventually, which should result in faster instruments.)
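To make the 4^3 = 64 count concrete, here is a small Python sketch.  It enumerates every possible run of k bases and then lists the triplets a pore would see along a short strand.  The one-base-at-a-time sliding window, the function names, and the example strand are my assumptions for illustration, not Oxford's published method.

```python
from itertools import product

BASES = "ATGC"


def kmers(k):
    """All possible runs of k bases; each needs its own current level."""
    return ["".join(combo) for combo in product(BASES, repeat=k)]


def pore_windows(strand, k=3):
    """Successive k-base groups sitting in the pore, assuming the strand
    advances exactly one base per step (an assumption for illustration)."""
    return [strand[i:i + k] for i in range(len(strand) - k + 1)]


# Number of distinct current levels for pores that sample 1, 2, or 3 bases.
for k in (1, 2, 3):
    print(f"{k} base(s) in the pore -> {len(kmers(k))} current levels")

# Triplets seen along a made-up example strand.
print(pore_windows("GATTACA"))
```

A pore that samples one base only has to distinguish 4 levels, and one that samples two has to distinguish 16, which is why narrower sampling would ease the measurement problem.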

It may be that Oxford will have a first mover advantage for nanopore instruments, and it may be that they have amassed sufficient additional IP to make it rough for competitors.  But, given the power of the technology, the size of the market, and the number of academic competitors, I can't see that over the long term this remains a one-company game.

Not every sequencing task has the same technical requirements, so instruments like the Ion Torrent won't be put to the curbside.  And other technologies will undoubtedly come along that perform better in some crucial way than Oxford's nanopores.  We really are just at the beginning of the revolution in biological technologies.  Recombinant DNA isn't even 40 years old, and the electronics necessary for nanopore measurements only became inexpensive and commonplace in the last few years.  However impressive nanopore sequencing seems today, the greatest change is yet to come.