A Few Thoughts on Rapid Genome Sequencing and The Archon Prize

The December 2006 issue of The Scientist has an interesting article on new sequencing technologies.  "The Human Genome Project +5", by Victor McElheny, contains a few choice quotes.  Phil Sharp, from MIT, says he "would bet on it without a question that we will be at a $1,000 genome in a five-year window."  Presently we are at about US$10 million per genome, so we have a ways to go.  It's interesting to see just how much technology has to change before we get there.

The Archon X-Prize for Genomics specifies sequencing 100 duplex genomes in 10 days, at a cost of no more than US$10,000 per genome.  In other words, that is roughly 600 billion bases at a cost of microdollars per base.  Looking at it yet another way, winning requires 6000 person-days at present productivity numbers for commercially available instruments, whereas 10 days only provides 30 person-days of round-the-clock productivity.
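The arithmetic behind those figures can be sketched in a few lines (assuming roughly 6 billion bases per duplex genome; the other numbers come straight from the prize specs above):

```python
# Back-of-envelope check on the Archon X-Prize specs.
BASES_PER_DUPLEX_GENOME = 6e9   # assumed: ~3 Gb haploid, both copies counted
GENOMES = 100
COST_PER_GENOME = 10_000        # US$, the prize's cost ceiling
DAYS = 10

total_bases = BASES_PER_DUPLEX_GENOME * GENOMES           # 600 billion bases
cost_per_base = COST_PER_GENOME * GENOMES / total_bases   # dollars per base

print(f"total bases: {total_bases:.0e}")
print(f"cost per base: {cost_per_base * 1e6:.2f} microdollars")
print(f"required rate: {total_bases / DAYS / 1e9:.0f} Gb per day")
```

Which is why "microdollars per base" is the right unit to have in mind.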

I tried to find a breakdown of genome sequencing costs on the web, and all I could come up with was an estimate for the maize genome published in 2001.  I'll use that as a cost model for state-of-the-art sequencing of eukaryotes (using Sanger sequencing on capillary-based instruments).  Bennetzen, et al., recount the "National Science Foundation-Sponsored Workshop Report: Maize Genome Sequencing Project" in the journal Plant Physiology, and report:

The participants concurred that the goal of sequencing all of the genes in the maize genome and placing these on the integrated physical and genetic map could be pursued by a combination of technologies that would cost about $52 million. The breakdown of estimated costs would be:

  • Library construction and evaluation, $3 million
  • BAC-end sequencing, $4 million
  • 10-Fold redundant sequencing of the gene-rich and low-copy-number regions, $34 million
  • Locating all of the genes on an integrated physical-genetic map, $8 million
  • Establishing a comprehensive database system, $3 million.

From the text, it seems that decreases in costs are built into the estimate.  If we chuck out the database system, since this is already built for humans and other species, we are down to direct costs of something like $49 million for approximately 2.5 gigabases (Gb).  The Archon prize doesn't specify whether competitors can use existing chromosomal maps to assemble sequence data, so presumably all the information is fair game.  That lets us toss out another $8 million in cost.  The 10-fold redundant sequencing is probably overkill at this point, but I will keep all those costs because the Archon prize requires an error rate of no more than 1 in 100,000 bases; you have to beat down the error regardless of the sequencing method.  Rounding down to $40 million for charity's sake, it looks like the labor and processing associated with producing the short overlapping sequences necessary for Sanger sequencing account for about 17.5 percent of the total.  These costs are probably fixed for approaches that employ shotgun sequencing.
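Here is the same accounting in code form, using the workshop figures quoted above (US$ millions):

```python
# Maize sequencing cost breakdown from the Bennetzen, et al., workshop report.
costs_musd = {
    "library construction and evaluation": 3,
    "BAC-end sequencing": 4,
    "10-fold redundant sequencing": 34,
    "integrated physical-genetic map": 8,
    "database system": 3,
}

total = sum(costs_musd.values())                  # $52M
direct = total - costs_musd["database system"]    # $49M after dropping the database
budget = 40                                       # rounded down "for charity's sake"
prep = (costs_musd["library construction and evaluation"]
        + costs_musd["BAC-end sequencing"])       # $7M of prep labor and processing

print(f"total ${total}M, direct ${direct}M, prep share {prep / budget:.1%}")
```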

Again using the Archon prize as a simple comparison, that's US$1.75 million just to spend on labor for getting ready to do the actual sequencing.  In 1998, the FTE (full time equivalent) cost for sequencing labor was US$135,000.  If you assume the dominant cost for preparing the library and verifying the BACs is labor, you can hire about 13 people.  This looks like a lot of work for 13 people, and, given the amount of time required to do all the cloning and wait for bacteria to grow, not something they can accomplish even within the 10 days allotted for the whole project.
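The headcount estimate works out as follows (assuming, as above, a $10 million working budget and the 1998 FTE figure):

```python
# How many people the prep-labor budget buys, at 1998 sequencing FTE costs.
PREP_SHARE = 0.175          # prep fraction from the maize cost model
BUDGET = 10_000_000         # US$, working total used in the text
FTE_COST = 135_000          # US$ per full-time equivalent, 1998

labor_budget = PREP_SHARE * BUDGET      # $1.75M
headcount = labor_budget / FTE_COST     # roughly 13 people

print(f"labor budget: ${labor_budget:,.0f}")
print(f"headcount: {headcount:.1f}")
```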

The other 82.5 percent of the $10 million you can spend on the actual sequencing.  The prize guidelines say you don't have to include the price of the instruments in the cost, but just for the sake of argument I'll do that here.  And I'll mix and match the cost estimates from the maize project for Sanger sequencing with other technologies.  The most promising commercial instrument appears to be the 454 pyrosequencer, at $500,000 a pop, given its combination of read length and throughput, even if the read lengths aren't quite high enough yet.  If you buy 16 of those beasties, it appears you can sequence about 1.6 GB a day, about a factor of 40 below what's required to win the Archon prize.  Even if 454 gets the read length up to 500 bases, they are still an order of magnitude shy just on the sequencing rate, forgetting the sample prep.
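To put a number on the shortfall (using the assumed fleet throughput above of 1.6 GB per day for 16 instruments):

```python
# How many 454 instruments would be needed to hit the prize throughput.
REQUIRED_BASES = 600e9            # 100 duplex genomes
DAYS = 10
FLEET_BASES_PER_DAY = 1.6e9       # assumed throughput of a 16-machine fleet
FLEET_SIZE = 16

required_per_day = REQUIRED_BASES / DAYS             # 60 Gb/day
shortfall = required_per_day / FLEET_BASES_PER_DAY   # ~37.5x, "about a factor of 40"
machines = FLEET_SIZE * shortfall                    # ~600 instruments

print(f"shortfall: {shortfall:.1f}x, machines needed: {machines:.0f}")
```

Hence the 600 instruments in the next paragraph.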

Alternatively, you could simply buy 600 of the 454 instruments, and then you'd be set, at least for throughput.  Might blow your budget, though, with the $300 million retail cost.  But you could take solace in how happy you'd make all the investors in 454.

Will anyone be around for a "Cosmological Eschatology"?

Over at Open the Future, Jamais Cascio has compiled a list of 10 Must-Know Concepts for the 21st Century, partially in response to a similar list compiled by George Dvorsky. I'm flattered that Jamais includes "Carlson Curves" on his list, and I'll give one last "harrumph" over the name and then be silent on that point.

Jamais's list is good, and well worth perusing. George Dvorsky's list is interesting, too, and meandering through it got me restarted on a topic I have left fallow for a while, the probability of intelligent life in the universe. More on that in a bit.

I got headed down that road because I had to figure out what the phrase "cosmological eschatology" is supposed to mean. It doesn't return a great number of hits on Google, but high up in the list is an RSS feed from Dvorsky that points to one of his posts with the title "Our non-arbitrary universe". He defines cosmological eschatology through quoting James Gardner:

The ongoing process of biological and technological evolution is sufficiently robust and unbounded that, in the far distant future, a cosmologically extended biosphere could conceivably exert a global influence on the physical state of the cosmos.

That is, you take some standard eschatology and add to it a great deal of optimistic technical development, probably including The Singularity.  The notion that sentient life could affect the physical course of the universe as a whole is both striking and optimistic.  It requires the assumption that a technological species survives long enough to make it off the home planet permanently, or at least reach out into surrounding space to tinker with matter and information at very deep levels, all of which in turn requires both will and technical wherewithal that has yet to be demonstrated by any species, so far as we know.  And it is by no means obvious that humans, or our descendants, will be around long enough to see such wonders in any case; we don't know how long to expect the human species to last.  From the fossil record, the mean species lifetime of terrestrial primates appears to be about 2.5 million years (Tavare, et al, Nature, 2002).  This is somewhat less than the expected age of the universe.  Even if humans live up to the hype of The Singularity, and in 50 years we all wind up with heavy biological modifications and/or downloaded consciousnesses that provide an escape from the actuarial tables, there is no reason to think any vestige of us or our technological progeny will be around to cause any eschatological effects on the cosmos.

Unless, of course, you think the properties of the universe are tuned to allow for intelligent life, possibly even specifically for human life. Perhaps the universe is here for us to grow up in and, eventually, modify.  This "non-arbitrary universe" is another important thread in the notion of cosmological eschatology.  Dvorsky quotes Freeman Dyson to suggest that there is more to human existence than simple chance:

The more I examine the universe and study the details of its architecture, the more evidence I find that the universe in some sense must have known that we were coming. There are some striking examples in the laws of nuclear physics of numerical accidents that seem to conspire to make the universe habitable.

I read this with some surprise, I have to admit. I don't know exactly what Dyson meant by, "The universe in some sense must have known we were coming." I'm tempted to think that the eminent professor was "in some sense" speaking metaphorically, with a literary sweep of quill rather than a literal sweep of chalk. 

Reading the quotation makes me think back to a conversation I had with Dyson while strolling through Pasadena one evening a few years ago. My car refused to start after dinner, which left us walking a couple of miles back to the Caltech campus. While we navigated the streets by starlight, we explored ideas on the way. Our conversation that evening meandered through a wide range of topics, and at that point we had got onto the likelihood that the Search for Extraterrestrial Intelligence (SETI) would turn up anything. Somewhere between sushi in Oldtown and the Albert Einstein room at the faculty club, Dyson said something that stopped me in my tracks.

Which brings me, in a somewhat roundabout way, to my original interest: where else might life arise to be around for any cosmological eschatology? It seems to me that, physics being what it is, and biochemistry being what it is, life should be fairly common in the universe. Alas, the data thus far does not support that conclusion. The standard line in physics is that at large length scales the universe is the same everywhere, and that the same physics is in operation here on Earth as everywhere else, which goes by the name of the Cosmological Principle. More specifically, the notion that we shouldn't treat our little corner of the universe as special is known as the Copernican Principle.

So, why does it seem that life is so rare, possibly even unitary? In Enrico Fermi's words, "Where is everybody?"

At the heart of this discussion is the deep problem of how to decide between speculative theory and measurements that are not yet demonstrably – or even claimed to be – complete and thorough. Rough calculations, based in part on seemingly straightforward assumptions, suggest our galaxy should be teeming with life and that technological cultures should be relatively common. But, so far, this is not our experience. Searches for radio signals from deep space have come up empty.

One possibility for our apparent solitude is that spacefaring species, or at least electromagnetically noisy ones, may exist for only short periods of time, or at such a low density they don’t often overlap. Perhaps we happen to be the only such species present in the neighborhood right now. This argument is based on the notion that for events that occur with a low but constant probability, the cumulative odds for those events over time make them a virtual certainty. That is, if there is a low probability in any given window of time for a spacefaring race to emerge, then eventually it will happen. Another way to look at this is that the probability for such events not to happen may be near one, but that over time these probabilities multiply and the product of many such probabilities falls exponentially, which means that the probability of non-occurrence eventually approaches zero.
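The multiplication at the end of that paragraph is easy to see numerically; the per-window probability of 0.001 below is purely illustrative, not an estimate of anything:

```python
# P(no spacefaring species ever emerges) = (1 - p)^n over n independent windows.
p = 0.001   # assumed chance of emergence per window of time (illustrative only)

for windows in (100, 1_000, 10_000):
    p_never = (1 - p) ** windows
    print(f"{windows:>6} windows: P(never) = {p_never:.4f}")
```

However small p is, pile up enough windows and the probability of non-occurrence collapses toward zero.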

Even if you disagree with this argument and its assumptions, there is a simple way out, which Dyson introduced me to in just a couple of words.  “We could be first,” he said.

“But we can’t be first,” I responded immediately, without thinking.

“Why not?” asked Dyson. It was this seemingly innocuous question, based on a very reasonable interpretation of the theory, data, and state of our measurement capability, that I had not yet encountered and that provided me with such important insight. My revelation that evening had much to do with the surprise that I had been lured into an obvious fallacy about the relationship between what little we can measure well and the conclusions we draw based on the resulting data.

Despite looking at a great many star systems using both radio and laser receivers, the results from SETI are negative thus far. The question, “Where is everyone?”, is at the heart of the apparent conflict between estimates of the probability of life in the galaxy and our failure to find any evidence of it. Often now called the Fermi Paradox, a more complete statement is:

The size and age of the universe suggest that many technologically advanced extraterrestrial civilizations ought to exist. However, this belief seems logically inconsistent with the lack of observational evidence to support it. Either the initial assumption is incorrect and technologically advanced intelligent life is much rarer than believed, current observations are incomplete and human beings have not detected other civilizations yet, or search methodologies are flawed and incorrect indicators are being sought.

A corollary of the Fermi Paradox is the Fermi Principle, which states that because we have not yet demonstrably met anyone else, despite the apparent overwhelming odds that other intelligent life exists, we must therefore be alone. Quick calculations show that even with slow transportation, say 0.1 to 0.8 times the speed of light, a civilization could spread throughout the galaxy in a few hundred million years, a relatively short time scale compared to the age of even our own sun. Thus even the presence of one other spacefaring species out there should have resulted in some sort of signal or artifact being detected by humans. We should expect to overhear a radio transmission, catch sight of an object orbiting a planet or star, or be visited by an exploratory probe.
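For what it's worth, that "few hundred million years" figure can be roughed out. The galaxy's size and the sub-light speeds come from the text; the hop length between settled systems and the consolidation pause at each colony are pure guesses, included only to show the shape of the estimate:

```python
# Rough colonization timescale at sub-light speeds.
GALAXY_DIAMETER_LY = 100_000    # light-years, roughly
SPEED_FRACTION_C = 0.1          # slow end of the 0.1c-0.8c range
HOP_LY = 10                     # assumed distance between settled systems
PAUSE_YEARS = 20_000            # assumed time for a colony to launch the next wave

travel_years = GALAXY_DIAMETER_LY / SPEED_FRACTION_C   # 1 million years
hops = GALAXY_DIAMETER_LY / HOP_LY                     # 10,000 hops
total_years = travel_years + hops * PAUSE_YEARS        # ~200 million years

print(f"travel alone: {travel_years:.0e} years")
print(f"with consolidation pauses: {total_years:.1e} years")
```

Travel time itself is negligible; the pauses dominate, which is why the estimate is fairly insensitive to the exact cruising speed.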

But while it may be true that even relatively slow interstellar travel could support a diaspora from any given civilization, resulting in outposts derived from an original species, culture, and ecosystem, I find doubtful the notion that this expansion is equivalent to a functioning society, let alone an empire.  Additional technology is required to make a civilization, and an economy, work.

Empires require effective and timely means of communication. Even at the substantially sub-galactic length scales of Earthly empires, governments have always sought, and paid for, the fastest means of finding out what is happening at their far reaches and then sending instructions back the other way to enforce their will; Incan trail runners, fast sailing ships, dispatch riders, the telegraph, radio, and satellites were all sponsored by rulers of the day. Without the ability to take the temperature of far flung settlements – to measure their health and fealty, and most importantly to collect taxes – travel and communication at even light speed could not support the flow of information and influence over characteristic distances between solar systems. Unless individuals are exceptionally long-lived, many generations could pass between a query from one government to another, a reply, and any physical response. This is a common theme in science fiction; lose touch with your colonies, and they are likely to go their own way.

So if there are advanced civilizations, where are they? My own version of this particular small corner of the debate is, “Why would they bother to visit?  We’re boring.” A species with the ability to travel, and equally important to communicate, between the stars probably has access to vastly more resources than are present here on Earth. Those species participating in any far-reaching civilization would require faster-than-light technology to maintain ties between distant stars. Present theories of faster-than-light travel require so-called exotic matter, or negative energy. Not anti-matter, which exists all around us in small quantities and can be produced in the lab, but matter that has properties that can only be understood mathematically. For humans, exotic matter is presently neither in the realm of experiment nor of experiment’s inevitable descendant, technology.

With all of the above deduction, based on exceptionally little data, we could conclude that we are alone, that we are effectively alone because there isn’t anyone else close enough to talk to, or that galactic civilizations use vastly more sophisticated technology than we have yet developed or imagined. Or, we could just be first. Even though the probabilities suggest we shouldn't be first, it still may be true.

But as you might guess, given our present technological capabilities, I tend toward an alternative conclusion; we could acknowledge our measurements are still very poor, our theory is not yet sufficiently descriptive of the universe, and neither support much in the way of speculation about life elsewhere.

Now I've gone on much too long. There will be more of this in my book, eventually.

Microsoft Supports Biobricks

Last weekend at the 2006 International Genetically Engineered Machines Competition (iGEM 2006), Microsoft announced a Request For Proposals related to Synthetic Biology.  According to the RFP page:

Microsoft invites proposals to identify and address computational challenges in two areas of synthetic biology. The first relates to the re-engineering of natural biological pathways to produce interoperable, composable, standard biological parts. Examples of research topics include, but are not limited to, the specification, simulation, construction, and dissemination of biological components or systems of interacting components. The second area for proposals focuses on tools and information repositories relating to the use of DNA in the fabrication of nanostructures and nanodevices. In both cases, proposals combining computational methods with biological experimentation are seen as particularly valuable.

The total amount to be awarded is $500,000. 

"Smallpox Law Needs Fix"

ScienceNOW Daily News is carrying a short piece on the recommendation by the National Science Advisory Board on Biosecurity (NSABB) to repeal a law that criminalizes synthesis of genomes more than 85% similar to smallpox.

The original law, which surprised everyone I have ever talked to about this topic, was passed in late 2004 and wasn't written about by the scientific press until March of '05:

The new provision, part of the Intelligence Reform and Terrorism Prevention Act that President George W. Bush signed into law on 17 December 2004, had gone unnoticed even by many bioweapons experts. "It's a fascinating development," says smallpox expert Jonathan Tucker of the Monterey Institute's Center for Nonproliferation Studies in Washington, D.C.

...Virologists zooming in on the bill's small print, meanwhile, cannot agree on what exactly it outlaws. The text defines variola as "a virus that can cause human smallpox or any derivative of the variola major virus that contains more than 85 percent of the gene sequence" of variola major or minor, the two types of smallpox virus. Many poxviruses, including a vaccine strain called vaccinia, have genomes more than 85% identical to variola major, notes Peter Jahrling, who worked with variola at the U.S. Army Medical Research Institute of Infectious Diseases in Fort Detrick, Maryland; an overzealous interpretation "would put a lot of poxvirologists in jail," he says.

According to the news report at ScienceNOW:

Stanford biologist David Relman, who heads NSABB's working group on synthetic genomics, told the board that "the language of the [amendment] allows for multiple interpretations of what is actually covered" and that the 85% sequence stipulation is "arbitrary." Therefore, he said, "we recommend repealing" the amendment.

Relman's group also recommended that the government revamp its select agents list in light of advances in synthetic genomics. These advances make it possible to engineer biological agents that are functionally lethal but genomically different from pathogens on the list. The group's recommendations, which were approved unanimously by the board, are among several that the board will pass on to the U.S. government to help develop policies for the conduct and oversight of biological research that could potentially be misused by terrorists.

DNA Vaccines Update and Avian Flu Tidbits

There has been serious progress recently in developing DNA vaccines for pandemic influenza.  First, Vical just announced (again by press release and conference presentation, rather than peer reviewed publication) single dose protection of mice and ferrets against a lethal challenge with H5N1 using a trivalent DNA vaccine.  Ferrets are seen by many as the best model for rapid testing of vaccines destined for use in humans.  According to the press release:

"We are excited by the recent advances in our pandemic flu vaccine development program," said Vijay B. Samant, President and Chief Executive Officer of Vical. "Earlier this week, we presented data from mouse studies demonstrating the dose-sparing ability of our Vaxfectin(TM) adjuvant when used with conventional flu vaccines. Today we presented data from ferret studies demonstrating the ability to provide complete protection with a single dose of our Vaxfectin(TM)-formulated avian flu DNA vaccine. Our goal is to advance into human testing with this program as quickly as possible, both to provide a potential defense against a pandemic outbreak and to explore the potential for a seasonal flu vaccine using a similar approach."

Mr. Samant will be attending the bio-era H5N1 Executive Round table in Cambridge in a few weeks, along with Dr. David Nabarro, the Senior UN System Coordinator for Avian and Human Influenza.  I'm looking forward to finally meeting these gentlemen in person.

Powdermed is in early human clinical trials for its annual and pandemic flu DNA vaccines in the U.K. and the U.S., and has recently been acquired by Pfizer.  This should provide needed cash for trials, technical development, and perhaps even for building a manufacturing facility for large scale production of their proprietary needle free injection system.  I think it is interesting that a large pharmaceutical company -- a specialty chemicals company, in essence -- has acquired technology that is essentially a chemical vaccine.  I wonder if Pfizer can lend expertise to packaging and DNA synthesis.

Despite progress in the lab and greater funding, there are still significant challenges in getting these vaccines into the clinic.  Here is the DNA Vaccine Development: Practical Regulatory Aspects slide presentation from the NIAID.  Obviously, lots of work to do there.  And as I have written about previously, it doesn't appear that the FDA is really interested in allowing new technologies to fairly compete, even if they are the best option for rapid manufacture and deployment as countermeasures for pandemic flu.

In other DNA vaccine news, a recent paper in PNAS demonstrated, "Protective immunity to lethal challenge of the 1918 pandemic influenza virus by vaccination."  Kong, et al., showed that, "Immunization with plasmid expression vectors encoding hemagglutinin (HA) elicited potent CD4 and CD8 cellular responses as well as neutralizing antibodies."  Here is more coverage from Effect Measure, which notes that the paper is primarily interesting as a study of the mechanism of DNA immunization in mice against the 1918 virus.

However, if I understand the paper correctly, the authors developed a means to directly correlate the effect of immunization with antibody production and thereby, "define [the vaccine's] mechanism of action".  This appears to be a significant step forward in understanding how DNA vaccines work.  I interviewed Vijay Samant of Vical by phone a few months ago, and he noted that because animal studies demonstrate complete protection even though traditional measures of immunity do not predict that result, he has a hunch that "tools for measuring immunogenicity for DNA will need to be different than for measuring protein immunogenicity."  Perhaps the results of Kong, et al., point the way to just such a new tool.

An upcoming Nature paper by Michael Katze, just down the hill here in the UW Medical School, elucidates some of the mechanisms behind the extraordinary lethality of the 1918 virus in mice.  Writing in Nature, Kash, et al., show that:

...In a comprehensive analysis of the global host response induced by the 1918 influenza virus, that mice infected with the reconstructed 1918 influenza virus displayed an increased and accelerated activation of host immune response genes associated with severe pulmonary pathology.  We found that mice infected with a virus containing all eight genes from the pandemic virus showed marked activation of pro-inflammatory and cell-death pathways by 24 h after infection that remained unabated until death on day 5.

In other words, the immune response to infection with the 1918 virus contributed to mortality.  Moreover, "These results indicated a cooperative interaction between the 1918 influenza genes and show that study of the virulence of the 1918 influenza requires the use of the fully reconstructed virus."  That is, you have to be able to play with the entire reconstructed bug in order to figure out why it is so deadly.  And this result gives an interesting context to the recent paper of Maines, et al., demonstrating that reassortant viruses of the present H5N1 and lesser strains are not as fearsome as the complete H5N1 genome (which I wrote about a few weeks ago).  This latter observation has been interpreted in the press as evidence that H5N1 is "not set for pandemic", even though H5N1 is demonstrably changing in nature primarily by mutation rather than by swapping genes.  H5N1 is quite deadly, and it may simply be that the particular combination of evolving genes in H5N1 gives it that special something.

Finally, an upcoming paper in J. Virology demonstrates an entirely new antiviral strategy based on peptides that bind to HA proteins in vivo and thereby prevent viral binding to host cells.  "Inhibition of influenza virus infection by a novel antiviral peptide," by Jones, et al., at the University of Wisconsin, appears to still be in pre-press.

In the abstract the authors state:

A 20-amino acid peptide (EB) derived from the signal sequence of fibroblast growth factor-4 exhibits broad-spectrum antiviral activity against influenza viruses including the H5N1 subtype in vitro. The EB peptide was protective in vivo even when administered post-infection. Mechanistically, the EB peptide inhibits the attachment to the cellular receptor preventing infection. Further studies demonstrated that the EB peptide specifically binds to the viral hemagglutinin (HA) protein. This novel peptide has potential value as a reagent to study virus attachment and as a future therapeutic.

This is just an initial demonstration, but it is extremely interesting nonetheless.  However, because it is a protein-based drug, it risks generating an immune response against the drug itself.  It will have to be administered in a way that preserves function in vivo in humans and doesn't spook the immune system.  The last thing you want to do is generate antibodies against a protein vital for human health.

Yet, precisely because it is a fragment of a human protein, it might mean there is a lower risk of generating that immune response, especially if it can be produced in a way that has all the right post-translational modifications (glycosylation, etc).  Though I wonder about variation in the population: various alleles and SNPs.  What if you are given a version of the peptide that differs in sequence from the one you are carrying around?  Would this generate an immune response against the drug even though it is closely related to something you carry naturally, and if so would those antibodies also pick out your allele?  Definitely the potential for bad juju there.  Another example of where personalized medicine, and having your genome sequence in your file, might be handy.  Alternatively, I suppose you could just use your own sequence for the peptide, and have the thing synthesized in vitro for use as a personalized drug.  Sequence --> DNA synthesis --> in vitro expression --> injection.  Hmmm...you could probably already stuff all that technology in a single box...

However it is used, this advance is probably a very long way from the clinic.  It might go faster if they use the peptide as inspiration for a non-protein drug, which, incidentally, the authors suggest near the end of the paper.  Definitely a high-tech solution, either way, but probably the wave of the future.

Daily Outbreak Forecast

A few days ago, Wired News carried a story by Sean Captain about the Healthmap project, a mash-up of Google Maps and various disease reporting services:

The new Healthmap website digests information from a variety of sources ranging from the World Health Organization to Google News and plots the spread of about 50 diseases on a continually updated global map. It was developed as a side project by two staffers at the Children's Hospital Informatics Program in Boston -- physician John Brownstein and software developer Clark Freifeld.

This follows on Declan Butler's Avian Flu Mashup.  Both efforts encountered significant issues with data formats and parsing the trustworthiness of various data sources.

The Wired News story starts out with this lead: "Web-based maps are handy for keeping tabs on weather and traffic, so why not for disease outbreaks, too?"  And the title is "Get Your Daily Plague Forecast," which, because it is a tad trite, I find rather ironic given that a recent PNAS paper demonstrates that "Plague dynamics are driven by climate variation."

Stenseth, et al., studied the prevalence of Yersinia pestis in the primary host animal, gerbils, as a function of average temperature over 45 years in Central Asia.  They find that "A 1°C increase in spring is predicted to lead to a >50% increase in prevalence."  The bacterium causes bubonic plague in humans, and transmission from rodents to humans is thought to be the main route into the human population.  The authors note in the abstract that:

Climatic conditions favoring plague apparently existed in this region at the onset of the Black Death as well as when the most recent plague pandemic arose in the same region, and they are expected to continue or become more favorable as a result of climate change. Threats of outbreaks may thus be increasing where humans live in close contact with rodents and fleas (or other wildlife) harboring endemic plague.

And as a cheery final note, they conclude that:

Our analyses are in agreement with the hypothesis that the Medieval Black Death and the mid-19th-century plague pandemic might have been triggered by favorable climatic conditions in Central Asia.  Such climatic conditions have recently become more common and whereas regional scenarios suggest a decrease in annual precipitation but with increasing variance, mean spring temperatures are predicted to continue increasing.  Indeed, during the period from the 1940s, plague prevalence has been high in its host-reservoir in Kazakhstan. Effective surveillance and control during the Soviet period resulted in few human cases. But recent changes in the public health systems, linked to a period of political transition in Central Asia, combined with increased plague prevalence in its natural reservoir in the region, forewarn a future of increased risk of human infections.

The combination of climate influences on the prevalence of infectious disease, documented climate change over the last few decades, and the rise of megacities is something we definitely need to watch.

And all this time I was so worried about the flu...

Oh Goody -- Prizes for Genomes!

But seriously folks...it's good news that prizes are being posted for biological technologies.  A couple of weeks ago, the X Prize Foundation announced a $10 million prize for demonstration of "technology that can successfully map 100 human genomes in 10 days."  This is not the first such offer; Nicholas Wade notes in the New York Times that Craig Venter set up a $500,000 prize in 2003 for achieving the Thousand Dollar Genome.  Venter is now on the board of the X Prize Foundation and it appears his original prize has been expanded into the subject of the current announcement.  We definitely need new ways to fund development of biological technologies.

Here's more coverage, by Antonio Regalado in the Wall Street Journal.  It will be interesting to see if anyone can come up with a way to make a profit on the $10 million prize.

The prize requires sequencing roughly 600 billion bases in 10 days (100 diploid genomes at about 6 billion bases each).  It isn't possible to directly compare the prize specs with my published numbers since there is no specification of the number of people involved in the project.  If you throw a million lab monkeys running a million low-tech sequencers at the problem, you're set.  Except, of course, for all the repeats, inversions, and rearrangements that require expertise to map and sort out.
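The arithmetic behind that figure is simple enough to sketch.  The diploid genome size here is my assumption (roughly two copies of ~3 billion bases), not something specified in the prize rules:

```python
# Back-of-envelope on the Archon X Prize throughput requirement.
# Assumption (mine, not the prize rules): a diploid human genome
# is about 6e9 bases (2 x ~3 Gb).
genomes = 100
bases_per_genome = 6e9
days = 10

total_bases = genomes * bases_per_genome   # ~6e11 bases overall
bases_per_day = total_bases / days         # ~6e10 bases per day
print(f"total: {total_bases:.1e} bases")
print(f"pace:  {bases_per_day:.1e} bases/day")
```

That pace, 60 billion bases per day sustained for 10 days, is what any winning technology has to hit regardless of how many people it employs.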

According to a news story by Erika Check in Nature, the performance numbers cited by 454 Life Sciences appear to be encouraging: "Using the 454 technique, one person using one machine could easily sequence the 3 billion base pairs in the human genome in a hundred days, [Founder and CEO Jonathan Rothberg] says," which works out to about 30 million bases per person per day.  And he is optimistic about progress in reducing costs: "As the process gets faster, it gets less expensive. It's clear that we'll be able to do this much cheaper," predicts Rothberg, who says that in the next few years scientists will be able to assemble a human genome for US$10,000.  At the present pace of improvement, this looks to be about 2015, though new technology could always get there sooner.
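Rothberg's figure also gives a rough sense of how far short of the prize pace a single instrument falls.  A quick sketch, using my arithmetic and my assumption of 100 diploid genomes at ~6 billion bases each:

```python
# Machines needed to hit the prize pace at the rate implied by
# Rothberg's quote: one machine covers 3e9 bases in 100 days.
# The 6e9-base diploid genome size is my assumption.
rate_per_machine_day = 3e9 / 100        # 3e7 bases/machine/day
prize_bases = 100 * 6e9                 # 100 diploid genomes
prize_days = 10
machines = prize_bases / (rate_per_machine_day * prize_days)
print(f"{machines:.0f} machines running around the clock")
```

On these assumptions it takes on the order of 2,000 instruments running flat out for the full 10 days, which suggests the winner will need either a very different technology or a very large room.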

There seems to be some divergence of expert opinion about where a winning technology will come from.  Writing in Science, Elizabeth Pennisi notes:

Charles Cantor, chief scientific officer of SEQUENOM Inc. in San Diego, California, predicts only groups already versed in sequencing DNA will have a chance at the prize. Others disagree. "I think it is unlikely" that the winner will come from the genome-sequencing community, says Leroy Hood, who invented the first automated DNA sequencer. And Venter predicts that the chance that someone will come out of the woodwork to scoop up the $10 million is "close to 100%." The starting gun has sounded. 

Indeed.  I had sworn off thinking about new sequencing technologies, but the prize has got even me to thinking...

Avian Flu Catchup, 20 Sept 06.

Here are some comments about the GSK adjuvant announcement, the expansion of vaccine candidates by the WHO, H5N1 evolution in the lab and in the wild, and sequence data sharing.

GlaxoSmithKline announced recently that through the use of a proprietary adjuvant they have dramatically reduced the amount of egg-grown vaccine required to produce a decent antibody response in humans. 

A news story at CIDRAP explains that, "The GSK vaccine was made from an inactivated H5N1 virus collected in Vietnam in 2004, according to Jennifer Armstrong, a GSK spokeswoman in Philadelphia," and then notes that, "It is uncertain, however, how effective the vaccine would be against H5N1 strains other than the one it was made from. [Albert Osterhaus of Erasmus University in the Netherlands] told the AP, "This vaccine will only give protection against this particular H5N1 strain and possibly other strains.""

This last statement may be true, but in my view it may also give false hope.  Aside from the criticisms others have raised about GSK announcing science by press release (rather than waiting until a publication is ready, or simply releasing the data), we already know that there are H5N1 variants in the wild that kill humans but don't cross-prime immune systems.

In response to this development, the WHO recently advised that work begin on vaccines based on clade 2 isolates from Indonesia.  (Here is CIDRAP's take, and here is the original WHO announcement.)  Note that this does not mean we will immediately have vaccines in production against these isolates; as far as I know the reference vaccine is still solely based on the original Vietnamese isolate.

As is fairly widely understood at this point, it is not at all clear that vaccines made from either the Vietnamese or Indonesian isolates will protect humans against potential pandemic strains that arise in nature.  Some effort at discerning the threat from certain potential strains was reported in PNAS in early August.  A news story in Nature describes the results with the headline, "Bird flu not set for pandemic, says US team" (subscription req.).

I find that headline very confusing, because the work in question has very little to do with whether H5N1 is "set for [a] pandemic."  Instead, the research explored the effects on ferrets of exposure to a small number of recombinant viruses consisting of components from H5N1 and H3N2.  The text following the headline is clearer: "The scientists who conducted the work, at the [CDC], say it suggests that the H5N1 virus will require a complex series of genetic changes to evolve into a pandemic strain...  The study [does not] address whether H5N1 could evolve into a pandemic strain by accumulating mutations."

In fact, only very limited conclusions can be drawn from the paper in question, "Lack of transmission of H5N1 avian-human reassortant influenza viruses in a ferret model" (Maines, et al., PNAS, vol 103, no 32).  The first and last paragraphs of the discussion section show the authors are relatively circumspect in interpreting the data:

If H5N1 viruses acquire the ability to undergo efficient and sustained transmission among humans, a pandemic would be inevitable. An understanding of the molecular and biologic requirements for efficient transmissibility is critical for the early identification of a potential H5N1 pandemic virus and the application of optimal control measures. The results of this study demonstrate that, unlike human H3N2 viruses, avian H5N1 viruses isolated from humans in 1997, 2003, or 2005 lack the ability to transmit efficiently in the ferret model. Furthermore, reassortant viruses bearing 1997 avian H5N1 surface glycoproteins with four or six human virus internal protein genes do not transmit efficiently in ferrets and thus lack the key property that predicts pandemic spread.

Although these findings do not identify the precise genetic determinants responsible for influenza virus transmissibility, they provide an assessment of the risk of an H5N1 pandemic strain emerging through reassortment with a human influenza virus. Our results indicate that, within the context of the viruses used in this study, H5N1 avian-human reassortant viruses did not exhibit properties that would initiate a pandemic. Nevertheless, H5N1 viruses continue to spread geographically, infect a variety of mammals, and evolve rapidly. Therefore, further evaluation of the efficiency of replication and transmissibility of reassortants between contemporary H5N1 viruses and circulating human influenza viruses is an ongoing public health need. The ferret transmission model serves as a valuable tool for this purpose and the identification of molecular and biologic correlates of efficient transmissibility that may be used for early detection of a novel virus with pandemic capability.

It is certainly true that this sort of work is vital for figuring out how influenza works, and in particular vital for trying to sort out how reassortant viruses arise, how they change during passage between animals, and how they kill mammals.  Reassortment was historically important in some flu pandemics.  However, the genetic changes seen in nature in the present H5N1 outbreak appear to be solely due to mutation.  In particular, a cluster of cases in Indonesia in April and May -- the first clear example of human-to-human transmission of H5N1, according to the WHO -- allows tracking sequence changes between viruses that infected eight family members.

In "Family tragedy spotlights flu mutations" (subscription req.), Declan Butler writes that:

Viruses from five of the cases had between one and four mutations each compared with the sequence shared by most of the strains. In the case of the father who is thought to have caught the virus from his son -- a second-generation spread -- there were twenty-one mutations across seven of the eight flu genes. This suggests that the virus was evolving rapidly as it spread from person to person.

[While] many of the genetic changes did not result in the use of different amino acids by the virus...experts say they cannot conclude that the changes aren't significant. "It is interesting that we saw all these mutations in viruses that had gone human-to-human," says one scientist who was present at the Jakarta meeting but did not wish to be named because he was commenting on confidential data. "But I don't think anyone knows enough about the H5N1 genome to say how significant that is."

So there is considerable mutation occurring, even between viruses present in different family members, and we don't yet know enough about H5N1 in humans to say whether this is significant with respect to evolving into a pandemic strain.  But even more interesting, there are so many differences between the viruses that they look like different clades.  Again, from Dr. Butler:

Elodie Ghedin, a genome scientist at the University of Pittsburgh School of Medicine in Pennsylvania, says she's surprised that the virus from the father had so many mutations compared with others in the cluster, apparently arising in just a few days. "I have a hard time believing that the father acquired the virus from his son," she says, adding that the nine mutations in one gene in the father's virus are almost identical to those in viruses isolated from human cases in Thailand and Vietnam in 2004.

One possibility is that the father simply caught a different strain of virus from birds, although other mutations in his virus are similar to those in the strain isolated from his son. Or perhaps the virus from the son reassorted with another flu strain circulating in his father at the time, Ghedin says.

Perhaps, but it would seem that if the father were also carrying a virus from Thailand or Vietnam, there should be signs of it in birds or other humans.  I was unable to find out whether the father was in a position to pick up a virus from another clade, which would be a good check on the likelihood of reassortment.

Dr. Butler goes on to note that a simple lack of information is a significant factor in the slow progress:

Part of the reason the picture is so unclear, say virologists contacted by Nature, is that the continued withholding of genetic data is hampering study of the virus. None of the sequence data from the Indonesian cluster has been deposited in public databases -- access is restricted to a small network of researchers linked to the WHO and the US Centers for Disease Control and Prevention in Atlanta, Georgia.

Fortunately, this has changed and the Global Initiative on Sharing Avian Influenza Data (GISAID) is now in place.  I'll have something more later on the sharing plan after I digest all the information.  It looks like a nice step forward, but, as always, we'll have to see what comes of it.

Vaccine Development as Foreign Policy

I was fortunate to attend Sci Foo Camp last month, run by O'Reilly and Nature, at the Googleplex in Santa Clara.  The camp was full of remarkable people; I definitely felt like a small fish.  (I have a brief contribution to the Nature Podcast from Sci Foo; text, mp3.)  There were a great many big, new ideas floating around during the weekend.  Alas, because the meeting was held under the Chatham House Rule, I cannot share all the cool conversations I had.

However, at the airport on the way to San Jose I bumped into Greg Bear, who also attended Sci Foo, and our chat reminded me of an idea I've been meaning to write about.

In an essay published last year, Synthetic Biology 1.0, I touched briefly on the economic costs of disease as a motivation for developing cheaper drugs.  Building synthetic biological systems to produce those drugs is an excellent example of the potential rewards of improved biological technologies.

But a drug is a response to disease, whereas vaccines are widely recognized as "the most effective medical intervention" for preventing disease and reducing the cost and impact of pathogens.  While an inexpensive drug for a disease like malaria would, of course, be a boon to affected countries, drugs do not provide lasting protection.  In contrast, immunization requires less contact with the population to suppress a disease.  Inexpensive and effective vaccines, therefore, would provide even greater human and economic benefit.

How much benefit?  It is extremely hard to measure this sort of thing, because to calculate the economic effect of a disease on any given country you have to find a similar country free of the disease to use as a control.  A report released in 2000 by Harvard and the WHO found that, "malaria slows economic growth in Africa by up to 1.3% each year."  The cumulative effect of that hit to GDP growth is mind-blowing:

...Sub-Saharan Africa's GDP would be up to 32% greater this year if malaria had been eliminated 35 years ago. This would represent up to $100 billion added to sub-Saharan Africa's current GDP of $300 billion. This extra $100 billion would be, by comparison, nearly five times greater than all development aid provided to Africa last year.

The last sentence tells us all we need to know about the value of a malaria vaccine; it could advance the state of the population and economy so far as to swamp the effects of existing foreign aid.  And it would provide a lasting improvement to be built upon by future generations of healthy children.
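The compounding behind that "up to 32%" figure is worth checking.  The ~0.8% average drag below is my assumption, chosen to reproduce the report's number; the 1.3% is the report's stated upper bound, which compounds to considerably more:

```python
# Compounding a small annual growth-rate drag over 35 years.
# An average drag of ~0.8%/yr reproduces the report's "up to 32%"
# figure; the 1.3% upper bound compounds to well over 50%.
years = 35
for drag in (0.008, 0.013):
    factor = (1 + drag) ** years
    print(f"{drag:.1%}/yr over {years} yrs -> GDP could be {factor - 1:.0%} greater")
```

The point of the exercise is simply that a seemingly tiny annual drag, left in place for a generation, compounds into a third or more of a region's entire economy.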

The economic valuation of vaccines is fraught with uncertainty, but Rappuoli, et al., suggest in Science that if, "policymakers were to include in the calculation the appropriate factors for avoiding disease altogether, the value currently attributed to vaccines would be seen to underestimate their contribution by a factor of 10 to 100."  This is, admittedly, a big uncertainty, but it all lies on the side of underestimation.  And the point is that some $20 Billion is spent annually on aid, a fraction of which might be better directed toward western vaccine manufacturers to produce long-term solutions.

Vaccine incentives are usually discussed in terms of guaranteeing a certain purchase volume (PDF warning for a long paper here discussing the relevant economics).  But I wonder if we shouldn't re-think government-sponsored prizes.  This strategy was recently used in the private sector to great effect and publicity by the X-Prize, and its success has led to consideration of other applications of the prize incentive structure.

Alas, this isn't generally considered the best way to incentivize vaccine manufacturers.  The Wikipedia entry for "Vaccine" makes only passing reference to prizes for vaccine development.  A 2001 paper in the Bulletin of the World Health Organization, for which a number of experts and pharmaceutical companies were interviewed about ways to improve AIDS vaccine development, concluded, "It was felt that a prize for the development of an AIDS vaccine would have little impact. Pharmaceutical firms were in business to develop and sell products, not to win prizes."

But perhaps the problem is not that prizes are the wrong way to entice Big Pharma, but rather that Big Pharma may not be the right way to develop vaccines.  Perhaps we should find a way to encourage a business model that aims to produce a working, safe vaccine at a cost that maximizes profit given the prize value.

So how much would developing a vaccine cost?  According to a recent short article in Nature, funds devoted to developing a malaria vaccine amounted to a measly $65 million in 2003.  The authors go on to note that, "At current levels, however, if a candidate in phase II clinical trials demonstrated sufficient efficacy, there would be insufficient funding available to proceed to phase III trials."

It may be that The Gates Foundation, a major funder of the malaria work, would step in to provide sufficient funds, but this dependency doesn't strike me as a viable long-term strategy for developing vaccines.  (The Gates Foundation may not be around forever, but we can be certain that infectious disease will.)  Instead, governments, and perhaps large foundations like The Gates, should set aside funds to be paid as a prize.  What size prize?  Of the ~$1-1.5 Billion it supposedly costs to develop a new drug, ~$250 million goes to marketing.  Eliminating the need for marketing with a prize value of $1.5 Billion would provide a reasonable one-time windfall, with continued sales providing more profit down the road.

Setting aside as much as $200 million a year would be a small fraction of the U.S. foreign aid budget and would rapidly accumulate into a large cash payout.  Alternatively, we could set it up as a yearly payment to the winning organization.  Spread the $200 million over multiple governments (Europe, Japan, perhaps China), and suddenly it doesn't look so expensive.  In any event, we're talking about a big payoff in both saving lives and improving general quality of life, so a sizable prize is warranted.  I expect $2 Billion is probably the minimum to get international collaborations to seriously compete for the prize.
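The escrow arithmetic is easy to sketch.  The per-government figure comes from the paragraph above; the number of participating governments is my illustration:

```python
# Years to accumulate a $2B prize purse if each participating
# government escrows $200M per year (figures from the text above;
# the number of governments is hypothetical).
target = 2e9
per_gov_per_year = 2e8
for govs in (1, 2, 3):
    years = target / (per_gov_per_year * govs)
    print(f"{govs} government(s): {years:.1f} years")
```

With three governments participating, the fund reaches the $2 Billion mark in under four years, well within the time horizon of any serious vaccine development effort.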

The foreign policy aspects of this strategy fit perfectly with the goals of the U.S. Department of State to improve national security by reducing poverty abroad.  Here is Gen. Colin Powell, reprinted from Foreign Policy Magazine in 2005 ("No Country Left Behind"):

We see development, democracy, and security as inextricably linked. We recognize that poverty alleviation cannot succeed without sustained economic growth, which requires that policymakers take seriously the challenge of good governance. At the same time, new and often fragile democracies cannot be reliably sustained, and democratic values cannot be spread further, unless we work hard and wisely at economic development. And no nation, no matter how powerful, can assure the safety of its people as long as economic desperation and injustice can mingle with tyranny and fanaticism.

Development is not a "soft" policy issue, but a core national security issue. [emphasis added]  Although we see a link between terrorism and poverty, we do not believe that poverty directly causes terrorism. Few terrorists are poor. The leaders of the September 11 group were all well-educated men, far from the bottom rungs of their societies. Poverty breeds frustration and resentment, which ideological entrepreneurs can turn into support for--or acquiescence to--terrorism, particularly in those countries in which poverty is coupled with a lack of political rights and basic freedoms.

Dr. Condoleezza Rice, in opening remarks to the Senate Foreign Relations Committee (PDF warning) during her confirmation hearings, plainly stated, "...We will strengthen the community of democracies to fight the threats to our common security and alleviate the hopelessness that feeds terror."

Over any time period you might care to examine, it will probably cost vastly less to produce a working malaria vaccine than to continue dribbling out foreign aid.  Even just promoting the prize would bolster the U.S. image abroad in exactly those countries where we are hurting the most, and successful development would have profound consequences for national security through the elimination of human suffering.  Seems like a good bargain.  The longer we wait, the worse it gets.