Angry at the Genome


Xconomy Boston — 

In 2004, I was an enthusiastic postdoctoral researcher in Eric Lander’s lab at the Broad Institute, with the job I had dreamed of since I was 10 years old. Growing up in Paducah, KY, I read Isaac Asimov’s The Genetic Code. And while I understood nothing of its meaning, I fell in love with the idea of being a human geneticist when I grew up.

I had a particular disease passion that had also been part of the plan since that time: autoimmune genetics. You see, I have a remarkable family. Nearly one-third of my relatives within 3 degrees have an autoimmune disorder. Even at my young age, I somehow knew those weren’t good odds. I knew that “things run in families” and that my family seemed to have autoimmunity in spades. You can imagine my surprise when 20 years afterwards, I realized I was, in fact, a human geneticist in the most renowned tank of genomic thinkers around studying autoimmune disease.

It was a thrilling time to be a geneticist. The human genome sequence was complete. The first thorough map of variation in the genome (single nucleotide polymorphisms or SNPs) was nearly complete. Unconstrained by data to the contrary, it felt like we were turning a corner to truly identify the variation that conferred risk to disease.

But in May of 2004, I began to get very nervous because of an unexpected result we found with one of the most talented teams of autoimmune geneticists in existence: the International Multiple Sclerosis Genetics Consortium. Parenthetically, these folks are absolutely who you want at the front lines of genomic inquiry. They are dogged, thoughtful, and careful about the research they do.

At that time, we were following up on one of the key variants that conferred risk to multiple sclerosis or MS: HLA-DRB1*15:01 (or “DR2”). As background, about 40 percent of all patients with MS have the DR2 variation in their genome. By comparison, only 20 percent of the general population has this variant. When you run the statistics, it turns out that this is probably one of the strongest associations in all of autoimmune genetics. So it seemed very reasonable to all of us involved that if we gathered enough patients who had MS and looked separately at the patients with and without DR2, we might uncover two types of MS.
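For readers who want to see the statistics behind that statement, here is a minimal Python sketch of the odds ratio implied by the two round figures above. It is an illustration only: it treats the 20 percent general-population frequency as if it were the frequency in unaffected controls (a reasonable approximation when a disease is rare), and the function names are made up for this example.

```python
def odds(probability: float) -> float:
    """Convert a probability into odds (p / (1 - p))."""
    return probability / (1.0 - probability)


def odds_ratio(freq_in_patients: float, freq_in_controls: float) -> float:
    """Odds of carrying the variant among patients, relative to controls."""
    return odds(freq_in_patients) / odds(freq_in_controls)


# Round figures from the text: 40 percent of MS patients carry DR2,
# versus 20 percent of the general population (used here as the control rate).
dr2_odds_ratio = odds_ratio(0.40, 0.20)
print(f"DR2 odds ratio: {dr2_odds_ratio:.2f}")
```

With these inputs the odds ratio comes out a little under 3, which is large for a common variant in a complex disease.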

To imagine this hypothesis, I visualize genetic “skylines.” While MS may appear to be a “single” disease population based on clinical measures, we hypothesized that the disease resulted from two different genetic skylines. Our experiment was to determine whether, once we genotyped everyone and separated out those individuals with the most significant variant, DR2, we would immediately be able to recognize two different landscapes.

Why would this be an important experiment? Our hope was that if a patient had an MS skyline that contained a genetic variant, this might mean they were better served by one drug therapy versus another. Biotech and pharma companies might specifically design clinical studies for therapies targeted at those skylines. Or even better, novel gene associations might reveal themselves when the “noise” of the architecture was reduced. We might find new associations, and these findings would provide novel targets for drug discovery.

However, no additional gene associations were revealed. No existing associations were stronger in one population than in the other. In short, the skylines were pretty uninterpretable, with the exception of that previously known variation, DR2.

Most of the team was undaunted by this finding and excited to dig deeper into the genome to understand every additional peak and valley of genetic risk. However, I was devastated. For me, the disease hit very close to home and I was disappointed that there would not be actionable data for some time. So I headed into drug discovery project management, all the while hoping I was wrong and that additional time and hard work by my colleagues would prove me so.

In truth, I didn’t think much about the genome until 2010 at Thanksgiving, when 23andMe offered a $99 deal for a 500K SNP map of my genome. Perhaps surprisingly, even with my family history, I was pretty certain I wouldn’t find anything that might upset me. Why? I was 37 years old, so nearly past the window of onset for many autoimmune diseases. Moreover, my husband and I currently don’t plan on having children, so any untoward variant would be unlikely to inspire worry for the next generation.

I couldn’t have been more wrong. My genome did upset me. Not because I found a variant that is certain to become a major health burden for me in the next 60 years, but rather because I realized there was very little actionable information in the data. In short, I realized that my genome for the most part revealed nothing about my past, current or future health.

I’m not the first person to realize this. Many folks have probably felt the same way when they view their own profiles. I believe my greater frustration is directly related to the insights I’ve gained from my time at the front lines of drug discovery and human genetics. With this special vantage point, I’m now not sure that the architecture of the genome will ever provide guidance for treatment of most diseases (Mendelian genetic disease and oncology aside).

Finding a drug that provides a therapeutic benefit at doses that are much lower than those that cause toxic side effects is probably one of the most challenging jobs on earth. For many years, drug researchers have all been hopeful that the genome would reveal its secrets for some of the tough diseases like MS, and that they’d find drug targets that allowed them to develop precise, targeted, “heat-seeking missile” type therapies, even if for just a subset of all patients. Or even better, that the data would allow pre-identification of folks likely to have disease so that pre-treatment might be possible before irreversible damage occurred.

When I looked at my own genome with the latest in genetic meta-analysis data, I realized I might have entertained the conclusion that I had MS. Now obviously, it wouldn’t be “healthy” for anyone to be on immunosuppressive therapies for 20 years if such treatment was not necessary. In my instance this would have proven to be the case. So we are left with a key question: How much more data (and what kinds of data) would we need to collect to better differentiate the genomes like mine, which (so far) have proved unaffected, while there are similar risk factors in genomes like that of my cousin, who developed the disease at 28 years old?

My fear is that for most complex diseases there are not enough patients on earth (in extant generations) to differentiate fully between individuals who will develop disease and those who will not. In fact, current research suggests that we’ve now sampled enough of the complex genetic-disease patient population to be able to definitively rule out the possibility for many diseases. Moreover, the data suggests that while we may be able to eventually describe all the alleles that confer risk to disease, we will never be able to pinpoint for most patients, even related patients, the precise set of variants that gave them their disease. Or to quote a Boston sage: “we will never be able to differentiate casual from causal” at the level of the patient.
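To give a feel for why patient numbers become the bottleneck, here is a back-of-the-envelope Python sketch. It uses the textbook normal-approximation formula for comparing two proportions, and it assumes equal numbers of cases and controls, a simple carrier/non-carrier model, the conventional genome-wide significance threshold of 5e-8, and 80 percent power. All of these are assumptions chosen for illustration, as are the function names and the example odds ratios; this is not the analysis any consortium ran.

```python
from math import ceil
from statistics import NormalDist


def case_frequency(control_freq: float, odds_ratio: float) -> float:
    """Carrier frequency in cases implied by a control frequency and an odds ratio."""
    case_odds = odds_ratio * control_freq / (1.0 - control_freq)
    return case_odds / (1.0 + case_odds)


def cases_needed(control_freq: float, odds_ratio: float,
                 alpha: float = 5e-8, power: float = 0.8) -> int:
    """Approximate number of cases (with as many controls) needed to detect
    the frequency difference with a two-sided test at level alpha."""
    p0 = control_freq
    p1 = case_frequency(p0, odds_ratio)
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = standard_normal.inv_cdf(power)
    variance = p0 * (1.0 - p0) + p1 * (1.0 - p1)
    return ceil((z_alpha + z_power) ** 2 * variance / (p1 - p0) ** 2)


# Hypothetical odds ratios for a variant carried by 20 percent of controls.
for example_or in (8.0 / 3.0, 1.5, 1.2, 1.1):
    print(f"odds ratio {example_or:.2f}: ~{cases_needed(0.20, example_or)} cases")
```

The required number grows roughly with the inverse square of the frequency difference, so variants with modest effects demand far more patients than a DR2-sized effect does.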

In some ways this is easiest to explain using that original example I gave of DR2. Forty percent of MS patients have a DR2 allele; 20 percent of unaffected individuals have this allele. Clearly DR2 variants increase your risk for disease. However, it is entirely possible that while DR2 is involved in driving MS disease for some DR2-positive patients, it may actually play no role in others (similar to the 20 percent of the unaffected population who are DR2 positive). It could simply be a case of “true, true, and unrelated.”
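The same point can be made with Bayes' rule. The sketch below uses the 40 and 20 percent figures from the text; the disease prevalence of 1 in 1,000 is a placeholder assumed purely for illustration, not a figure from the study, and the function name is hypothetical.

```python
def prob_disease_given_carrier(freq_in_patients: float,
                               freq_in_unaffected: float,
                               prevalence: float) -> float:
    """Bayes' rule: probability that a carrier of the variant has the disease."""
    carriers_with_disease = freq_in_patients * prevalence
    carriers_without_disease = freq_in_unaffected * (1.0 - prevalence)
    return carriers_with_disease / (carriers_with_disease + carriers_without_disease)


# ASSUMPTION: illustrative prevalence only, not taken from the article.
ASSUMED_PREVALENCE = 0.001

risk_if_dr2 = prob_disease_given_carrier(0.40, 0.20, ASSUMED_PREVALENCE)
print(f"Baseline risk:         {ASSUMED_PREVALENCE:.4f}")
print(f"Risk for DR2 carriers: {risk_if_dr2:.4f}")
```

Under that assumed prevalence, a DR2 carrier's risk roughly doubles yet stays near 0.2 percent, so the overwhelming majority of carriers remain unaffected.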

To be clear, I would love to be wrong about this. And I hope that the response to this article is that statistical geneticists take up arms to destroy my hypothesis that a futility analysis is likely to be positive for most complex genetic diseases.

But in the near term, I’m banking on the continued determination of my colleagues in drug discovery, who work every day to try to improve the efficacy and lessen the side effects of their candidate drugs, be it by clever delivery, thoughtful structural drug design, or thorough preclinical and clinical assessments. These folks are some of the hardest working people I know, and the natural architecture of the genome is not making their jobs any easier. I’m also banking on the ingenuity of my colleagues in genomics and proteomics as they collaborate with industry to find a path to bring the genome to bear on drug development insomuch as it is possible today.


17 responses to “Angry at the Genome”

  1. Lou Kolb says:

    Congrats on the article !!! Proud of the Paducah Girl. Great work Emily.

    Lou Kolb

  2. mark says:

    Have you dismissed epigenetics? It’s possible that the difference between your profile and that of your cousin is in what’s being expressed and what’s being suppressed.

  3. Emily what a delightful article!


  4. Richard Gayle says:

    I think Emily brings up an important topic – the increasing complexities we face in dealing with human disease make it very likely we simply will not find solutions to many of them. Drug development has found solutions for the simplest defects in the system. A similar approach with the remaining ones may not succeed, as Emily so brilliantly describes, no matter how dedicated people are.

    We may have come close to the end of medicine’s ability to make ill people healthy. There will be more successes but they will cost exponentially more than they used to and may provide only marginal increases in health.

    Medicine in the next century may be more focussed on keeping healthy people healthy. This seems to be a ripe area for not only real impact on individuals but on the industry as well. Ubiquitous mobile devices, labs on a chip, bench top sequencers and huge databases from healthy people may change the paradigms that medicine and pharmaceutical development operate under.

  5. DML says:

    Perhaps we need to focus more on understanding the biochemical processes, and less on teasing out statistical correlations.

  6. Nox says:

    This is exactly why I made a deliberate decision to never work on my own disease when I went into biology. There is no objective reason, really, to think that we have come to the limits of what science can illuminate. But when you get mired down in the trenches, when you are dealing with the details of one promising line of inquiry, it’s hard to see the whole picture. When you–or at least I–work on something I have such an emotional investment in, it is harder to be honest with yourself, as honest as you MUST be to make progress in science. The temptation to cling to the hope of a new treatment/understanding is seductive in the lab, even among people who are not close to the problem. I knew I would not deal well with facing this every day, any possible advance tied both to my career and my family’s health.

    I choose to work on diseases other people have. I can try my best to help them without resting all my hopes on the outcome of any experiment that I perform and interpret. As someone in biology I have the technical knowledge to keep abreast of developments in my own disease area and the luxury of allowing myself to hope for them without compromising my lab work.

    What I am trying, awkwardly, to say here is: I think there is still a lot of room for progress to be made. It seems exciting to work on something where you could directly benefit the people you love but for some people, like myself and possibly you, it is better to work on a project where the results will not hit home quite so hard.

  7. Emily Walsh says:

    Great to see all the conversation here!

    As I mentioned in the article, I think we can do the math now with the extant data from whole genome studies to tell us how large the patient population will need to be to draw conclusions about what the genome alone will yield for any single complex genetic disease. And my instinct tells me that we will find that this number is exceedingly large for many diseases.

    The effort of course only gets more complicated for epigenetics and proteomics. With genetics we have the benefit that for the most part (in a patient with a non-oncologic disorder) there is believed to be little difference in the sequence of the genome between cells and over time. So genotyping one cell is representative of the genetic state as a whole.

    However, for epigenetics and proteomics, we don’t have that luxury – timing and tissue likely matter. And as we don’t understand the cellular/biochemical initiation process for most diseases, we are setting out on an even more complicated task. Not to mention it may be hard to justify sampling some of those disease-central tissues in an otherwise healthy but at-risk patient…

    I think we can agree that with unlimited funding and time, we would want to know everything about what causes human complex genetic disease. The question is, given our current economic environment with extremely limited funds, where do we invest in order to have the greatest impact on human health?

  8. The following comment is from Krassen Dimitrov:

    While your genome is an important determinant for your health status, other factors – environmental, lifestyle, even pure stochastics – are overlaid on top of it, so that in most cases the genotype only provides probabilistic information on susceptibilities.

    Phenotypic markers on the other hand – such as gene expression signatures, for example – integrate the genetic background signal with those environmental factors, to provide an immediate and relevant representation of a physiological state.

    This has been a bit hard for the genomic-centric types to grasp, but clinicians have intuitively figured it out. As an example, you may know that the FDA has included pharmacogenomic guidance on the warfarin label, yet SNP-typing for warfarin has been barely prescribed. Why? Cabbage, avocado, onions, mango in your diet have more influence on your warfarin/vitK metabolism than your relevant

    There is a lot of hype now surrounding the vision of WGS as a commonly adopted diagnostic test in regular clinical practice. But let me ask you this: if Mayo Clinic cardiologists are reluctant to prescribe simple SNP-typing for warfarin, what are the chances that in the next 10 years or so, when you go for your physical, your GP will say, ‘I think, I’ll send you for whole-genome sequence’? I’d say not very good…

    RNA tests on the other hand, such as OncotypeDx, are different: much more deterministic.

    Finally, on a lighter note, here’s my fave visual for the phenotype vs. genotype dichotomy, the legendary Otto and Ewald:

  9. Vinit Nijhawan says:

    The commentary on the twins brings to mind the ancient Hindu concept of Karma and Dharma. Karma are the cards dealt to you (ie your genetic inheritance) and Dharma is what you do with these cards (gene expression?).

  10. Karsten says:

    Nicely put. I fully agree – it is clear that for most complex diseases a much larger part is heritable than what we can explain from genetic association.

    Brendan Maher suggested a number of explanations in his 2008 Nature News Feature titled “Personal genomes: The case of the missing heritability”.

    While it is frustrating that we cannot explain disease based on genotype alone, this problem also defines a very clear challenge for our future research. The paper of Maher suggests potential avenues.

    Personally, I believe that non-linear interactions between genetic variants play an important role. Evolution pushes systems to adopt stable states to be robust against external fluctuations, and to switch from one state to another if a certain threshold is passed.

    What we need is a functional and systems-wide understanding of gene-environment interactions. In order to achieve this goal two things are required:

    – Knowledge of the functional genetic variants, which genome-wide association studies deliver(ed), and

    – comprehensive population-based studies that determine many relevant environmental factors together with the genetic variants, intermediate phenotypes (ie transcriptomics, proteomics, and metabolomics), and disease outcomes, ideally under a series of challenging conditions. Studies of this kind are presently emerging.

    Thanks again for the nice article … Karsten.

  11. Emily Walsh says:

    Karsten – great reference. Here are another few in that vein that I used during the writing:

    Most complete meta-analysis for autoimmune disorders: “Pervasive sharing of genetic effects in autoimmune disease”. Cotsapas C, et al. PLoS Genet. 2011 Aug;7(8):e1002254.

    Missing Heritability: “Estimating missing heritability for disease from genome-wide association studies.” Lee SH, Wray NR, Goddard ME, Visscher PM. Am J Hum Genet. 2011 Mar 11;88(3):294-305.

  12. Emily Walsh says:

    Karsten – The only thing I would add is that I do think that at some point we will have a much better understanding of the inherited component of complex diseases; however, there will always be that pesky uninherited component that will further obscure a risky-but-ultimately-unaffected genome from another. Folks have other ideas on how we may be able to identify those factors (gut biome, etc.).

    But one strong possibility is that this will be a project that takes multiple generations depending on the incidence/prevalence of “architecture” involved. And moreover, at any one time during that project we may observe patients who have the disease in question for a singular reason. Thus finding a properly matched control for a therapeutic intervention study for that person may be impossible.

    That of course leads to the challenges inherent in a single-person clinical trial, but perhaps that is an entirely different article!

  13. Chad says:

    While I understand the frustration, I do not agree with the conclusion that we have reached the limit of our ability to understand. Your article basically assumes a purely additive effect of mainly large-effect genes. But this discounts many other possibilities. The role of epigenetics in human disease has been woefully understudied for years. It could also be the case that a model of strict additivity could be wrong, as argued in a recent PNAS paper from your old mentor’s lab:

    It’s only been slightly over a decade since the human genome was sequenced. It has only been more recently that the tools necessary for large-scale genotyping and sequencing have become available. I think it’s jumping the gun to call it all a failure when the technologies have not fully matured. It’s like asking a 5-year-old why they haven’t got a job yet. Next-gen sequencing technologies have only just now begun to enter the clinical setting with some success:

    So let’s not be too quick to jump the gun and call it all a failure.

  14. Ashkan says:

    Great article Emily!