Science in the Suburbs, Part II: More from the Personal Genomes Meeting at Cold Spring Harbor

As talks began Saturday at Cold Spring Harbor’s first “Personal Genomes” conference, the first half of which I blogged on here, several leading explorers of the strange new world of “structural variation” in the human genome, such as Evan Eichler and Mike Snyder, shared some of their latest findings.

You can observe the most common kind of structural variation by lining up two corresponding chromosomes; if you notice that one of them is missing a stretch of DNA letters that is present in the other, that’s a deletion (or an insertion from the other chromosome’s viewpoint). Geneticists call these types of variations insertion/deletions, or indels for short.
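For readers who think in code, the "line them up and look for the gap" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are made up, they are assumed to be already aligned with `-` marking missing letters, and `find_indels` is a hypothetical helper, not a function from any real genomics tool.

```python
def find_indels(aligned_a, aligned_b, gap="-"):
    """Return (position, kind, length) for each gap run in a pairwise alignment.

    kind is "deletion" when sequence A lacks letters that B has, and
    "insertion" when A carries letters that B lacks (A's point of view).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    indels = []
    run_kind, run_start = None, 0
    for i, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        if a == gap and b != gap:
            kind = "deletion"
        elif b == gap and a != gap:
            kind = "insertion"
        else:
            kind = None
        # A change of kind closes the current run (if any) and opens a new one.
        if kind != run_kind:
            if run_kind is not None:
                indels.append((run_start, run_kind, i - run_start))
            run_kind, run_start = kind, i
    if run_kind is not None:
        indels.append((run_start, run_kind, len(aligned_a) - run_start))
    return indels


# Two made-up, pre-aligned stretches from "corresponding chromosomes".
chrom_a = "ACGT---TTGACCA"
chrom_b = "ACGTGGATTGA--A"
print(find_indels(chrom_a, chrom_b))
# [(4, 'deletion', 3), (11, 'insertion', 2)]
```

Swapping the two arguments flips every label, which is the "insertion from the other chromosome's viewpoint" point made above. Real indel callers also have to produce the alignment in the first place, which is the hard part and is not shown here.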

The main way these variations seem to happen is during the production of sperm and eggs, when an individual’s homologous chromosomes recombine with one another. Usually, chromosome pairs line up perfectly for recombination, but when they don’t, you end up with chromosomes that have a little more or a little less DNA than the originals.

Problems can arise when the stretch of DNA lost in the shuffle was part of an essential gene. Recent work has shown that one form of autism is caused by a deletion, and the same is true for forms of schizophrenia and cystic fibrosis.

Saturday’s talks covered both basic and clinically motivated research. Eichler, of the University of Washington, and Snyder, of Yale, are among those making detailed maps of structural variation in the human genome. A consensus is emerging that there is a *lot* of it. Snyder showed that, at least in the handful of individual genomes he’s looked at so far, any two individuals differ by more than one thousand indels of at least 3,000 DNA letters each. Eichler made a similar observation when comparing a sample individual’s sequence with the reference human genome, and commented that this means there are at least 3 million DNA letters (3,000 times 1,000) in the sample individual’s sequence that are not present in the reference genome. That’s on top of the 3 million one-letter SNP variants between any two humans. So it looks like genetic variation is the rule and not the exception, and the very notion of a “reference” sequence may need to stretch a bit in order to stay consistent with the data.
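The back-of-the-envelope arithmetic is easy to check. The sketch below uses only the figures quoted in the talks; the variable names are mine, and adding the two numbers together is a rough illustration rather than a figure anyone presented, since the indel count is a lower bound and the two comparisons were made against different baselines.

```python
# Figures quoted in the talks (lower bounds for the indels).
indel_count = 1_000        # "more than one thousand indels"
min_indel_length = 3_000   # each "at least 3,000 DNA letters"
snp_letters = 3_000_000    # one-letter SNP differences between two people

# Minimum number of letters accounted for by large indels alone.
indel_letters = indel_count * min_indel_length
print(f"{indel_letters:,}")   # 3,000,000

# Rough combined floor on differing letters (illustrative only).
total_letters = indel_letters + snp_letters
print(f"{total_letters:,}")   # 6,000,000
```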

Derek Chiang, of the Broad Institute of MIT/Harvard, gave a fascinating talk about progress in understanding how structural variation can cause medical problems. He has developed sophisticated software to find structural variants associated with cancerous tissue based on microarray data, and his efforts have already yielded a new oncogene for lung cancer. Chiang also described a clever algorithm for finding structural variants from next-generation sequence data that appears to work quite well compared with methods based on microarray data. Like the work Elaine Mardis presented Friday, Chiang’s talk suggests a future in which cancerous tumors could easily be distinguished from normal tissue using genetic scans.

Understanding what these differences *do* is another, very difficult question entirely. And developing a therapy tailored to a specific change is yet another challenge.

Sunday’s talks took us into the world of *next next* generation sequencing, as though the prospect of plain old next generation sequencing weren’t already shaking up the field enough. Steve Turner gave a talk on Pacific Biosciences’ sequencing technology that kept the audience rapt. The technology is ingenious, and solves a number of problems that have bedeviled sequencing since its inception. I will punt on explaining the basics of the technology, since it’s so well-explained on their website.

Turner offered an update that doesn’t appear on the website yet. The technology depends critically on an enzyme called DNA polymerase, which is responsible for copying DNA. Using a technique called experimental evolution, their team has modified the DNA polymerase used in the machine from the version that exists naturally in humans. They generated a collection of mutant polymerases, each differing from the original at random. Then they tried each variant in their machine, retained the ones that did best, and repeated the process. It’s fair to say that horticulture and animal husbandry are slower, less direct forms of experimental evolution. After several generations, they ended up with a polymerase that was better suited to the environment of their machine than natural human polymerase is.
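The mutate, test, keep-the-best, repeat loop is simple enough to caricature in code. What follows is a toy sketch, not Pacific Biosciences’ actual procedure: the sequences are invented, and the `performance` function is a hypothetical stand-in for "try the variant in the machine and see how it does." In the lab there is no known ideal sequence to compare against; performance has to be measured experimentally.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 amino acids


def mutate(seq, rng, rate=0.05):
    """Copy seq, replacing each letter with a random one with probability rate."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq)


def performance(seq, ideal):
    """Hypothetical score: number of positions matching a made-up 'ideal'."""
    return sum(a == b for a, b in zip(seq, ideal))


def evolve(start, ideal, rng, generations=30, pool_size=50, keep=5):
    """Repeat: make random variants of the survivors, then keep the best few."""
    survivors = [start]
    history = []
    for _ in range(generations):
        pool = list(survivors)  # survivors compete again, so the best never gets worse
        while len(pool) < pool_size:
            pool.append(mutate(rng.choice(survivors), rng))
        pool.sort(key=lambda s: performance(s, ideal), reverse=True)
        survivors = pool[:keep]
        history.append(performance(survivors[0], ideal))
    return survivors[0], history


rng = random.Random(0)            # fixed seed so the run is repeatable
start = "MKHMPRKMYSCDFETTTKVE"    # invented "natural" enzyme fragment
ideal = "MKHMPRKMYSAAFETTAKVE"    # invented best-for-the-machine fragment
best, history = evolve(start, ideal, rng)
print(performance(start, ideal), "->", performance(best, ideal))
```

Because each round’s survivors are carried into the next pool, the best score can only hold steady or improve from one generation to the next, which is the selective ratchet the paragraph above describes.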

The meeting ended Sunday with a witty and thoughtful summary of the proceedings from Maynard Olson of the University of Washington, an eminent geneticist and one of the architects of the Human Genome Project. Olson suggested that true personalized medicine could be a long time coming, and expressed skepticism that it would arrive at all. One generalization from the talks, he said, was that the genetic mutations that affect health increasingly appear to be rare, and that many diseases may be caused in a large number of ways. The difficulty of working out what each variant means could make it hard to devise therapies for so many different possibilities. He predicted a near-term future of unpredictability, especially in guessing which sequencing technologies will come to be adopted by the genetics community. He closed by reciting the lyrics to Bob Dylan’s “The Times They Are A-Changin’,” which seemed fitting to me.

Photo: Henryk Kotowski


  • Debbie

    My 23andme results show indels (approx 10 times as many inserts as deletes) on about 200 of my genes, ranging from severe indels to minor.
    The mismatch repair (MMR) genes, which fix mismatches and thus avoid the frameshifts, etc. that cause indels, are the worst indel-mutated genes of all, running at rates of MSH2 38%, MLH1 28%, MSH6 26%, and PMS2 7%, respectively. And my other indel-mutated genes completely cross-correlate to the symptoms of my as yet undiagnosed illness.
    For example, the MSH2 gene, for which 23andme reads 373 base pairs, has 142 indels (131 inserts (II), 10 deletes (DD), and 1 insert/delete (ID)), plus 3 no calls (–), making it 38% mutated.
    I am assuming this is not normal and that these genes are mutated or ?damaged?
    I would really appreciate comment on the above interpretation by someone who is familiar with 23andme data as I cannot seem to get a solid answer from professionals who are not familiar with 23andme.
    Thank you in advance.
    Debbie

  • Debbie

    To add to my comment: Debbie on 02 May 2012 at 12:38 am

    I forgot to mention that for most of the severe indel genes/dna the base pairs are almost always the same. Using MSH2 again as an example, it has 228 base pairs that are not indels (142 are); all the base pairs are the same AA, GG, CC, TT, except for 1 base pair rs13425206 which is GT.
