(Editor’s note: Here is a link to Eoghan’s poster presented at ASHG.)
People often use the terms “genotyping” and “sequencing” interchangeably, but they are quite different ways of approaching genetic data.
One large difference is the amount of data generated. Genotyping, which is what 23andMe does, tells you your genotype at a small subset of locations in your DNA; full-genome sequencing tells you your genotype at every position. Moving from genotyping to sequencing is like swapping a picture of your DNA at one pixel per square inch for a picture at 3,000 pixels per square inch. In short, sequencing means a LOT more information.
As the price of sequencing continues to plummet, we foresee a time when sequencing will be affordable. 23andMe has long lived in the world of genotyping, but anticipating this not-so-distant future, we are exploring how to move into this new world of sequencing. Making this move will require overcoming the many new challenges of dealing with such large amounts of data. Eoghan Harrington, a computational biologist at 23andMe, will be presenting a poster at the American Society of Human Genetics (ASHG) conference next month on how 23andMe has begun to address some of these challenges in its first sequencing pilot project.
We must walk before we run. Rather than starting with full-genome sequencing, our pilot sequencing study was limited to just the exome portion of DNA, the part of the genome that is translated into proteins. The exome contains about 50 million base pairs, making it a good intermediate size between genotyping about 1 million base pairs and full-genome sequencing at about 3 billion base pairs.
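To put those three sizes side by side, here is a quick back-of-the-envelope calculation. It uses only the rounded figures quoted above (about 1 million, 50 million, and 3 billion); the variable names are ours, chosen for illustration.

```python
# Approximate sizes quoted in the post (rounded figures, not exact counts).
GENOTYPING_SITES = 1_000_000      # positions read by genotyping
EXOME_BP = 50_000_000             # base pairs in the exome
GENOME_BP = 3_000_000_000         # base pairs in the full genome

# How many times larger each step is than the one before it.
exome_vs_genotyping = EXOME_BP // GENOTYPING_SITES    # 50
genome_vs_exome = GENOME_BP // EXOME_BP               # 60
genome_vs_genotyping = GENOME_BP // GENOTYPING_SITES  # 3000

print(f"Exome vs. genotyping:  {exome_vs_genotyping}x")
print(f"Genome vs. exome:      {genome_vs_exome}x")
print(f"Genome vs. genotyping: {genome_vs_genotyping}x")
```

The last ratio is where the 3,000-pixel comparison comes from, and the first two show why the exome makes a reasonable halfway step.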
Many of the challenges associated with full-genome sequencing still needed to be addressed in this exome sequencing study: storing and delivering large (6 GB!) files, making sure the data presented was accurate and trustworthy, and giving at least some context to the data returned.
Eoghan and his coworkers were successful in overcoming these challenges. Because all the customers who participated in the sequencing pilot were already genotyped by 23andMe, we could check how often the two methods – genotyping and sequencing – reported the same genotype; the concordance between the two methods was 99.6-99.9%. Perhaps more importantly, the pilot project was successful in the eyes of the participants, many of whom have an extensive background in genetics and gave helpful and positive feedback. These successes will be highlighted in Eoghan’s poster next month at the conference.
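For readers curious what a concordance check looks like in practice, here is a minimal sketch. It is not the pipeline used in the pilot; the function name, the marker IDs, and the toy genotypes are all made up for illustration. The idea is simply to look at positions reported by both methods and count how often the genotypes agree, ignoring allele order.

```python
def concordance(array_calls, seq_calls):
    """Fraction of shared sites where both methods report the same genotype.

    Both arguments map a site ID to a two-letter genotype string.
    Allele order is ignored, so "AG" and "GA" count as a match.
    """
    # Only sites reported by both methods can be compared.
    shared = [site for site in array_calls if site in seq_calls]
    if not shared:
        raise ValueError("no sites in common")
    matches = sum(
        1 for site in shared
        if sorted(array_calls[site]) == sorted(seq_calls[site])
    )
    return matches / len(shared)


# Hypothetical toy data: three sites are shared, and two of them agree.
array_calls = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT", "rs0004": "AG"}
seq_calls = {"rs0001": "GA", "rs0002": "CC", "rs0003": "CT", "rs0005": "GG"}

rate = concordance(array_calls, seq_calls)
print(f"Concordance: {rate:.1%}")
```

A real comparison would also have to handle missing calls, strand orientation, and quality filtering, all of which this sketch leaves out.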
23andMe has taken a look into the world of sequencing and has found it to be an inviting place. We will take what we’ve learned from this exome experience to prepare for the day when full-genome sequencing will be an affordable possibility for everyone.