There are red states and blue states in the United States, but there are also leafy green states and red meat states, according to 23andMe survey data.
We’ve previously written about how traits vary by state, but one of our researchers, Emma Pierson, decided to look a little deeper at those differences.
Emma checked 23andMe’s anonymous survey data against another totally unrelated dataset – the state-by-state patterns of billions of Google searches that anyone can tap into using a tool called Google Correlate. What she found was both fascinating and enlightening, and it offers another bit of evidence for the reliability of the data customers share with 23andMe researchers.
“On a whim I put in 23andMe survey responses to ‘How often do you eat leafy greens?’ and was amazed to find that, out of the billions of possible Google searches, one of the strongest associations was with ‘raw kale,’” Emma said.
Before you shrug and say “Yeah, so?” think about this for a moment: In these totally separate datasets one can see some of the same patterns. For example, states with high fractions of morning people also tended to have many people Googling “coffee.” Emma also found that states with high fractions of 23andMe customers with coronary artery disease tended to have high numbers of Google searches for statin drugs.
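For the curious, here is a rough sketch of what comparing two state-by-state datasets can look like. It assumes a plain Pearson correlation, which may not be exactly what Emma or Google Correlate computes. The state labels and numbers are made-up placeholders, not 23andMe or Google data, and the names `pearson`, `survey_fraction` and `search_index` are ours for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values only (NOT real survey or search data).
# Hypothetical fraction of survey respondents per state giving some answer.
survey_fraction = {"state_A": 0.30, "state_B": 0.45, "state_C": 0.25, "state_D": 0.50}
# Hypothetical relative search frequency per state for some search term.
search_index = {"state_A": 1.1, "state_B": 1.9, "state_C": 0.8, "state_D": 2.2}

# Compare only the states present in both datasets, in a fixed order.
states = sorted(survey_fraction.keys() & search_index.keys())
r = pearson(
    [survey_fraction[s] for s in states],
    [search_index[s] for s in states],
)
print(f"correlation across {len(states)} states: r = {r:.2f}")
```

A value of `r` near 1 means the two datasets rise and fall together from state to state, a value near 0 means no linear relationship, and a value near -1 means they move in opposite directions.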
“There’s a larger point here,” Emma said. “You can have more faith in your data if you find the same patterns in very different datasets.”
And what of 23andMe’s data?
We ask customers all sorts of questions, from the one about eating your vegetables to whether your hair is curly to whether you can do a cartwheel. We ask about your health history and drug prescriptions. The answers to those questions and dozens of others like them, combined with genetic information, give 23andMe researchers tremendous insight into the biology behind human traits and health conditions. In turn, the information can also improve the quality of what 23andMe reports back to customers, offering them another way to engage in their own exploration of genetics.
But this isn’t the traditional way of collecting data for research, so we are often asked: Can scientists really rely on a bunch of people answering a few questions online to inform important genetics research?
Yeah, they can.
23andMe has explored this much more rigorously by comparing the results of genetic analysis based on our online surveys to previous genetic analysis based on data collected through more conventional techniques. So what Emma did was simply put 23andMe’s data to another kind of test.
Given the billions of Google searches and the thousands of 23andMe survey responses, the fact that Emma found these kinds of shared patterns gives us more confidence in the data.
“This is especially important today, when we use big data to map out human lives as never before,” Emma said. “To paraphrase Thoreau, different datasets are merely different perspectives on the same truth: only by combining them can we form the complete picture.”