This month, 23andMe published a new article in PLoS One entitled “Cryptic Distant Relatives are Common in both Isolated and Cosmopolitan Samples.” Authored by former 23andMe scientists Brenna Henn and Lawrence Hon, along with current 23andMe scientists and 23andMe advisor Itsik Pe’er of Columbia University, the article describes the nuts and bolts of the method behind our innovative Relative Finder tool. In this post, Brenna discusses what the findings mean for our notion of “relatives” and explores the many special features of the 23andMe customer database.
By Brenna Henn
You probably know how many siblings and 1st cousins you have. But do you know how many 2nd cousins you have? Does 38 sound reasonable? What about the number of 3rd, 4th, or 5th cousins? The answer may surprise you. Under a simple model where each family has 2-3 children, you would have 190 third cousins, 940 fourth cousins and a whopping 4,700 fifth cousins. Going back to 8th cousins (that is, 9 generations back to a common ancestor), the model we developed predicts you would have over half a million 8th cousins!
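For readers who like to check the arithmetic, here is a minimal sketch of one simple counting model that reproduces the rounded figures above. It assumes every couple has exactly 2.5 children (the midpoint of "2-3") and that no ancestors are shared between family lines. The function name and this particular formula are our illustration and may differ in detail from the model in the paper.

```python
def expected_cousins(degree: int, children_per_couple: float = 2.5) -> float:
    """Expected number of cousins of a given degree (1 = 1st cousins).

    Illustrative assumptions: every couple has the same number of
    children, and no two of your ancestral lines share an ancestor.
    """
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    # Ancestral couples (degree + 1) generations back: 2 ** degree.
    ancestral_couples = 2 ** degree
    # Each couple had one child who is your ancestor; the rest are
    # that ancestor's siblings.
    siblings_of_ancestor = children_per_couple - 1
    # Each sibling's line grows by a factor of children_per_couple for
    # `degree` generations to reach your own generation.
    descendants_per_sibling = children_per_couple ** degree
    return ancestral_couples * siblings_of_ancestor * descendants_per_sibling


for degree in (2, 3, 4, 5, 8):
    print(f"degree {degree}: {expected_cousins(degree):,.1f} cousins")
```

With 2.5 children per couple this gives 37.5, 187.5, 937.5 and 4,687.5 for 2nd through 5th cousins, and roughly 586,000 for 8th cousins, in line with the rounded numbers quoted above.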
Now, this model is a simplification, because some families have a dozen children and some have none, but it does help illustrate just how many potential distant cousins are out there. Finding a relative in the 23andMe database, however, rests on two other important conditions. First, the relative has to be a 23andMe customer. Second, you and the relative need to share a long identical segment of DNA. The more distant a pair of cousins, the less identical DNA they share, and very small fragments of DNA can be hard to detect with our algorithm. So although you might have thousands of 6th cousins, we think we can only detect about 4% of them with 23andMe’s current technology. Our detection success is much higher for more closely related cousins, though: we can detect about 46% of 4th cousin pairs and 90% of 3rd cousin pairs.
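To see how these two conditions combine, here is a rough sketch that multiplies an expected cousin count by the chance that a cousin is in the database and by the detection rate for that degree. The detection rates are the ones quoted above. The cousin-count formula, the fixed 2.5 children per couple, the `fraction_in_database` parameter, and the assumption that the two conditions are independent are all ours, chosen for illustration only.

```python
# Detection rates quoted in the post, keyed by cousin degree.
DETECTION_RATE = {3: 0.90, 4: 0.46, 6: 0.04}


def expected_cousins(degree: int, children_per_couple: float = 2.5) -> float:
    """Illustrative cousin count: 2**degree ancestral couples, each with
    (children - 1) siblings of your ancestor, each sibling's line growing
    by a factor of `children_per_couple` for `degree` generations."""
    return (2 ** degree) * (children_per_couple - 1) * children_per_couple ** degree


def expected_matches(degree: int, fraction_in_database: float) -> float:
    """Expected number of cousins of this degree that could show up as matches.

    `fraction_in_database` is a hypothetical input: the chance that any
    given cousin is a customer. It is treated as independent of detection.
    """
    if not 0.0 <= fraction_in_database <= 1.0:
        raise ValueError("fraction_in_database must be between 0 and 1")
    return expected_cousins(degree) * fraction_in_database * DETECTION_RATE[degree]


# Upper bound: pretend every single cousin were a customer.
for degree in sorted(DETECTION_RATE):
    print(f"degree {degree}: about {expected_matches(degree, 1.0):,.0f} detectable")
```

Even in the best case where every cousin is a customer, the low detection rate for 6th cousins cuts tens of thousands of potential relatives down to under a thousand detectable ones in this sketch.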
Our ability to detect a person’s distant cousins is also influenced by the ancestry of the individual. If you are Ashkenazi Jewish, you may have noticed that 23andMe’s Relative Finder feature shows you over a thousand cousins. This is because Ashkenazi Jews are more closely related to each other than individuals in a random sample of European-Americans are. Over the past several hundred years, a cultural tendency to choose marriage partners of the same ethnicity (also known as endogamy) has meant that Ashkenazi individuals are more likely to share the same ancestors. In fact, we estimate that any two randomly chosen individuals who identify as Ashkenazi are on average the genomic equivalent of 4th-5th cousins, because they share many recent common ancestors.
This phenomenon doesn’t just occur in the Ashkenazim. We looked at 121 populations, many from the 23andMe customer database. Pairs of individuals in Iceland, Finland or South Africa are more closely related than pairs of individuals from Italy or Japan.
In our analyses, we also looked at DNA data for over 5,000 individuals with exclusively European ancestry in the 23andMe database. The DNA indicated that in this sample there are over 5,000 3rd cousin pairs and 30,000 4th cousin pairs. This result is important beyond thinking about genealogy. It means that large disease association studies that sample thousands of individuals will have many pairs of distant cousins in the dataset. By identifying cryptic (or non-obvious) relatives in databases, researchers can see if certain disease mutations occur more often in particular extended families.
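To put those pair counts in perspective, here is a back-of-the-envelope calculation. It treats the sample as exactly 5,000 individuals with exactly 5,000 3rd cousin pairs and 30,000 4th cousin pairs. The post gives these as "over" figures, so the rounded inputs are our assumption and the results are only approximate.

```python
from math import comb

# Rounded inputs (assumptions; the post says "over 5,000" for the first two).
SAMPLE_SIZE = 5_000
COUSIN_PAIRS = {"3rd": 5_000, "4th": 30_000}

# Every distinct pair of individuals in the sample: n * (n - 1) / 2.
total_pairs = comb(SAMPLE_SIZE, 2)


def share_of_pairs(cousin_pairs: int) -> float:
    """Fraction of all possible pairs that are cousins of this degree."""
    return cousin_pairs / total_pairs


def cousins_per_person(cousin_pairs: int) -> float:
    """Average number of such cousins per individual (each pair counts twice)."""
    return 2 * cousin_pairs / SAMPLE_SIZE


print(f"total pairs: {total_pairs:,}")
for label, pairs in COUSIN_PAIRS.items():
    print(f"{label} cousins: {share_of_pairs(pairs):.2%} of pairs, "
          f"{cousins_per_person(pairs):.0f} per person on average")
```

Under these rounded inputs, only a small fraction of a percent of all possible pairs are 3rd or 4th cousins, yet the average individual in the sample has about 2 third cousins and 12 fourth cousins in the same dataset.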