The method behind the Relative Finder tool

This month, 23andMe published a new article in PLoS One entitled “Cryptic Distant Relatives are Common in both Isolated and Cosmopolitan Samples.” Authored by former 23andMe scientists Brenna Henn and Lawrence Hon, along with current 23andMe scientists and 23andMe advisor Itsik Pe’er of Columbia University, the article describes the nuts and bolts of the method behind our innovative Relative Finder tool. In this post, Brenna discusses what the findings mean for our notion of “relatives” and explores the many special features of the 23andMe customer database.

By Brenna Henn

You probably know how many siblings you have or the number of first cousins. But do you know how many 2nd cousins you have? Does 38 sound reasonable? What about the number of 3rd, 4th, or 5th cousins?

The answer may surprise you. Under a simple model where a family has 2-3 children, you would have 190 third cousins, 940 fourth cousins and a whopping 4,700 fifth cousins.

Going back to 8th cousins (or 9 generations back to a common ancestor), the model we developed predicts you would have over half a million 8th cousins!


Now, this model is a simplification because some families have a dozen children and some have none; but this model does help illustrate just how many potential distant cousins are out there.
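
The cousin counts quoted above can be reproduced with a short calculation. The sketch below is one reading of the "simple model", not necessarily the formula used in the paper. It assumes every family has exactly 2.5 children (the midpoint of "2-3"), that branches never intermarry, and that only full cousins are counted. The function name `expected_cousins` and its parameters are illustrative.

```python
def expected_cousins(degree: int, children_per_family: float = 2.5) -> float:
    """Expected number of cousins of a given degree (1 = first cousins).

    Assumption (not from the paper): every family has the same number of
    children and no two branches of the tree ever intermarry.
    """
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    # A k-th cousin shares an ancestral couple k + 1 generations back,
    # and you have 2**k such couples at that generation.
    ancestral_couples = 2 ** degree
    # In each couple's family, one child is your ancestor; the rest are
    # that ancestor's siblings.
    siblings_of_ancestor = children_per_family - 1
    # Each sibling's line runs k generations down to your generation.
    descendants_per_sibling = children_per_family ** degree
    return ancestral_couples * siblings_of_ancestor * descendants_per_sibling


for k in (2, 3, 4, 5, 8):
    print(k, expected_cousins(k))
```

Under these assumptions, degrees 2 through 5 give 37.5, 187.5, 937.5 and 4,687.5, which round to the 38, 190, 940 and 4,700 quoted above, and degree 8 gives roughly 586,000.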

Finding a relative in the 23andMe database, however, rests on two other important conditions. First, the relative has to be a 23andMe customer. Second, you and the relative need to share a long identical segment of DNA. The more distant a pair of cousins, the less identical DNA they share. Very small fragments of DNA can be hard to detect with our algorithm. So although you might have thousands of 6th cousins, we think we can only detect about 4% of them with 23andMe’s current technology.

Our detection success is much higher for more closely related cousins, though — we can detect about 46% of 4th cousin pairs and 90% of 3rd cousin pairs.
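
The two conditions combine multiplicatively: a cousin appears as a match only if they are a customer and the shared segment is detected. The sketch below is a rough back-of-the-envelope illustration, not the paper's method. The detection rates are the ones quoted in this post; the `customer_fraction` value is a made-up placeholder, and the calculation assumes that being a customer and being detectable are independent.

```python
# Detection rates quoted in the post, keyed by cousin degree.
DETECTION_RATE = {3: 0.90, 4: 0.46, 6: 0.04}


def expected_matches(n_cousins: float, customer_fraction: float, degree: int) -> float:
    """Expected number of cousins of a given degree that appear as matches.

    Assumption: being a customer and sharing a detectable segment are
    independent, so the two probabilities simply multiply.
    """
    if not 0.0 <= customer_fraction <= 1.0:
        raise ValueError("customer_fraction must be between 0 and 1")
    return n_cousins * customer_fraction * DETECTION_RATE[degree]


# HYPOTHETICAL input: suppose 1 in 1,000 of your 940 fourth cousins is a customer.
print(expected_matches(940, 0.001, degree=4))
```

The point of the sketch is only that the count of visible matches shrinks at each step: from all cousins, to those in the database, to those whose shared DNA is long enough to detect.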

Simulated data showing the relationship between shared identical segments of DNA (IBD-half) and # of shared segments for different degrees of relatedness in a population with European ancestry.

Our ability to detect a person’s distant cousins is also influenced by the ancestry of the individual. If you are Ashkenazi Jewish, you may have noticed that 23andMe’s Relative Finder feature shows you over a thousand cousins.

This is because Ashkenazi Jews are more closely related to each other than a random sample of European-Americans. Over the past several hundred years, a cultural tendency to choose marriage partners of the same ethnicity (also known as endogamy) means that Ashkenazi individuals are more likely to share the same ancestors. In fact, we estimate that any two randomly chosen individuals who identify as Ashkenazi are on average the genomic equivalent of 4th-5th cousins, because they share many recent common ancestors. This phenomenon doesn’t just occur in the Ashkenazim.

We looked at 121 populations, many from the 23andMe customer database. Pairs of individuals in Iceland, Finland or South Africa are more closely related than pairs of individuals from Italy or Japan.

In our analyses, we also looked at DNA data for over 5,000 individuals with just European ancestry in the 23andMe database. The DNA indicated that in this sample there are over 5,000 3rd cousin pairs and 30,000 4th cousin pairs. This result is important beyond thinking about genealogy.

It means that large disease association studies that sample thousands of individuals will have many pairs of distant cousins in the dataset. By identifying cryptic (or non-obvious) relatives in databases, researchers can see if certain disease mutations occur more often in different extended families.

  • MJ Bailey

    I’m an only child, my dad one of 3, my mom one of 4. In my main file, there are 12,842 people to whom I’m directly related.

    In a different file, 12,428; some of these may be duplicates of the above file, but I’m still working on both to extend all collateral lines as far as I can.

    MJ Bailey

  • sammi

    The article above says: “in fact, we estimate that any two randomly chosen individuals who identify as Ashkenazi are on average the genomic equivalent of 4th-5th cousins, because they share many recent common ancestors.” How recent are those recent ancestors? Recent can mean different things to different people.

  • Ponto

    I come from a Maltese family and one with 8 children. Large families were the norm in Malta up to post recent times.

    I have to say the RF algorithm does not work well for Maltese people as it appears we Maltese people are all related at the 5th to 10th cousin level. A predicted 3rd cousin turned out to be a 6th cousin which was understandable since his Maltese ancestry was back in the 19th century, and his family was Greek born for generations. I have basically learned to tune out the predictions, and just hope my RF cousins have good genealogical records to find the connection. Unfortunately, most don’t have good genealogical records.

    By the way, from genealogy, my parents were 6th cousins once removed. I am sure they would be surprised by that, though not shocked.

  • Peter Meijlink

    Dear Sir/Madam,

    Recently I discovered that my aunt had a match of 13 cM, [shared 0.17] with an American with Dutch ancestry. In her list of matches he was number 33.
    His mother has a match of …cM, [shared 0.21], she is number 22 of the list.
    After exchanging genealogical data with the American gentleman, we found out that we shared a common ancestor [Jan jansen Beelen, born 1751 in Hierden in Holland, he died in 1809].
    This ancestor is the grgrgr-grandfather=5 generations of Lydia [85 years] and the grgrgrgrgr-grandfather=7 generations of the American gentleman [43 years].

    I guess and I am afraid that one can NOT (always) use this (wonderful) result as a standard for other cases, am I right?

    Sincerely, Peter Meijlink, Holland

  • KatieR

    I’m one of 5, mother’s one of 7, father’s one of 2. I have 21 1st cousins, and although I haven’t worked out the number of 2nd cousins, I know for sure it’s well over 38! Grandparents were one of 5, 6, 6, 10. One more generation back, they almost all come from families of 10+ kids. This is why I don’t trace descendants. I’d never have the time to get around to tracing ancestors if I did, 😉

  • Diane Runyan

    I’m one of 5 children, my mother was one of four, as was my father. I have 10 first cousins on one side of my family, and don’t know yet about the other. This is a mystery as my mother was adopted at a young age.

    It is encouraging for me to see this prediction of 38 cousins, since my hope is that I find a second, third, or fourth cousin who can collaborate with me on the family history. I also hope that we can talk about some tendencies towards auto-immune diseases which seem to be popping up recently.

    Is anyone noticing any correlation between auto-immune diseases and marriages between distant cousins?

  • Michael

    Would populations that tend to be more closely related have worse health individually, or do those effects only start to show up significantly when first cousins and up are having children?

  • Barbara Balen

    My matches are 998. I would love to be Jewish and celebrate more holidays. But…
    Please discuss the Early American Colonist effect.

  • Eleanore Sullivan Kilpatrick

    I heard a few years ago that in the near future females like myself would be able to trace their father’s line somehow. Has that question been answered yet? I do not have any living male relatives on my father’s side of my family, and since I am the last living descendant on that line, there is no hope. I have traced his line back into the early 1700’s, and on the census I am unable to find any straight-line male relatives. Can you help me with this?

    • Andrew Bruce McDonald

      You can trace your male line if you have a living brother, father, paternal uncle or son of a paternal uncle. They will all share the same Y chromosome as your dad!

  • Jerry Sexton

    You think this is a mess for sorting out cousins?

    I am the product of successive grandfathers who lived into their 90’s, and had kids into their late 50’s and early 60’s, and then compounded it all by marrying multiple times.
    My 8th generation back was born in 1720, and my 10th generation was born in 1650.
    Even in recent times, my great-grandfather was 59, and his wife was 57, when they had my grandfather. And my grandpa’s oldest brother was already in his 40’s when gramps was born.
    So … when 23andMe uses their predictor to tell me someone is a 5th or 6th cousin, I just smile … they might be 10th cousins if it’s on one of the maternal branches and their own pedigree lines reproduced in a normal time frame.

    I’ve already had to sort this issue out with a number of cousins.
    It can be highly entertaining.

  • Walt

    A woman has an X from her mother & an X from her father. How does the mother’s X know to go back on her mother’s line instead of the father’s X line?

  • Bob Jones

    Sometimes there are large generation gaps. I know a 35-year-old woman whose grandfather died in the 1940s and whose great-grandfather died in 1880. I’m 33 years old, but her great-grandpa is my third-great-grandfather.

  • Julie

    This ‘once removed’ thing confuses me to all heck. I always thought that the child of my 1st cousin would be my 2nd cousin and then the child of my 2nd cousin would be my 3rd and so on. I have found a woman whose Grandfather was 1st cousins with my Grandpa. Our Great Grandmas were sisters & so then we share the same Great-Great Grandma. Wouldn’t this make us 5th cousins?! I am so confused. Help!!

    • Nicola

      I think the child of your 1st cousin is your 1st cousin once removed. 2nd cousins have great-grandparents in common and 3rd cousins will have Great-Great grandparents in common. So I think the person you have found will be your 3rd cousin. I may be wrong, I find it all a bit confusing too!

    • Andrew Bruce McDonald

      The child of your 1st cousin and your child are 2nd cousins… Their children will be 3rd cousins etc.. The key is counting the number of steps back to the common ancestor. Siblings share a parent (0th cousins), 1st cousins a grandparent, 2nd cousins great-grandparent, etc..

      To be “removed” the shared ancestor is unbalanced—one of the persons is more recently related. The child of your 1st cousin is your 1st cousin 1x removed. You share a grandparent, although to the child it is a great-grandparent, hence the removed bit. The difference between the relationships is the number of times removed. Once removed because from grandparent to great-grandparent is one. If your 1st cousin has a grandchild (as mine does) he is a 1st cousin 2x removed because my grandparent is his great-great-grandparent, a difference of two!

  • Catherine

    When I am searching for relatives you have the option of paternal or maternal line. Is that my maternal and paternal line or theirs? This can help me figure out if they are related to me on my mother’s side or my father’s. What does it mean when you are related on the X chromosome?

    • ScottH

      That’s your maternal or paternal line. This helps you narrow searches. If you have one or both of your biological parents genotyped with 23andMe, DNA Relatives can automatically determine whether a match is on your mother’s side or your father’s side. This can help narrow your search if you’re looking for a common ancestor with one of your matches. This is telling you if your relative match also shares DNA with your mother or with your father. This most likely means that he or she is on your mother’s side. This is the case for both male and female relatives, since DNA Relatives is based on your autosomal DNA.

  • AdaNja

    Can someone please help me out. If Person X’s great great grand father was a half brother to my great grand father, what is person X? My half 2nd cousin once removed???

    My dad is 1 of 7, Mom is 1 of 10, my dad’s mother was 1 of 5, each having at least 7 kids. My dad’s father was 1 of 20 or more kids. My dad has over 100 1st cousins, more like 150+ or more. So let’s just say I prob have about 300 2nd cousins 🙂

    • David and Jeanne Froberg

      Whew! I had to write this one down.

      Let’s say that the common ancestor between you and X is John Smith. Here is X’s line to John: X –> X’sFather –> X’sGrandfather –> X’sGreatgrandfather –> X’sGreatgreatgrandfather –> John Smith (X’s ggg-grandfather). You need the generation *before* X’s gg-grandfather because you are interested in the sibling relationship, so you need a parent of the half-brothers. Now here is your line to John: AdaNja –> AdaNja’sFather –> AdaNja’sGrandfather –> AdaNja’sGreatgrandfather –> John Smith (your gg-grandfather, the father of X’sGreatgreatgrandfather). Note that the ultimate result (your relationship to X) will be the same if one or more of these generations is female.

      There are a number of relationship calculator charts available on the internet.

      The most recent common ancestor between you and X is John Smith, X’s ggg-grandfather and your gg-grandfather, so he is the one you use to calculate your relationship. Starting in the upper left corner of the chart, go across to (X’s) ggg-grandfather, then down to (your) gg-grandfather. Where the two meet is the relationship, but in your case you must halve this, since X’s gg-grandfather and your g-grandfather were half-brothers.

      X is therefore your half-3rd cousin once removed. As my daughter once observed, “Family is complicated!”

  • 10eisha


    I would love a little help! So I have a mystery 1st cousin that shows up: 8.75% DNA shared across 19 segments. I also have a known 1st cousin that shows up at: 13.4% DNA shared across 38 segments.

    You can see that the mystery 1st cousin shares less DNA than the known 1st cousin. So we are trying to figure out if this mystery 1st cousin perhaps only shares one grandparent? For example perhaps my grandfather had a child with someone who was not my grandmother, and that child had a child. Would they still show up as my 1st cousin, just with the less DNA shared?

    Are there any other reasons a 1st cousin would show up with less DNA shared?


    • 23blog

      Hi Tenesiha,
      That difference in shared DNA is actually not that unusual. You and your siblings share about 50 percent of your DNA, because you get 50 percent of your DNA from each parent, but because of recombination with each successive generation the percentage of DNA you get from more distant ancestors is a little more irregular. So although it might be around 25 percent that you get from each grandparent, it’s actually a range. So for your first cousins, with whom you share grandparents, you are likely seeing percentages within the range you would expect.

      • 10eisha

        So it is more likely that this person is a full 1st cousin than some other variation? Thank you for the help!

        • 23blog

          Yes, you are correct.

  • Andrew Bruce McDonald

    Counting elusive cousins is very difficult. In my family the patterns of children and marriage vary considerably from branch to branch and generation to generation. I don’t think it is feasible over a short period (5 or fewer generations) to make generalizing “average” assumptions, as they only become statistically valid with a large enough sample size. Hence, using some arbitrary average number of children is not likely to be very accurate for estimating a given individual’s number of 2nd or 3rd cousins. It may be a good way to look at the average number of cousins over a population.