23andMe Increases Resolution of Chinese Ancestry Inference

23andMe’s latest Ancestry Composition update includes three new populations in China, increasing resolution for v5 customers with ancestry from the world’s most populous country.

The five Ancestry Composition population regions within China.
This map indicates the five Ancestry Composition populations within China, including three new, primarily Han Chinese populations. In the upper left corner, Xinjiang Uygur Autonomous Region is hatched with both green and red to reflect the fact that individuals from this region can have genetic ancestry resembling either Central Asian populations or Northern Chinese & Tibetan populations.

With more than 30 detailed ancestry regions in China and 19 in Taiwan, 23andMe was already the best DNA test for people with Chinese ancestry, but there was room for improvement. 

“I had some specific regional matches in China and Taiwan, but my ancestry breakdown — my pie chart — just said 100 percent Chinese, which was a little disappointing since China has much more diversity than that,” said Alison Kung, 23andMe’s Director of Product Management, whose family hails from Taiwan.

Now, with 23andMe’s latest Ancestry Composition update introducing three new populations in China, Alison’s ancestry percentages have been broken down into a combination of ancestries. In her updated report, most of Alison’s Chinese ancestry has been inferred to be from Southern China, with smaller percentages from elsewhere in China. 

“I’m excited I can finally understand how my genetic makeup is broken up between the regions of China, since I was unfamiliar with my family history before they were in Taiwan,” explained Alison.

View your updated Ancestry Composition report

Note: Customers with no East Asian ancestry may also see small changes to their Ancestry Composition percentages. Common questions about this update can be found at the end of this article.

What’s new?

New populations

In previous versions of Ancestry Composition, there were three populations representing almost all of China’s diverse ancestry: “Chinese” represented the majority Han Chinese population; “Central Asian” represented the Uyghur population of western China; and “Chinese Dai” represented the larger Tai ethnolinguistic group that currently lives in parts of China, Burma, Laos, Vietnam, and Thailand. Most Chinese Dai live in southern and western Yunnan Province and are genetically more similar to their Vietnamese neighbors than they are to the Han Chinese. 

With this latest update, the “Chinese” reference population was replaced by three more specific populations: Northern Chinese & Tibetan, Southern Chinese & Taiwanese, and South Chinese.

New stories

Our team of product scientists and content writers has curated a selection of vignettes about the genetic histories, cultures, and attractions unique to the three new populations. Find these new stories on the Northern Chinese & Tibetan, Southern Chinese & Taiwanese, and South Chinese population pages.

The scientific details

A gradient of genetic diversity

More than 90 percent of people in China identify as Han Chinese, but nested within that Han identity are many layers of regional variation. For example, separating the northern Han from the southern Han are vaguely defined, but often deeply felt, geographical, cultural, historical, and linguistic differences. To what extent does their DNA reflect those distinctions?

China’s population history, including extensive migration, complicates the goal of identifying distinct genetic groups in China. A recent study, led by a group at BGI-Shenzhen in Guangdong, China, analyzed the population structure of self-described Han Chinese in China. The authors found a gradient of genetic similarity, but they were also able to identify three distinct genetic groups of Han Chinese, color-coded by region in the paper’s supplemental figure 3, shown below.

 

A northern Han Chinese genetic group (in green), a “central” Han Chinese group (in salmon), and a “South” Han Chinese group (in blue). From Siyang Liu, et al. Cell (2018) “Genomic Analyses from Non-invasive Prenatal Testing Reveal Genetic Associations, Patterns of Viral Infections, and Chinese Population History.”

These results are very similar to those identified by 23andMe’s analysis of customers who report recent ancestry from specific regions of China. While most of the individuals in our analysis may identify as Han, we did not limit the analysis to customers who identified this way. 

Discover more with 23andMe

With more than 2,000 geographic regions, mitochondrial and Y-chromosome haplogroups, Ancestry Timeline, advanced DNA comparison, DNA Relatives, and a genetic Family Tree, 23andMe offers more specificity, more interactivity, and more to explore. 23andMe’s Ancestry Composition improves over time as 23andMe adds new populations and regions, offering customers a chance to see more granularity in their results.

23andMe customers can go to their Ancestry Overview to start exploring. 

Not yet a customer? Find out more about 23andMe’s ancestry offerings here.

 

Common Questions

1) How did you identify these populations?

To begin building our reference panel, we selected research-consented 23andMe customers who indicated in their Family Origins survey that all four of their grandparents were born in the same Chinese province.

Next, we identified groups of individuals who share higher levels of DNA with each other than they do with others in the analysis. These genetic groups became our reference groups for this feature. We compare customers’ DNA to the DNA of individuals within these reference groups.
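The first selection step described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not 23andMe's actual pipeline: the `SurveyResponse` record, its field names, and the `reference_candidates` function are hypothetical, and the sketch assumes birthplaces have already been restricted to Chinese provinces. The later step of grouping individuals by shared DNA is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyResponse:
    """Hypothetical record of one Family Origins survey answer."""
    customer_id: str
    research_consented: bool
    # Birth province of each of the four grandparents; None if unknown.
    grandparent_birthplaces: tuple


def reference_candidates(responses):
    """Return {customer_id: province} for respondents who consented to
    research and whose four grandparents share one birth province."""
    selected = {}
    for r in responses:
        places = r.grandparent_birthplaces
        if not r.research_consented:
            continue
        # Require all four birthplaces to be reported.
        if len(places) != 4 or any(p is None for p in places):
            continue
        # Keep the respondent only if every grandparent has the same province.
        if len(set(places)) == 1:
            selected[r.customer_id] = places[0]
    return selected


# Made-up example data, for illustration only.
responses = [
    SurveyResponse("a", True, ("Guangdong",) * 4),
    SurveyResponse("b", True, ("Guangdong", "Guangdong", "Fujian", "Fujian")),
    SurveyResponse("c", False, ("Shandong",) * 4),
    SurveyResponse("d", True, ("Sichuan", "Sichuan", "Sichuan", None)),
]
print(reference_candidates(responses))  # {'a': 'Guangdong'}
```

In this made-up data, only respondent "a" qualifies: "b" has grandparents from two provinces, "c" did not consent to research, and "d" has one unknown birthplace.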

2) Can customers who tested on earlier genotyping chips see this update?

Customers who tested on older versions of the genotyping chip will need to upgrade to the latest genotyping chip to receive updated results. To find your genotyping chip version, go to your 23andMe profile settings and scroll to the bottom of the personal information section, where it says “Genotyping chip versions.” Customers on our latest chip will see “version 5.” Learn more about chip upgrades.

This update is available to customers who tested with 23andMe on the v5 chip, our latest genotyping chip. All v5 customers’ Ancestry Composition results will be recalculated as part of this update, including those of customers without East Asian ancestry, who should not expect to see significant changes to their ancestry percentages.

3) What does “Northern Chinese AND Tibetan” ancestry mean? Does this mean I have ancestry from both populations listed in the population name?

This reference population includes individuals whose grandparents were born in Tibet, as well as individuals whose grandparents were born in one of China’s northern provinces. If you have some “Northern Chinese & Tibetan” ancestry in your Ancestry Composition results, this does not necessarily mean you have ancestry from both of these regions, though it is possible. Instead, it could mean that you have ancestry from just one of these regions. Your results may also list the specific provinces to which we were able to trace your ancestry, which can help shed light on your geographic origins.

4) My results used to say “East Asian & Indigenous American,” but they no longer do. What changed?

With this update, Indigenous American ancestry is no longer grouped with East Asian ancestry. What does this mean? Previously, people with either East Asian or Indigenous American ancestry would see those ancestries grouped together under an umbrella category called “East Asian & Indigenous American.” These ancestries were grouped together in our Ancestry Composition algorithm to accommodate shared ancestry that resulted from the migrations into the Americas from Asia that began around 23,000 years ago.

However, Indigenous American ancestry is almost always distinguishable from East Asian ancestry and, following recent algorithm improvements reducing the amount of nonspecific ancestry in customers’ results, we’ve separated Indigenous American ancestry into its own category. While this change better reflects how these populations think about their ancestry today, customers who previously had some “Broadly East Asian & Indigenous American” ancestry may now see some or all of that ancestry identified as “unassigned” by the Ancestry Composition algorithm. Customers with “Broadly East Asian & Indigenous American” ancestry typically had less than 0.5% of this ancestry.

Read more about how the Ancestry Composition algorithm works.

5) If I have ancestry from an ethnic minority in China, will I also have updated results?

You will still receive updated results, but our algorithms are not granular enough to distinguish these minority populations at this time. It’s likely that the large majority of individuals in the reference populations identify as Han Chinese, but ethnic minorities in China were not excluded from this analysis.

6) I don’t have Chinese ancestry. Will my results change with this update?

It’s possible that customers with no Chinese ancestry will see small changes to their Ancestry Composition results. Results for all customers who tested on version 5 of the genotyping chip will be recalculated with this update, though many customers’ percentages may remain the same.