Those dratted ethnicity percentages
It’s a question that just won’t go away.
No matter how many times The Legal Genealogist warns (and most others interested in genetic genealogy agree) that the ethnicity estimates provided by DNA testing companies are not all that accurate,¹ people still want to know:
• What’s the best testing company to find out where in Africa my people came from?
• What’s the best testing company if I think I have Native American or American Indian ancestry and I want to know what tribe?
• What’s the best testing company if I’m English, or German, or Russian, or Italian, or…?
I really do understand the desire to know. I am sitting at an airport waiting to board a flight that will take me to meet a great nephew² for the first time, and like many of the youngest members of my family he is adding an amazing element of diversity to our family’s ethnic mix. This little boy is contributing Korean genes to our mix, joining a cousin who has brought in African genes and others whose origins are spanning more and more of the globe.
Someday, I surely hope, young Jack will want to know more about his ancestry, and I hope will want to look at the clues hidden in his genes. And, I expect, I will hear the question from him that I hear from other researchers today: “what’s the best testing company for…”
Maybe by the time Jack is old enough to ask the question, and understand the answers, there will be an answer to that question that will satisfy him, and me, and you, and everyone else who wants to know about the evidence of our ancestry that may be found in our genetic makeup.
But the simple fact is, that day isn’t here yet.
We have to keep in mind what these admixture tests do: they take the DNA of living people — us, the test takers — and they compare it to the DNA of other living people — people whose parents and grandparents and, sometimes, even great grandparents all come from one geographic area. Then they try to extrapolate backwards into time. Nobody is out there running around, digging up 500- or 1,000-year-old bones, extracting DNA for us to compare our own DNA to.
So coming up with these percentages in these tests requires this fundamental assumption: that the DNA of the reference populations — those groups whose parents, grandparents, great grandparents and more all come from the same area — is likely to reflect what we might see if we could test the DNA of people who lived in that area hundreds and thousands of years ago.
In other words, these percentages are:
• estimates,
• estimates based on comparisons not to actual historical populations but rather to small groups of people living today, and
• estimates based purely on the statistical odds that those small groups tell us something meaningful about past populations.
In fact, because of these limitations, the very best we can get right now shouldn’t even be called an estimate of the percentage of our genes that come from a specific country. At best it might be called a guesstimate.
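To see why the choice of reference population matters so much, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the marker frequencies, the panel names and the simple least-squares fit are hypothetical, and this is not how any testing company actually computes its percentages. It models one test taker as a blend of two reference panels, then swaps in a second panel drawn from different living people in the same region.

```python
def estimate_fraction(person, ref_a, ref_b):
    """Least-squares share of ref_a in a two-way blend of ref_a and ref_b.

    Finds p minimizing the squared gap between `person` and
    p * ref_a + (1 - p) * ref_b, then clips p to the range [0, 1].
    """
    num = sum((x - b) * (a - b) for x, a, b in zip(person, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    return min(1.0, max(0.0, num / den))


# Invented allele frequencies at five markers (hypothetical values only).
panel_north_v1 = [0.80, 0.30, 0.60, 0.10, 0.50]
# Same region, but a different small group of living people was sampled.
panel_north_v2 = [0.70, 0.35, 0.55, 0.20, 0.50]
panel_south = [0.40, 0.60, 0.30, 0.40, 0.50]

# One test taker; this data is identical in both comparisons below.
person = [0.60, 0.45, 0.45, 0.25, 0.50]

share_v1 = estimate_fraction(person, panel_north_v1, panel_south)
share_v2 = estimate_fraction(person, panel_north_v2, panel_south)

print(f"'north' share against panel v1: {share_v1:.0%}")
print(f"'north' share against panel v2: {share_v2:.0%}")
```

With these invented numbers, the same person comes out at about 50 percent “north” against the first panel and about 65 percent against the second. Nothing about the person’s DNA changed; only the comparison group did.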
The reality is that these percentages are dynamite at the continental level: European versus African versus Asian. And they’re pretty good at identifying descent from populations that are fairly isolated and not mixed with other similar populations. But for the most part they’re really a crap shoot when you try to distinguish, say, English from Irish from Welsh.
These limitations are true of all of the testing companies. I’ve tested with them all, and my own results are — literally and figuratively — all over the map. I’m German with some companies, not German at all with another. Largely Scandinavian with one, only slightly Scandinavian with the others.
Just as one example, 23andMe says I am a little more than four percent Scandinavian. Family Tree DNA says I am about 12 percent Scandinavian. And AncestryDNA says I am more than 30 percent Scandinavian. I have no known Scandinavian ancestors. None at all. So which company is right? Or are they all wrong? I don’t know, and unless they start digging up those old bones, I’m not convinced I’ll ever know for sure.
So what’s the answer to the “what’s the best company to test with” question? It has to be the same answer I give about any DNA testing: test with every company you can afford to test with. You will see differences in the ethnicity estimates from each company you test with, and you can make your own decision about which set of estimates you think most closely approaches the truth.
If one of the three major testing companies detects 0.01 percent Native American and you want to believe it, you will choose that company’s results. If another detects 2.2 percent sub-Saharan African and you want to believe that, you will choose that company’s results.
In the final analysis, because the science is simply not there yet to support percentages at the country or tribal level, we’re all left choosing to accept what we want to believe.
And that’s the best we can do with tests that just aren’t all that good at detecting ethnicity on a country-by-country basis. And I can only repeat what I’ve said before: DNA testing for genealogical purposes is a wonderful tool. But people get disappointed when they see these percentages and they don’t match up to their own paper trail and don’t match up from company to company. And when they get disappointed, they may lose interest in genealogy or in DNA testing. And when they lose interest, we lose out on the paper trail information they might add to our mix.
Bottom line: We need to educate our friends and our families, our DNA cousins, to the limits of what these percentages can show — and to show them all the other things DNA testing really can help with.
Because friends don’t let friends do DNA testing only to get these percentages.
SOURCES
1. See e.g. Judy G. Russell, “Admixture: not soup yet,” The Legal Genealogist, posted 18 May 2014 (https://www.legalgenealogist.com/blog : accessed 22 Feb 2015).
2. Grand nephew if you prefer.
Actually, some people are digging up old bones and doing DNA analysis; they just tend to be very old bones. And if you want to get an idea of how difficult that process can be, the book Neanderthal Man: In Search of Lost Genomes by Svante Pääbo will convince you that most of the sequences obtained of ancient DNA thus far are probably so contaminated as to be useless.
I wasn’t thinking quite that far back in time, Jim!
I feel like this whole thing sort of flies in the face of actual human history. People moved. They moved a whole lot. My Viking ancestors were sharing their DNA in Scotland, in Normandy, in North America…all over. It’s unrealistic to expect DNA to say “Your people were from HERE” when your people weren’t only HERE. They were everywhere.
Plus, I think we want to feel like “I’m Norwegian” means something. Like I’m different from the other humans. But increasingly, DNA is telling us that it doesn’t. Many of us are not who we thought we were, but we’re still who we are. That clan of MacSomethings who thought they were Highlanders but turned out to be Vikings? They’re still the same people. They’re still Highlanders. They’re just Highlanders with a different grandpa 1000 years ago.
That’s a big part of it, Kerry. So many of our ancestors, even distant ones, were mixes too. So there may not ever really be a German or English ancestor anyway.
Couldn’t agree with you more, especially about those Vikings, who were apparently trading and settling everywhere from Russia and the Middle East, through Sicily, Spain, Normandy and the British Isles, all the way across the North Atlantic to Newfoundland. I’ve always thought that ancient peoples did a lot more travelling and exploring than their modern-day descendants have given them credit for. Ditto the peoples of the Pacific. Seems to me that human beings have a deep-seated desire to know what’s on the other side of the rainbow and some of them are always going to try to go find out.
http://www.metrolyrics.com/rainbow-connection-lyrics-muppets.html
Like Judy, I also show Scandinavian ancestry despite the fact that as far back as I have been able to track a paper trail thus far both parents are from “Germany,” a term I use loosely. As Kerry and GR point out, there is work being done on ancient bones, and our ancestors didn’t just sit around sipping lattes and watching TV. If we are talking deep ancestry we need to look at migration patterns – I have to wonder if Scandinavian tribes that moved into areas where Germanic tribes were are responsible for that percentage.
In thinking about our ethnic identity we have to remember that not only did people move, borders moved, and we cannot assume we are German when our ancestor may have been born in what was Poland, Prussia, a Duchy, etc. – and that is just recent history! Remember Germany did not become a united Germany until 1871. There is a fun little video that you can find online that shows the changing borders of Europe at lightning speed.
If you are interested in ancient history and migration – very interesting but not much good for finding grandpa – read Dienekes’ Anthropology Blog.
I am not an anthropologist or geneticist but doing testing has stirred my interest in learning about the possibilities of movement of the ancestors whose names I will never know.
Migration is such an important element in all genealogical research.
And they certainly weren’t shy about … um … sharing their genes when they went wandering either!
Hi Judy – great article, as usual. Are you referencing autosomal DNA only, or all, including mtDNA & Y-DNA?
Katherine, you don’t get these percentages with any DNA testing except autosomal. YDNA and mtDNA may point you toward ethnic origins but not as a matter of percentages.
I cringe whenever I see a post anywhere on ethnicity breakdowns. That is not to say that they all start in a bad or misguided place, but they usually end there. I would love to join a genetic genealogy community where those discussions are banned.
I don’t want them banned, just kept in their proper perspective.
This is so perfect I could have written it myself, just not as well. I HAVE posted essentially the same thing in reply to many, many FB posts for several years now; it’s just something people do not want to believe. And, I’d add explicitly what you’ve implied: that nationality and ethnicity are different animals.
Very true… Especially as national borders have been fluid over time, to put it mildly!
Good reminder!
Thanks so much for this concise discussion. Autosomal DNA info just sort of goes right over me, but now I see that it may be more or less irrelevant in the minuscule percentages!
The very small percentages may be error … and the big ones could be off too.
Can any information be gleaned from a paternity test (from 1991)? Other than who’s the father (I already know that!). Just curious.
Not very likely, no. The type of test done then for paternity is very different from the types done today for genealogy.
Great article. I wish this were made clearer when people purchased their tests but…then they probably wouldn’t sell so many kits in the first place.
That’s the rub, isn’t it? The percentages are such a big selling point.
There are so many aspects to the problem of “ethnicity” that it is hard to decide where to begin.
One thing which I want to take up is your statement: “In the final analysis, because the science is simply not there yet to support percentages at the country or tribal level…”
I see variations on this statement throughout websites/forums discussing these DNA tests.
Let me turn the problem around, or perhaps upside down: The problem isn’t “the science” but rather the poorly formed questions people ask.
Customers are bringing to the table so many presumptions that a company, even the most earnest in their goals of scientific validity, cannot provide a good answer.
So I will take you to task on your next statement: “And that’s the best we can do with tests that just aren’t all that good at detecting ethnicity on a country-by-country basis.”
Here the problem is the concept of “country”. Nearly all the countries the average person can name are modern nation states; the borders of many have stabilized only in the last century, and other “countries” were created entirely within the last century.
While I see many “ethnicity” promises by companies today as mostly marketing gimmicks, it is very possible to derive ancestral groups from DNA tests. That the actual results are both more subtle and more complex than perhaps many consumers are prepared for is a challenge for a company to communicate to their customers.
Some of it is a problem of over promising — overselling the tests, yes. But as long as we’re comparing living people to living people, I am not at all confident that we know what the data really says, at all.
In the “No good deed goes unpunished” category, I shared the link to this article on the Northern Neck of Virginia Genealogy group page, and one woman replied that if “we are going to be such snobs” maybe she would just quit the group. I replied that she was welcome to leave anytime she wanted. As always, great article, and this can’t be told over and over and over often enough.
Sigh…
Thank you for this article. I won’t read into this so much then. I have my autosomal results at FTDNA & Ancestry.com. My Origins from both are completely different:
My Origins at FTDNA:
100% European
Western and Central Europe 65%
Scandinavia 27%
Southern Europe 5%
British Isles 3%
My Origins at Ancestry.com:
Europe 100%
Great Britain 71%
Iberian Peninsula 15%
Europe West 7%
Ireland 2%
Italy/Greece < 1%
That sort of disparity sure proves the point, doesn’t it?
You should see the disparity in my mom’s results
ancestry:
33% native american
21% iberian peninsula
12% great britain
9% italy/greece
10% other european
2% east asia
8% west asia
ftdna:
28% southern europe
21% western europe
2% other european
22% new world
9% east asian
7% jewish diaspora
6% central/south asia
5% african
Boy does THAT ever prove the point!!!
It certainly proves that the cluster constructions are off, but don’t you think both say something interesting about regional variation?
For example, in Carla’s case, both tests show 100% European with variations regionally, with some clear indication that the bulk comes from Northern and Central/Western Europe. In Katy’s mother’s case, the results are similar by region as well: both high 20s to low 30s Native American, both about 30% European and both at least 10% Asian. Now I’m not saying that these are accurate. The tests that I’ve run the raw data through on Gedmatch.com vary considerably based on what they use as reference samples, but there’s a regional consistency among those tests, AncestryDNA and FTDNA once you realize that the regional definitions vary by test. To illustrate, my results from AncestryDNA are as follows:
Africa < 1%
  Trace Regions < 1%
Asia < 1%
  Trace Regions < 1%
Europe 96%
  Great Britain 78%
  Europe West 9%
  Trace Regions 9%
    European Jewish 3%
    Europe East 2%
    Ireland 2%
    Scandinavia 1%
    Italy/Greece 1%
West Asia 2%
  Trace Regions 2%
    Middle East 1%
    Caucasus < 1%
Now from FTDNA:
European: 92%
  British Isles: 57%
  Scandinavia: 26%
  Southern Europe: 8%
  Eastern Europe: 1%
Middle East: 7%
  Asia Minor: 4%
  North Africa: 3%
Central/South Asian: 2%
  Central Asian: 2%
Compared to confirmed genealogy: about 1/16 can be traced to Germany directly, 1/8 to Southern Italy and the rest to the British Isles. Allowing for some variation with the lower numbers, and keeping in mind that the regional clusters vary between tests and overlap even within tests, it is generally consistent with my known background if you are just looking at broad regional patterns and particularly if you are not confusing regional patterns with ethnicity or nationality. The tests are also more similar to one another than they first appear because, for example, the definition of Southern Europe is different for FTDNA and Ancestry, and the same goes for the Middle East, etc.
I do wish that they were all more up front with their reference populations and the sample sizes. It would also be nice if we could find some way to sample older deceased populations to determine what kind of variation can be seen when compared to present day populations.
I also agree that the tests just aren’t all that good at detecting ethnicity on a country-by-country basis but there are certain patterns that can be found in people with ancestry from certain regions of the world. Also sometimes there are
1. Native American. It’s a known fact that Latin Americans are a mix of Spanish and Native American, and a lot of us who have documents going back several hundred years have records of some ancestors who are mentioned as indios, mestizos, and so on and others as españoles. Our Native American ancestry shows up at 23andMe, AncestryDNA, and Gedmatch at percentages that are very close to each other and close to the documentation. The only company that shows a large difference is FTDNA. People trying to use 0.01 percent Native American as proof are really reaching.
2. Spaniards at 23andMe show up with large levels of Iberian ancestry, 60%-80%, which is pretty much a signature of having ancestry from Iberia. People who don’t have any ancestry from Iberia don’t get that much Iberian at 23andMe. Ancestry and FTDNA aren’t as good at determining how much Iberian a person with ancestry from Spain or Portugal has.
3. Greeks show up with a significant amount of Balkan ancestry at 23andMe. A person recently thought that they had Greek ancestry due to a test at FTDNA and Gedmatch calculators. Once that person tested at 23andMe and there wasn’t any Balkan ancestry, the people helping that person knew he/she wasn’t part Greek.
In summary, 23andMe, with its Ancestry Composition, does a really good job with regions, better than the rest.
You say, “Maybe by the time Jack is old enough to ask the question, and understand the answers, there will be an answer to that question that will satisfy him…” So can you clarify whether it is the current genetic TEST that is not able to provide our answers, or the current pool of DNA samples that are available for comparison that is lacking? Will the actual test that I do in twenty years be better than today’s? Or will the results that I have today be better interpreted in twenty years?
Thanks!
It could be either: a different analysis or a different test. Our DNA of course isn’t going to change, but what we know about it, how we examine it, oh yeah. Right now for the most part what we’re doing is sampling (looking at specific areas and then extrapolating to a larger area or to the whole). We’re capable of doing sequencing (looking at the whole) but it’s expensive now (about $1000 or more). At a minimum I’d expect we’d move to sequencing, and then to deeper understanding of what the sequenced data tells us.
Just to throw in my two cents’ worth, I personally consider myself 100% American. I was born here. Both my parents were born here. All four of my grandparents were born here. All eight of my great-grandparents were born here. How far back do we have to go with all this, anyway?
When we get to my double-great-grandparents yes, we start seeing some born in other countries. Four were Belgian, two are unknown, eight are American, and two are English. Of the unknowns and Americans we do know that the families originally hail from Belgium and Germany, I just don’t have the details, yet. But does this mean I’m 31.25% Belgian? Does the American component have no place in the tally? The whole idea is ridiculous on its face.
When we’re looking at ethnicity, we’re looking back at least 500-1000 years, to a time when there wasn’t an America to speak of, and so, no, your American component doesn’t have a place in the tally.
The country didn’t exist yet, but this land did, and it had people here. I think to say you have American ancestry, you should be descended from the people who lived here back then.
Judy,
I think this is a really powerful post. I wanted to either distribute a copy (or just the link) to a group in Miami County, Ohio next week. In light of your recent discussion about copyright, etc., I was wondering how you would feel about a reprint? Otherwise, I’ll just promote the link.
Kathy Reed
I often give permission for reprints, Kathy. Feel free to email me and give me the particulars of your group and I’ll let you know.
Judy,
You gave me permission to distribute this post at a genealogy conference in Piqua. Now I would like permission to distribute it during a talk to medical students at Ohio State University.
Thanks.
Yes, you have my permission, and thanks for asking.
You ladies seem to know what you’re talking about, so I have a question. I’m thinking about having the test for ethnicity. I’m 61 years young. My father and I are first and second generation born in America. My paternal grandmother was born in Hawaii before it was a state. Her family all migrated from mainland Portugal. My paternal grandfather’s family migrated from the Azores, also Portuguese. If I do this, will it tell me European? My father told me that we were most likely related to anybody I might meet who originated from the Azores. Plus there are bloodlines from the Moors (and others) who invaded those areas. Will it tell me Spain, Portugal, etc., or will those places that are so clustered together come under European? My maternal line is supposedly Swedish, English, Irish and maybe Scottish or Scots-Irish. My maternal great aunt, back in the ’50s, had our family tree researched and supposedly we’re descended from William the Conqueror. I’ve been watching “The Vikings” and the series “The Real Vikings”. The other night they said that William was a descendant of the Viking warlord Rollo, who’s portrayed in “The Vikings”. I thought that was so cool! Will it all say Europe or show a mix of Nordic and Latin bloodlines? Once I find what my blood is, I’m going to try to find out (with today’s tech, compared to my aunt’s time) if they are my ancestors. Thank you for any input. Bonney Joseph
There’s no way to know in advance how closely you do or don’t resemble the reference populations used by the testing companies, and it’s your genetic similarity to those populations that lets them give you the estimates of your ethnicity. It simply is one of those “you pays your money, you takes your chances” type of things. Right now, the test most likely to give the best ethnicity reports is 23andMe, and the $99 Ancestry-only option will do.
I agree with the article, but the issue of continent of origin isn’t as cut and dried as it’s made out to be. My results from Ancestry are approximately 90% European and 10% Caucasus. FTDNA’s are 70% European and 30% Middle East. That’s a pretty big difference. My mother is 100% Italian with very strong Norman strains. My guess is that the Italian is influencing the West Asian / Caucasus readings. Our family ancestry that we can trace on both sides is all European. Isn’t it also true that most ancient Europeans migrated from the Middle East?
Family Tree DNA has the oldest reporting systems for ethnicity and is known to overreport Middle East, so I wouldn’t put much stock in that right now. Beyond that, one of the problems is determining the time frame when the ancestors who contributed to your ethnicity may have lived. The ethnicity estimates are much more likely to be looking at markers you received from ancestors who lived 500-1000 years ago than from any ancient migration.
Excellent article. I think I will wait a few years to do genetic genealogy testing.
The value of genetic genealogy testing isn’t these percentages: it’s finding matches to cousins whose research may dovetail with yours and advance your work. For that, delaying makes no sense at all.
Do you think the variations in the DNA tests lie in the actual test, clients not collecting proper samples, the labs or even an old test whose shelf life has expired? It would be interesting if someone could investigate each of the labs they use. I’m wondering if they’re paid by the number of tests they process each day, who they employ to run these tests, and how they determine the details. I wonder, if an investigative reporter did a DNA test through all providers, waited a week, and retook the tests with the same companies, this time using a different name, whether the tests would all come back exactly matched to the first test. If not, then it could be anything from a bad test or a lazy employee to the way one employee does something different in the lab when processing the samples.
I think it’s simply a matter that the analysis compares living people to living people and then tries to extrapolate what that means back 500-1000 years.
This article was first published in May of 2014. That would mean that the information it contains is over 3 years old. In today’s world of science, that IS damn near ancient.
Actually no. There hasn’t been all that much progress in identifying modern ethnicity based on admixture results in the last several years and good reasons why it may never progress all that much.