
Maximizing autosomal matches

There is a reason why The Legal Genealogist occasionally sounds like a broken record when it comes to autosomal DNA testing.

“Test as many people as you can,” the song goes, before the needle skips and it repeats: “Test as many people as you can.”

Autosomal DNA testing, remember, is the kind of testing you do when you take the Family Finder test at Family Tree DNA, the AncestryDNA test from Ancestry.com or the test from 23andMe.

It’s the test that looks at the DNA in what are called the autosomes, the 22 pairs of chromosomes we all have. That DNA is randomly jumbled and passed down equally from our mothers and fathers, and it helps us locate cousins¹ because it contains segments from many different ancestors whose DNA, by chance, managed to survive that jumbling process (called recombination).²

And because of that random jumbling process, even brothers and sisters do not have exactly the same autosomal DNA. One might get more DNA from, say, their paternal grandfather than the others; another might get more DNA from their maternal grandmother. So the only way to find as many matches as possible — and to find the very best matches — is … yep, you’ve got it … “Test as many people as you can.”

Let me show you what that really means, using my own family as an example. Because very little shows why this “test as many people as you can” mantra holds true better than a review of the autosomal results of four members of my family who’ve taken the autosomal DNA test.

[Figure: matrix chart of the DNA, in centiMorgans, shared by each pair of the four siblings]

All four are full-blood siblings, brothers and sisters of my mother: in age order, my uncle David, aunt Carol, uncle Mike and aunt Patricia, called Trisha. And the matrix chart you see here shows you the amount of DNA, measured in units called centiMorgans or cM,³ that each shares with the others.

Now the first thing you notice about the chart is the variation between the sibling pairs.

• Mike and Trisha — the two youngest of my grandparents’ 12 children — share the most DNA in common, with David and Carol, the two oldest who tested, coming in a close second.

• David and Trisha, from opposite ends of the age spectrum, share the least in common.

• David and Mike, the brothers, share more DNA in common than Carol and Trisha, the sisters.

And remember: it just happened that way, by random chance. Testing four siblings in your family won’t necessarily produce the same pattern of results.

So what does this tell us? If we’d just tested David and Trisha, we’d have gotten fewer matches in common overall than if we’d tested Mike and Trisha. And there’s no way we could have known that before we tested all four.

In other words — repeat after me — “Test as many people as you can.”

And that lesson is really drilled home when you look at the match lists for these four siblings. In total, as of this past week, they have 1,681 matches in the database at Family Tree DNA.

And not even 10% match all four siblings.

Only 159 of the people who match any of my aunts and uncles match all of them. These obviously are the primary cousin targets for research: with autosomal DNA, what we’re looking for more than anything else is patterns and these match-everybody matches may help point the way. And even when you subtract the known relatives who’ve already tested — two nieces (one the daughter of one of the four siblings), one great nephew, one first cousin, three first cousins once removed, two second cousins, two second cousins once removed, and a third cousin, among them — we’ve got some serious candidates to look at.

Not all of them are high priority candidates, of course. The smaller the amount of overall DNA that a match has in common with the siblings, and the smaller the size of the largest shared segment, the farther back in time the common ancestor is likely to be, which reduces the odds of ever being able to identify that common ancestor. And, though it’s less likely when an entire family shares a segment with a match, it’s still possible that a small shared segment isn’t what we want it to be: the result of inheritance from a shared ancestor, a phenomenon called identical by descent or IBD.⁴ Smaller segments could instead come to us by random chance, which is called identical by state or IBS.⁵

But even the high priority, large segment, match-all-four-siblings matches are not the only ones we need to look at. Another 366 people match three of the siblings, but not the fourth. And there are some really interesting patterns in that group, where, for example, one sibling out of the three will have two or three times as much shared DNA with the match as the others do. Having three siblings match the person tells us this is someone we should look at seriously, but it’s somebody we might have overlooked (or put low on the priority list) if we hadn’t tested the one sibling who shares a lot of DNA with the match.

Another 589 people match two of the siblings, but not the other two, and we see even more interesting patterns. Sometimes both siblings have very small matches with the match, and we might put that match lower on the priority list. But sometimes both have fairly large segments in common with the match, so even though the other two siblings don’t have enough DNA in common even to show up as a match, this may be someone we want to move higher on the priority list.
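For anyone who keeps match data in a spreadsheet or script, here is a minimal sketch of that kind of prioritizing. It ranks each match by the largest amount of DNA that any one sibling shares with it. The match names and cM figures are made up for illustration, and the `shared_cm` layout and `rank_matches` helper are my own assumptions, not anything a testing company provides.

```python
# Hypothetical shared-cM totals for each match, keyed by the sibling who matches.
# A sibling who does not appear simply did not show up as a match.
shared_cm = {
    "match_A": {"David": 22.0, "Carol": 61.0, "Mike": 25.0},
    "match_B": {"Carol": 9.0, "Trisha": 8.5},
    "match_C": {"David": 48.0, "Mike": 52.0},
}


def rank_matches(shared):
    """Return (match, siblings matched, best sibling, best cM), largest cM first."""
    rows = []
    for match, per_sibling in shared.items():
        # The sibling who shares the most DNA with this match.
        best_sibling = max(per_sibling, key=per_sibling.get)
        rows.append((match, len(per_sibling), best_sibling, per_sibling[best_sibling]))
    # Sort on the largest amount any one sibling shares with the match.
    return sorted(rows, key=lambda row: row[3], reverse=True)


ranked = rank_matches(shared_cm)
for row in ranked:
    print(row)
```

In this made-up data, match_A rises to the top only because Carol tested; going by David or Mike alone, it would sit well down the list.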

And then there are the 567 people — fully one-third of all the matches — who match one sibling, and one sibling only:

• David has 130 matches that he doesn’t share with his brother or his sisters.

• Carol has 154 matches that she doesn’t share with either of her brothers or her sister.

• Mike has 120 matches that he doesn’t share with his brother or his sisters.

• Trisha has 163 matches that she doesn’t share with her brothers or her sister.
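The sorting into match-all-four, match-three, match-two and match-one groups can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the match IDs below are invented, and it assumes each sibling's match list has already been exported into a set of IDs.

```python
from collections import Counter

# Hypothetical match lists: one set of match IDs per tested sibling.
match_lists = {
    "David": {"m1", "m2", "m3", "m5"},
    "Carol": {"m1", "m2", "m4"},
    "Mike": {"m1", "m2", "m3"},
    "Trisha": {"m1", "m6"},
}


def siblings_per_match(lists):
    """Count how many siblings each match appears under."""
    counts = Counter()
    for matches in lists.values():
        counts.update(matches)
    return counts


def bucket_sizes(lists):
    """Count how many matches are shared by exactly 1, 2, 3, ... siblings."""
    return Counter(siblings_per_match(lists).values())


def unique_to(lists, sibling):
    """Return the matches that only this one sibling has."""
    others = set().union(*(m for name, m in lists.items() if name != sibling))
    return lists[sibling] - others


buckets = bucket_sizes(match_lists)
print(dict(buckets))
print({name: sorted(unique_to(match_lists, name)) for name in match_lists})
```

Every added sibling can only add to the combined list, which is the whole point of the mantra.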

Now many of those matches are very low priority: the amount of overall DNA shared by the one sibling with the match is small, and the largest segment the two share is small enough to raise a serious risk that the match is IBS and not IBD.

But that’s not true of all of these matches:

• David has four matches projected to be as close as third cousin and 24 projected to be as close as fourth cousin.

• Carol has seven matches projected to be as close as third cousin and 15 projected to be as close as fourth cousin.

• Mike has 16 matches projected to be as close as third cousin and 24 projected to be as close as fourth cousin.

• Trisha has five matches projected to be as close as third cousin and 18 projected to be as close as fourth cousin.

That’s 32 possible third cousins and 81 possible fourth cousins who — by random chance — match only one of the four siblings.
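If you want to check the arithmetic, here is a short tally using only the numbers given above. The variable names are mine; the figures are the ones from the four siblings' match lists.

```python
# Figures from the match lists described above.
total_matches = 1681
by_siblings_matched = {4: 159, 3: 366, 2: 589, 1: 567}
unique_matches = {"David": 130, "Carol": 154, "Mike": 120, "Trisha": 163}
third_cousins = {"David": 4, "Carol": 7, "Mike": 16, "Trisha": 5}
fourth_cousins = {"David": 24, "Carol": 15, "Mike": 24, "Trisha": 18}

# The four groups account for every match, and the one-sibling group
# is the sum of each sibling's unique matches.
assert sum(by_siblings_matched.values()) == total_matches
assert sum(unique_matches.values()) == by_siblings_matched[1]

# Share of all matches in each group, as a percentage.
shares = {
    n: round(100 * count / total_matches, 1)
    for n, count in by_siblings_matched.items()
}
third_total = sum(third_cousins.values())
fourth_total = sum(fourth_cousins.values())

print(shares)                     # {4: 9.5, 3: 21.8, 2: 35.0, 1: 33.7}
print(third_total, fourth_total)  # 32 81
```

So the match-all-four group is about 9.5% of the total, and the match-only-one group is about 33.7%, or roughly one-third.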

People who could have the answer to one (or more!) of our brick walls.

People who might have that photo of the great grandparents we’ve never seen.

People who might know where the Family Bible is.

People we could have missed by testing fewer than all four of the siblings.

You see where I’m going with this, right?

Test as many people as you can.


SOURCES

Note: There are some corrections above, thanks to Jim Bartlett, who pointed out a mistake in the IBS/IBD alphabet soup references.

  1. ISOGG Wiki (http://www.isogg.org/wiki), “Autosomal DNA,” rev. 1 Feb 2014.
  2. ISOGG Wiki (http://www.isogg.org/wiki), “Recombination,” rev. 1 Feb 2014.
  3. ISOGG Wiki (http://www.isogg.org/wiki), “CentiMorgan,” rev. 1 Feb 2014.
  4. ISOGG Wiki (http://www.isogg.org/wiki), “Identical By Descent segment,” rev. 1 Feb 2014.
  5. ISOGG Wiki (http://www.isogg.org/wiki), “Identical by state,” rev. 12 Nov 2013.