GEDmatch opt out is an illusion
Dealing with privacy issues when it comes to DNA databases created for genealogical purposes is like trying to nail jello to a tree.
Sometimes, just when The Legal Genealogist thinks one thing has been nailed down, something we hadn’t thought of pops up to pose more issues.
Sometimes, what’s being said doesn’t match what’s being done.
And sometimes what’s being done turns out to be much different than what anyone might have thought.
Which means that sometimes it’s just a matter of smoke and mirrors.
In this case, the issue is GEDmatch and its newly-announced system for allowing its users to choose between having their matching data accessible to police for criminal and unidentified remains cases and not allowing that result.1
Earlier this week it looked very much as though GEDmatch had switched to a system where an individual user could control the use of his or her uploaded kit for matching purposes when it came to the police. Public kits were divided into two categories: Public + opt in (“available for comparison to any Raw Data in the GEDmatch database”) and Public + opt out (“available for comparison to any Raw Data in the GEDmatch database, except DNA kits identified as being uploaded for Law Enforcement purposes”).2
And GEDmatch explained that: “Comparison results, including your kit number, name (or alias), and email will be displayed for ‘Public’ kits that share DNA with the kit being used to make the comparison, except that kits identified as being uploaded for Law Enforcement purposes will only be matched with kits that have ‘opted-in’.”3
Sounds good, doesn’t it? If you want to allow your DNA to be used for investigations, you opt in. If you don’t, you opt out. Simple, and on the surface looking like it allows all of us real choice — to exercise our own judgment and make our own choices.
Here’s the problem.
It’s smoke and mirrors.
The opt out doesn’t provide any meaningful privacy for uploaded data at all.
That’s because it’s ridiculously easy for the investigator handling a police kit to get access to any matching data of any opted-out kit that the investigator might want. That “kit number, name (or alias), and email” plus DNA segment data and more is going to be fully exposed in about 30 seconds, maybe twice that if the investigator is slow.
Here’s how it’s done, and why the opt out status is really just an illusion.
Let’s say person A has uploaded to GEDmatch and is kit ABC123. Person A chooses public + opt in, meaning he chooses to allow police access to his matching information. Person B, who is person A’s third cousin, has also uploaded and is kit DEF456. Person B has chosen public + opt out, meaning he’s chosen not to allow police access to his information.
In comes Investigator Jones with crime-scene kit GHI789 and runs a one-to-many search on that kit. Assume that the bad guy in this case is a third cousin to person A and a first cousin to person B. The one-to-many search on kit GHI789 will show person A’s kit but not person B’s.
And you know what Investigator Jones is going to do, right?
He’s going to take the kit numbers of as many matches to his bad guy’s kit as he can and he’s going to run those kit numbers through that same one-to-many search.
And because Investigator Jones is now running a search on kit ABC123, rather than kit GHI789, all the opt outs vanish.
The minute the kit number being used is not explicitly identified as a police kit, anyone who chose public + opt out and who matches the kit the investigator is now using for his search is as exposed to police use of his data as if he had opted in. And with person B’s info and kit number exposed, it’s child’s play to continue spider-webbing out to matches of matches, then matches of matches of matches.
Person B’s choice — and the choice of all users who opt out — is no choice at all.
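The two-step search described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not GEDmatch's actual code: the `Kit` class, the status labels, the `one_to_many` function and the hard-coded match pairs are all hypothetical, and the police kit is assumed to be a research kit so that it stays out of other users' match lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kit:
    kit_id: str
    status: str          # hypothetical labels: "opt_in", "opt_out", "research"
    is_le: bool = False  # True if flagged as uploaded for law enforcement


# Toy data matching the example in the post (all values are illustrative).
KITS = {
    "ABC123": Kit("ABC123", "opt_in"),
    "DEF456": Kit("DEF456", "opt_out"),
    "GHI789": Kit("GHI789", "research", is_le=True),  # assumed research status
}

# Pairs of kits that share DNA in this toy example.
SHARES_DNA = {
    frozenset({"GHI789", "ABC123"}),
    frozenset({"GHI789", "DEF456"}),
    frozenset({"ABC123", "DEF456"}),
}


def one_to_many(query_id):
    """Return the kit numbers shown in the match list for query_id."""
    query = KITS[query_id]
    results = []
    for kit in KITS.values():
        if kit.kit_id == query_id:
            continue
        if frozenset({query_id, kit.kit_id}) not in SHARES_DNA:
            continue
        if kit.status == "research":
            continue  # research kits never appear in anyone's match list
        if query.is_le and kit.status != "opt_in":
            continue  # the opt out is honored only when the query kit is flagged LE
        results.append(kit.kit_id)
    return sorted(results)


# Step 1: search with the flagged police kit.
first_level = one_to_many("GHI789")

# Step 2: rerun the same search using each match's kit number instead.
second_level = {kit_id: one_to_many(kit_id) for kit_id in first_level}

print(first_level)
print(second_level)
```

In this model the first search returns only ABC123, while the second search, run on ABC123, lists DEF456, because the opt-out filter is tied to the query kit and not to the person running the search.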
I wasn’t 100% sure that GEDmatch didn’t have a plan to prevent that ridiculously simple workaround that utterly destroys the opt-out status of person B’s kit. Perhaps law enforcement users wouldn’t be given access to kit numbers. Or perhaps they’d only have access to the opted-in subset of users. There had to be something, surely, since the GEDmatch Terms of Service and Privacy Policy don’t even suggest that law enforcement shouldn’t use any kit but its own to do searches.
So I asked.
And the official word: GEDmatch has confirmed that there is no legal or technical restriction on law enforcement representatives in using any non-case related kit number to perform other searches that may disclose opted-out users. And — not that it’d be effective anyway — nothing, nothing at all, in GEDmatch’s own rules. And there is no intention of changing that: those who run the GEDmatch site are strongly committed to helping law enforcement use the data there.
No doubt about it.
The public + opt out status is smoke and mirrors.
So… here’s the bottom line.
If you are comfortable with police use of your data for law enforcement investigations, you can choose public + opt in.
If you aren’t comfortable with it or manage a kit for a relative who isn’t comfortable with it or hasn’t consented or can’t consent, then you must choose research kit status to be able to get any use out of GEDmatch for matching purposes.
And if that still leaves you concerned, deleting your data from GEDmatch altogether is another option.
But choosing public + opt out at GEDmatch is … well … not an option.
It’s nothing but smoke and mirrors.
Cite/link to this post: Judy G. Russell, “The choice that really isn’t,” The Legal Genealogist (https://www.legalgenealogist.com/blog : posted 22 May 2019).
SOURCES
1. See Judy G. Russell, “GEDmatch reverses course,” The Legal Genealogist, posted 19 May 2019 (https://www.legalgenealogist.com/blog : accessed 22 May 2019).
2. “GEDmatch.Com Terms of Service and Privacy Policy,” updated 18 May 2019, GEDmatch.com (https://genesis.gedmatch.com/ : accessed 22 May 2019).
3. Ibid.
I have thought about this. Just because kit ABC matches kit DEF, doesn’t mean that kit DEF matches law enforcement kit GHI. Law enforcement can use a non crime kit (ABC) and look at other matches (like anyone can currently), but other information would be needed to establish a match between GHI and DEF. Law enforcement can’t see (and no one can) which kits have chosen to OPT-OUT. So law enforcement wouldn’t know whether GHI doesn’t show up because of OPT-OUT or because they don’t match.
An analogy could be made to other public databases, like Facebook. LE may have the name of an associate or relative of a suspect (i.e. opt-in information). Searching Facebook for that person’s friends and associates will provide a large list of people from which familial searching could be done. But as you jump from the associates list to the friends list of one of those associates, there may be some overlap, but a lot of it will not overlap.
I do agree that if someone is concerned about not having any law enforcement access then Research is the way to go. They should bear in mind that no one will contact them because their kit will be invisible to others, so they will be losing part of the advantage of collaboration. Hopefully those people are also not posting on social media or creating trees on Ancestry (even private trees) because some of the search functions will reveal that a tree matches your search criteria even though you can’t see the tree information. I can’t imagine that law enforcement is not already using these data sources.
A quick one-to-one answers the “is this kit relevant to the investigation” question. There simply is NO effective opt-out except for research kit status.
Could you explain how a one-to-one match would provide that information?
I am under the impression that LE kits would not be able to use the One-to-One tool with Opt-out kits. In order to show the match is relevant to the investigation, you would need to triangulate the three kits, but you don’t have segment data between DEF (opt-out) and GHI (LE). Some of the segments may overlap, but you wouldn’t know whether the segment was matched through the same parent.
As far as I can see, anybody can use any tool IF THEY GET THE KIT NUMBER. By spider-webbing it out from the known match (ABC) to ABC’s top matches (including DEF, an opted-out kit now exposed by running one-to-many using ABC as the kit number) and then running a one-to-one between DEF and GHI, you as the investigator have all the info you need.
I’m a little surprised you wrote this entire article without verifying this core piece of information. The fact is that you can’t perform a 1:1 between an LE kit and an opt-out kit regardless of whether you have the kit numbers.
I can only go with what I’m being told since I don’t have an LE kit to use for testing. Even if true, however, LE can run a one to one between ABC and DEF and compare that to the one to one between its kit and ABC. It’s still smoke and mirrors.
Thanks. I just tried to do a one-to-one between a kit and a research kit on another account and the comparison worked. This also partially defeats the purpose of a research kit.
The research kit number is never exposed, however, except by the owner of the research kit. That’s an essential difference.
I hope the clue I need for a DNA breakthrough on my brick wall has not gotten scared and deleted their kit. I’m the 3rd generation working on this couple. My clock is ticking.
I’m so sorry that the co-opting of genealogical databases by law enforcement is impacting your research. None of us wanted that result.
Judy, I created this table to help people understand what can be matched on GEDmatch. The Yellow boxes are the ones that are problematic for informed consent based on my understanding. I welcome any and all suggestions on how to improve this so that it can be clearly understood (link goes to a PDF file on my Google Drive): https://drive.google.com/open?id=1jnq2jjCdAlMwqucaFuN9UiWgMsujrP9g
Judy – we absolutely can NOT run 1:1 reports for our LE kits vs opt-out kits. There is simply no way that your person B kit above can be compared with the LE kit if B has opted out.
You would know better than I if a one-to-one can be run. However, LE vs ABC and ABC vs DEF, plus one to many, and any competent researcher has the answer. Getting DEF’s kit number via the one-to-many for ABC is all that’s needed. The one-to-one would make it easiest, but it’s still easy. Opt out is meaningless.
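The indirect comparison described here (LE vs ABC, then ABC vs DEF) amounts to looking for overlap between two lists of shared segments. What follows is a rough sketch only: the segment positions are made up and the `overlapping_segments` helper is hypothetical. As noted earlier in this thread, an overlap by itself does not show that the match runs through the same parent, so the result is a lead to check, not proof.

```python
def overlapping_segments(segs_a, segs_b):
    """Return the regions common to two lists of (chromosome, start, end) segments."""
    overlaps = []
    for chrom_a, start_a, end_a in segs_a:
        for chrom_b, start_b, end_b in segs_b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # keep only non-empty overlaps
                overlaps.append((chrom_a, start, end))
    return overlaps


# Illustrative numbers only, not real data.
le_vs_abc = [(3, 10_000_000, 25_000_000), (11, 40_000_000, 52_000_000)]
abc_vs_def = [(3, 18_000_000, 30_000_000), (7, 5_000_000, 9_000_000)]

shared = overlapping_segments(le_vs_abc, abc_vs_def)
print(shared)
```

With these sample values the only common region is on chromosome 3, from position 18,000,000 to 25,000,000.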
Judy, it’s very easy as you explained. It’s actually 3 steps only:
1) LE vs ABC
2) take the top matches of ABC (from his 1-to-many list)
3) Enter the top match of ABC (which is the opt-out DEF) vs LE
It’s a joke and unfortunately a sad one. Thank you so much for your articles, I really enjoy reading your blog!
Andreas West: your method would not work. LE cannot do a 1:1 comparison with a kit that has opted out.
Is there any way for us to know if this kit or that kit showing up in our match list is a law enforcement kit? Is the LE kit coded some way so we will know?
I have opted in. There have been multiple murder and other crime victims in my family. I am all for LE catching the criminals using this data.
I think, if a person hasn’t committed a crime, they shouldn’t have anything to worry about. The life we save might be our own by getting the creeps captured and dealt with.
Maybe I’m missing something in this whole debate. I’m good at missing the obvious sometimes.
I do understand the consent issues.
But, I also want to know when I see a kit if it is a law enforcement kit or not.
I’m not aware of any indication on a kit that shows that it’s an LE kit (and can imagine an investigator would have a conniption fit if there were even an outside chance that an investigation could be compromised by revealing that), but perhaps those who work with these kits can chime in.
I seem to remember that one of the Doe search projects stated that they keep all of their kits as Research so they wouldn’t show up on someone else’s match list.
Thank you, Judy.
Ah, yes, the old “if you haven’t done anything wrong, then you have nothing to worry about” theory. Well, then let the police into your house without a warrant any old time they want, and just step out of their way. You’re not a criminal, right?
The Fourth Amendment was created just to avoid such a circumstance. I wonder what the Founding Fathers would think about DNA testing being used as part of police work today.
Suzanne McClendon: LE kits are marked as “r” in lower case by GEDmatch. That means that the kit owner cannot change it to public.
They are constitutional Fourth Amendment issues. No one is trying to protect criminals. People are concerned about protecting innocent people against unconstitutional searches and seizures. It’s one of the most fundamental concepts of our country.
I think we are starting to realize that what makes Gedmatch so useful (i.e. its completely open matching system) is not so good for anybody who may be concerned that somebody — anybody, not just LE — may be able to find out that you are related to somebody being researched. Who knows what else the system is being used for by persons unknown? LE is the least of our worries, I would think. LE is at least restrained by a TOS agreement (unenforceable though it may be) — and probably laws too — to not misuse the system. Others who don’t care about abiding by terms of service meanwhile can use the system for any ill purpose that the system CAN be used for, if they just fly under the radar and not let anyone know what they’re doing. It’s a bit like putting your Social Security number out there in an effectively public database and expecting that it won’t be used for anything except what you want it to be used for.
And to make matters worse, even if you don’t have a kit on Gedmatch (or anywhere), enough of your relatives do, such that you can still be identified as being related to someone in particular.
I don’t see how we get around this.
I wish *enough* of my relatives would get tested, then I wouldn’t have to be sitting here wondering who my father is.
Consider the chilling effect that publicity about these DNA-solved cases is having on your relatives. It would be one thing to quietly use it, and another completely to shout about it from the rooftops.
An entirely separate database would be a start.
To Andrew Lee: So, LE can see us, but we can’t see them, even if we are opt-in?
GEDmatch is such a wonderful and important resource, given to us absolutely free. John and Curtis provided us a place where we all could come, share, learn and discover. No rules about what we used it for. Why should there be? No promises or expectations of privacy. How could there be?
In those days, we recognized that privacy was our responsibility, not Gedmatch’s.
GEDMatch’s original Terms of Service (ToS) read:
It has always been GEDmatch’s policy to inform users that the database could be used for other uses, as set forth in the Site Policy…. If you are concerned about non-genealogical uses of your DNA, you should not upload your DNA to the database and/or you should remove DNA that has already been uploaded.
GEDmatch site policy:
In today’s world, there are real dangers of identity theft, credit fraud, etc. We try to strike a balance between these conflicting realities and the need to share information with other users. In the end, if you require absolute privacy and security, we must ask that you do not upload your data to GEDmatch. If you already have it here, please delete it.
This DID allow a user to exercise Informed Consent in making his decision. Back then, it was simple. You opted out by not uploading.
After the Golden State Killer (GSK) was identified using the site in April 2018, GEDmatch was persuaded to try to rein in the use of the database by crafting a new ToS that limited Law Enforcement use of the site.
When it was revealed last week that other uses had occurred, GEDMatch was again persuaded to change the ToS to accept data from a wider range of violent crimes, and also to set up an opt-in/opt-out system meant to safeguard user privacy – a system people are now realizing can never be perfect. Subsequent dissatisfaction has caused some genealogists to call for additional changes that could not only exhaust GEDMatch financially, but that could make the system impractically complex and unusable. As we all should understand by now, this would only provide an illusion of privacy, given that ToS in general will always evolve, rely on the honor system and are unenforceable.
GEDmatch has done their best to protect us, when it is unreasonable to expect that they can. They will never be able to give us the security that our data is not exposed to parties we do not want it to be exposed to.
GEDmatch is free. GEDmatch is run by two guys who have done us a great service in providing us with great tools, space, and service. They don’t sell our data as Ancestry and 23andMe do.
In recent months, they have added to that the responsibility of protecting our privacy by responding to what has evolved into expensive and impractical demands even though it is not GEDmatch’s responsibility to do so. If you are a user on GEDMatch, and you are worried about who may have access to your data, then delete it. If you are considering using GEDMatch but you want to be safe, then don’t upload. The choice is yours.
And once you put data in a public place the courts have found through the third party doctrine that you lose the presumption of privacy afforded by the 4th amendment.
Please do not destroy this fabulous resource with ever-increasing demands for privacy that can never be satisfied. Curt and John’s brainchild is a gift to us. Let’s keep it that way. Let’s take responsibility for privacy seriously, but before uploading data to GEDMatch, not after. There is no such thing as privacy online today. The sooner we accept that, the safer we’ll be.
Sorry but I have taken the only true opt-out as you explained it. I have removed my kits. Weighing finding out who a 4x great-grandparent is vs. LE (government) researching my DNA, it is a no-brainer.
It’s part of my internet downsizing. FB was gone when I could click on “why I received this ad” and it told me it was because I had travelled to a certain city.
I realize trying to maintain privacy is a losing proposition but tracking and government review of my DNA isn’t something I am going to make easier by frivolously revealing information.
Andrew Lee said, “I seem to remember that one of the Doe search projects stated that they keep all of their kits as Research so they wouldn’t show up on someone else’s match list.” Yes, I remember that too. I find it very ironic.
Hi Amanda – This is to protect the families. If your child was missing you would not want notification to be via your match list on GEDmatch, or see a brother there and have your hopes renewed he was alive. Adoptee searchers often use Research mode for the same reason. Sensitivity toward the families.
Margaret Press – so it’s OK for those kits to be research, but you all want me and others to make ours open to all, including LE, so you can conduct your missing/DOE/adoptee/LE searches? What about MY privacy?
Sorry, I still find that ironic.
Isn’t the obvious solution to require LE ACCOUNTS be self-identified as belonging to LE and to restrict their access to opt-in kit numbers only?
…and access would include the ability to see opted-out kits in a match list of course.
Exactly the problem: it’s not protecting the opted-out kits at all from that second-level search.
But I would think that if the LE accounts were flagged as such, rather than just the kits, then GEDMatch could eliminate their access to opted-out kit numbers on the input panels and not list them in the reports. It might require a lot of changes in various GEDMatch reports, but that would completely cordon off LE from the rest of GEDMatch users other than those who opt in. (I’m a programmer so I think like a programmer.)
There are all kinds of ways to limit users in one category to a subset of the database records. GEDmatch has made it clear it is not interested in pursuing that option.
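The account-level restriction suggested in this thread could be sketched roughly as follows. This is a minimal illustration of the idea only, not anything GEDmatch has built: the `Account` and `Kit` classes and the `may_use_as_query` and `visible_matches` functions are hypothetical names, and the sketch assumes LE accounts identify themselves honestly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    name: str
    is_le: bool  # assumption: the account itself is flagged, not just the kit


@dataclass(frozen=True)
class Kit:
    kit_id: str
    owner: str
    opted_in: bool


def may_use_as_query(account, kit):
    """LE accounts may search only with their own kits or opted-in kits."""
    if not account.is_le:
        return True
    return kit.owner == account.name or kit.opted_in


def visible_matches(account, matches):
    """LE accounts see only opted-in kits, whichever kit number they search with."""
    if not account.is_le:
        return list(matches)
    return [kit for kit in matches if kit.opted_in]


# Illustrative data only.
jones = Account("jones", is_le=True)
abc = Kit("ABC123", owner="person_a", opted_in=True)
def_ = Kit("DEF456", owner="person_b", opted_in=False)

print(may_use_as_query(jones, abc))
print(may_use_as_query(jones, def_))
print([kit.kit_id for kit in visible_matches(jones, [abc, def_])])
```

Because the filter follows the account and not the query kit, a second-level search on ABC123 from an LE account would still hide DEF456 in this model.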
CeCe Moore just wrote a rather strong push-back on your interpretation, Judy, which I can’t figure out how to link to. It’s on the ISOGG facebook page. It seems to be the same basic point that Andrew Lee is making above.
She’s entitled to her opinion. I’m not saying the police won’t have to actually do a little more work and use some spreadsheets, but being able to do the second level searches (the searches using the kit numbers of the kits that match the LE kit) eviscerates the suggestion that anyone choosing public+opt out will not be detected by that process. And I will note for the record that this exact language in my post was specifically approved by the GEDmatch operators: “GEDmatch has confirmed that there is no legal or technical restriction on law enforcement representatives in using any non-case related kit number to perform other searches that may disclose opted-out users.”
Scott Wilds, you can’t link it because the ISOGG Facebook page is actually a group (not a page), and it’s a closed group – meaning items cannot be shared out of it. Perhaps Cece Moore will share her thoughts either here or somewhere more public.
And I will add, for the benefit of those who may try to do so, you MAY NOT repost materials from a closed group without the express permission of the poster. I won’t approve them and will delete them if they sneak by the moderation process. The whole purpose of having a closed group is so people can say what they want without fear that it’ll be reproduced anywhere else. Anyone who CHOOSES to comment elsewhere, no problem. But honoring their choice of where to speak out is essential. It’s that whole choice thing again, y’know…
The irony that a group of researchers are arguing for the freedom of information “for the good of the public” in a CLOSED group is not lost on me.
I think GedMatch has gone above and beyond the call of duty with their recent changes. To go further would be to lose the benefit of an extraordinary resource. I opted in. It’s more important to me that my DNA may help LE find and put away a predator than that the whole world may learn about any quirks in my DNA. Let’s not kill the goose that laid the golden egg and let’s not put any more pressure on Curtis and John! I think those two deserve a Nobel.
And now that I think about this some more, I think while you’re trying to figure out all the complicated stuff GEDMatch could or should do to assuage your privacy fears, you may want to consider this: that all that complicated stuff can be defeated by the simple expedient of somebody lying when they upload a file, not identifying it as LE. If I were a cop trying to identify a predator, I wouldn’t concern myself with all the barriers nice, well-intentioned privacy advocates put in place. I’d lie to get the data on, opt in, and there’s nothing GEDMATCH could do to stop me. It’s probably already being done.
Thank You. I have now totally deleted the 25 kits that I manage from GedMatch.
The reason I was late to the DNA game in genealogy is precisely because of the issues now being raised. As a physician, confidentiality was foremost in my thinking about genetic genealogy. Those who want no LE access may kill the entire enterprise. As Judy points out, people basically have two choices. They are NOT opt-in or opt-out for LE but opt-in or opt-out for EVERYTHING. The reason I entered the genetic genealogy arena was because, to me, the benefits outweighed the risks. Those who wish to shackle LE, be careful what you wish for. Genetic genealogy may be a dim glimmer of its former self when all the opt-outs are registered.
One of many reasons I’d sure like to see the genealogical databases separate and apart from the LE databases.
Dr Strumpf, yes, “Genetic genealogy may be a dim glimmer of its former self when all the opt-outs are registered”, especially since Gedmatch’s default is opt-out.
The opt out, such as it is, only affects LE accounts. It does not affect one genealogy kit against other genealogy kits. But the fact that it doesn’t provide the protection an opt out usually does is driving people to make kits research kits, and that does affect genealogy kits.
Concerns about genetic genealogy are not limited to LE. I’m a member of the Board of Trustees of the Illinois State Medical Society. I have not taken a position yet, but this resolution, and those at the AMA referenced, may be relevant to this current discussion: https://www.isms.org/Membership/Resolution_Tracking/01-2019-11/
David,
Years ago, GEDmatch tested a group of applications similar to what Promethease offers today. When the FDA put 23andMe on notice, we concluded that any potential benefit to GEDmatch users was not worth a fight with a Federal agency. We removed all applications aimed at analyzing health issues. Some will argue that health traits can still be determined indirectly, though I disagree. I notice that part of the AMA resolution mentions ethnicity. Does this mean that the Ancestry applications that testing companies and GEDmatch offer are a target?
John, I’m looking at these issues as a Board member at the Illinois State Medical Society. ISMS has NOT taken a formal position and is still assessing the evidence and reviewing the AMA and other opinions. I would encourage any interested parties to submit comments (after reading the ISMS resolution at the link above). The ISMS assessment is being done by the Medical Legal Council; ISMS; 20 N. Michigan Ave., Ste. 700; Chicago, IL 60602. Fax: 312-782-2023. Email: info@isms.org. I’d enjoy reviewing comments from my genealogy colleagues: genealogy@stumpf.org.
And does anyone want to take the wager that in the next 20 years in the US, newborn babies will be DNA tested, instead of just having a footprint taken? Times are changing…….
Absolutely (although I suspect it will take longer than 20 years)! And as I’ve said before, when the change is finished, everyone’s “reasonable expectations of privacy” may very well be different. It’s just those of us living through the change who are dealing with the agony of clashing expectations and desires.
California has been storing blood spots of newborns since as early as 1983 (biobank). “California Biobank Stores Every Baby’s DNA; Parents Unaware Of Practice.” https://sanfrancisco.cbslocal.com/2018/05/08/california-biobank-stores-baby-dna-access/
This is worth a watch. Stick to the end. https://anglo-celtic-connections.blogspot.com/2019/05/youtube-y-dna-citizen-science-and-bit.html
Got shut down even asking about this on GGTT so wondered if you could address my question? https://verogen.com/news/ndis-approval-of-miseq-fgx/
Does this mean the FBI won’t need Gedmatch/Genesis?
It will depend on the size of and variety of samples in the database.