DNA Databases, Privacy Concerns, and Noble Cause Bias
by Michael Dean Thompson
Networked Privacy and DNA
Dana Boyd, who was one of the first to describe the idea of Networked Privacy, has pointed out that choice is not really individual in the network. That is, the choices we make affect our entire network. That likewise means that harm is not individual either, as it happens to everyone. Networked privacy is how others expose you. Your individual choices can be diminished or negated when others within your network say things about you that you would rather be kept quiet. We believe privacy to be ours, but it is not ours to control.
DNA is the primary network that binds us. Natalie Ram, a University of Maryland Carey School of Law professor and expert on genetic privacy, told the New York Times, “Our genetic associations are involuntary. They’re profoundly involuntary. They’re involuntary in a way that almost nothing else is. And they’re also immutable.” She went on to say, “I can estrange myself from my family and my siblings and deprive them of information about what I’m doing in my life. And yet their DNA is informative on me.”
Genetic genealogists have found that the typical person has around 850 genetically significant relatives. That is 850 people in a network that cannot be pruned, and every member node has something to say about every other member. The nodes (individuals) of each network necessarily intersect with the nodes of other networks, sometimes more than once. For that reason, 2018 research led by the former chief science officer at MyHeritage predicted that a database of just three million DNA profiles could identify nearly one hundred percent of Americans of European descent.
CeCe Moore is a genetic genealogist who is also an actor and director and has numerous IMDb credits for roles in which she played versions of herself. In 2012, she attended a panel that included two women of different Native American nations. Moore told the New York Times, “The women explained they wouldn’t take a [DNA profile] test without consulting everyone else in their tribe, because they’d be making the decision for everybody.” The women understood the impact on their network such a choice would have.
Moore, who makes a living using DNA to uncover filial associations, was unconvinced. “It all happened under the radar, and it doesn’t really matter if you’re opposed: It’s a collective decision that’s already been made. A lot of what the privacy advocates have said I agree with. But 30 million people made that choice for everybody else.” Thirty million people have used the genetic genealogy sites such as Ancestry.com, MyHeritage, and GEDmatch, meaning that a genetic genealogist with access to all 30 million profiles could conceivably identify each and every American. That is why Natalie Ram said that genetic genealogy “is fundamentally a search of all of us every time they do it.”
A 2009 National Research Council report dismissed most of forensic science as unproven folk wisdom. The report, however, singled out DNA as one of the only forms of forensic science that is actually science. It is then understandable that cops would find it attractive, especially since they can survey large swaths of the country in just a few queries on a computer. Right now, the field is wide open with a hodgepodge of local, state, and federal databases, as well as private databases that create an array of vague and fluctuating controls. Given the power and reach of DNA, its predictive abilities, as well as its ability to identify individuals, knowing just how it works, who is using it, and in what manner it is being used becomes of paramount importance.
DNA Science
With a few exceptions, such as people with Down syndrome, every person has 23 pairs of chromosomes, with one chromosome of each pair inherited from each parent. DNA is the strand of nucleotide base pairs that makes up each of the 46 typical chromosomes. If the strands of DNA from each chromosome were stretched out and laid end to end, the combined strand would reach more than six feet while being too thin to be seen even with an optical microscope.
The strand can be envisioned as a spiral staircase where each step is a pair of bases drawn from A (adenine), T (thymine), C (cytosine), and G (guanine), paired as A and T, T and A, C and G, or G and C. Because A and T are always paired, as are C and G, you can break the staircase down the center and refer to only one side to recreate the entire strand. If you encounter the string ATTG, then you know the other side was TAAC. Interestingly, 99.9% of human DNA is identical across all humanity. Of the 3.2 billion base pairs in human DNA, only about 0.1% differs from person to person.
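The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the function name `other_side` is made up here, and it pairs each base position by position, as the staircase picture does, without reversing the reading direction the way laboratory notation often does.

```python
# Base pairing as described in the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def other_side(strand: str) -> str:
    """Return the base facing each position of `strand`, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


print(other_side("ATTG"))  # TAAC
```

Applying the function twice returns the original string, which is why one side of the staircase is enough to recreate the whole strand.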
Of the 23 pairs, 22 are non-sex determining, called autosomal chromosomes. The 23rd pair determines the sex of the person: XX for female and XY for male. Because only males carry a Y chromosome, it can come only from the father and passes down virtually unchanged along the male line except for very rare mutations.
Mitochondrial DNA (“mtDNA”) differs slightly from the other DNA sources in that it is external to the nucleus of the cell (if you visualize the cell as an egg, the nucleus would be the yolk), within the energy producing section of the cell. Much as the Y chromosome passes down from the father, the mtDNA passes down virtually unchanged along the maternal line.
There are a minimum of 13 loci (a locus is a location of a genetic marker, “loci” is the plural form) required for forensic identification, all of which are derived from autosomal DNA. Each locus consists of two parts (one each from mother and father), so 13 loci is equivalent to 26 pieces of DNA. A locus may be identified as D21S11, where the 21 after the D (DNA) refers to the 21st chromosome. The S indicates there is only one copy of the genetic marker in human DNA. The final number indicates the order in which the marker was catalogued on the selected chromosome.
Each marker has a specific Short Tandem Repeat (“STR”), a short pattern such as AATG, the STR for the marker THO1, that repeats back to back. If tests show that THO1 has AATG, AATG, AATG, AATG, AATG, then THO1 is allele-typed as 5, meaning the STR was repeated five times. If it is marked as 9.3, the AATG is repeated nine times in full, plus one partial repeat of three bases in which an A is missing, i.e., ATG.
One more comment on alleles: if both father and mother were to pass THO1 type 5, then the locus is homozygous. If one passes a 5 and the other a 9.3, then the locus is heterozygous. That would be labeled as a 5, 9.3.
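As a rough illustration of how an allele type and a genotype label follow from repeat counts, here is a minimal Python sketch. The function names are hypothetical, and it assumes the input string contains only the repeat region. It reads a label such as 9.3 in the conventional way, as nine full repeats plus three leftover bases.

```python
def allele_label(sequence: str, motif: str = "AATG") -> str:
    """Label an STR allele as 'N' or 'N.k': N full repeats, k leftover bases."""
    full = sequence.count(motif)  # non-overlapping copies of the motif
    leftover = len(sequence) - full * len(motif)
    return f"{full}.{leftover}" if leftover else str(full)


def genotype(allele_a: str, allele_b: str) -> tuple[str, str]:
    """Return the ordered pair label and whether the locus is homo- or heterozygous."""
    labels = sorted([allele_a, allele_b], key=float)
    kind = "homozygous" if labels[0] == labels[1] else "heterozygous"
    return ", ".join(labels), kind


print(allele_label("AATG" * 5))          # 5
print(allele_label("AATG" * 9 + "ATG"))  # 9.3
print(genotype("9.3", "5"))              # ('5, 9.3', 'heterozygous')
```

Real typing is done from fragment lengths measured in the lab rather than from a clean string, so this only shows the bookkeeping behind the labels.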
The loci used for identification were chosen because they do not express genes; they are non-coding. These areas have been called “junk DNA” in the past. These loci were also selected so that they would not reveal distant filial relationships. Yet, these markers are stable (they don’t change over time) and polymorphic (there are many possible allele types). For example, the types accepted by the FBI for THO1 are 5, 6, 7, 8, 9, 9.3, 10, and 11. Each locus with its associated types will have a known list of frequencies within a given forensic database. When the alleles are known, the frequencies of their occurrence are multiplied against each other using the product rule. If THO1 is an 8 and has a frequency of 5% (1 in 20 cases), then 0.05 is multiplied against the frequency of occurrence for each of the other markers.
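The product rule described above amounts to a single multiplication. The sketch below uses invented frequencies purely for illustration (only the 5% figure comes from the text), and it treats each marker's frequency as independent of the others, which is the assumption the product rule itself relies on.

```python
from math import prod

# Hypothetical frequencies for illustration; only the 0.05 figure is from the text.
marker_frequencies = {
    "THO1": 0.05,     # allele type 8 at 5%, as in the example above
    "D21S11": 0.08,   # assumed value
    "locus_3": 0.10,  # assumed value for an unnamed third marker
}


def profile_frequency(frequencies: dict[str, float]) -> float:
    """Multiply the per-marker frequencies together (the product rule)."""
    return prod(frequencies.values())


rarity = profile_frequency(marker_frequencies)
print(f"about 1 in {1 / rarity:,.0f}")  # about 1 in 2,500
```

With 13 markers instead of three, the same multiplication shrinks the frequency very quickly: thirteen markers at 5% each would already give a figure on the order of 1 in 10^17.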
One of the more confusing aspects is within the results of the statistical probability. One of the major methods of computing probability is the Random Match Probability (“RMP”) for a single contributor (the DNA came from a single source) or a major contributor (the DNA came from multiple sources, but this represents the most prominent source). When a confidence level of 99% against a population of 300 million is required, the profile must be rarer than roughly 1 in 30 billion (300 million divided by the 1% chance of a coincidental match that is tolerated). The RMP would then need to be better than 1 in 30 billion in order to make a source attribution. According to the National Institute of Justice (“NIJ”), “We are 99% confident that, in a population of 300 million unrelated individuals, the STR DNA profile observed would occur only once in (i.e., it is unique).” Notice the unrelated individuals caveat; it does not consider twins or odds in areas where large numbers of related persons live (for example, Amish territories).
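The 30 billion figure can be checked with a short calculation. This sketch assumes, as the NIJ wording does, that the population consists of unrelated individuals who each match independently with probability p, so the chance that nobody else matches is (1 - p) raised to the population size. The function name is made up for the example.

```python
import math


def max_rmp(population: int, confidence: float) -> float:
    """Largest match probability p with (1 - p) ** population >= confidence."""
    # Solving (1 - p) ** N = c gives p = 1 - exp(ln(c) / N).
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(math.log(confidence) / population)


p = max_rmp(300_000_000, 0.99)
print(f"RMP must be rarer than about 1 in {1 / p:,.0f}")
```

The result comes out slightly under 30 billion, in line with the rounded shortcut of dividing 300 million by 0.01.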
On occasion, a forensic team will rely on a Y STR profile. This might happen in a sexual assault where the attacker’s autosomal DNA cannot be separated from the female victim’s DNA. As there are some markers that are found only on the Y chromosome, a profile of those markers will identify the male line of the attacker. Likewise, there are situations where a test of mtDNA might also be used. Note that for Y STR and mtDNA, the product rule is not used.
The best possible DNA profile would come from a single source and lack chemical contaminants. However, there may often be mixtures of DNA. The challenge of DNA mixtures, particularly with regard to mixtures containing more than two profiles, is identifying each donor’s alleles, which can accumulate. Nevertheless, if one contributor in the mixture has six times the amount of genetic material as a second contributor, it will likely be possible to select some of the alleles that belong only to the major contributor. According to the NIJ’s DNA for the Defense Bar, “When a mixture contains the DNA of three or more people—especially when all of the contributors are unknown and there is no clear major contributor to the mixture—teasing it apart into individual contributors is extremely challenging and may be impossible.”
Advances in DNA multiplication capabilities have led to the ability to derive what has been called “Touch DNA,” where very small quantities of DNA are extracted from an object that someone has touched. Yet, that also leads to a chance that a DNA sample has been contaminated by latent DNA from another source, such as from crime scene handling. There is also a challenge to the earlier assertion that the loci involved in identification are actually non-coding.
A peer-reviewed 2018 paper in the journal Cell is referenced in the article “Spit and Acquit: Prosecutors as Surveillance Entrepreneurs,” 107 Calif. L. Rev. 405, by Andrea Roth. She says the Cell article pointed out that, because of an overlap between STR and SNP markers, researchers could predict a person’s profile on a genetic genealogy website via SNP markers inferred from CODIS STR markers. The capacity to derive at least some phenotypic markers (such as eye color, disease risk profiles, or some other coded phenotype) from what had been believed to be non-coding DNA could have far-reaching civil rights implications.
FBI, FDDU, CODIS, and NDIS
The Federal DNA Database Unit (“FDDU”) is tasked with aiding DNA hit confirmations against people with profiles in the National DNA Index System (“NDIS”), which is part of the Combined DNA Index System (“CODIS”). In 2009, Douglas R. Hares, who was the custodian of NDIS and Acting Unit Chief of the FBI’s CODIS, delineated the two in Rivera v. Mueller, 596 F. Supp. 2d 1163, 1166 (N.D. Ill. 2009). He explained that the “purpose of NDIS is to generate leads for the law enforcement community.” NDIS contains the DNA profiles that were contributed by accredited laboratories and federal, state, and local criminal justice agencies. Included in the list of laboratories are both public and private labs that meet certain quality and audit criteria.
CODIS, on the other hand, “is a software system developed to facilitate the flow of DNA information from state and local laboratories,” according to Hares. Additionally, CODIS “automatically runs a search … to identify potential matches between biological samples found at crime scenes and offenders who are already in the NDIS system.”
In 1994, the FBI was already in the process of creating a system for the storage and access of DNA records when Congress passed the DNA Identification Act of 1994. The Act provided the funding needed while also setting up specific requirements, such as Quality Assurance Standards. It additionally provided funding for CODIS and grants to the state and local laboratories that would contribute to it.
The NDIS Operational Procedures Manual identifies five primary indexes into which records are stored, two of which are further partitioned into related indexes. The primary indexes are the Forensic Index (also includes the Forensic Mixture Index and Forensic Partial Index), Offender Index (Convicted Offender Index, Arrestee Index, Detainee Index, Legal Index, and the Multi-Allele Index), Missing Person Index, Unidentified Remains Index, and the Relatives of Missing Persons and Pedigree Tree Index.
Notably, only DNA submitted to the Unidentified Human Remains Index can initiate a search across all other indexes. Likewise, DNA submitted against any other index will be checked against it. Only the Relatives of Missing Persons and Pedigree Tree Indexes are restricted to searching just one index: Unidentified Human Remains. A person who submits their DNA to help determine the identity of human remains will not have their DNA checked against the forensic or offender profiles.
CODIS originally only tracked violent and sexual offenders but has since expanded under federal law. The Justice for All Act of 2004 was the first of several expansions so that people can now be added to CODIS simply for being arrested. Harvard researcher Anna Lewis told the Intercept, “The ACLU warned that this was going to be a slippery slope, and that’s what we’ve seen.” Under President Trump, migrants arrested or detained by Immigration and Customs Enforcement (“ICE”) were required to submit their DNA for CODIS, leading to an explosion of profiles and an associated backlog of 650,000 samples. President Biden has not reversed the decision.
The scale of the growth has been stunning. In July of 2015, the FDDU announced its millionth DNA profile. By August 29, 2023, the number of DNA profiles within CODIS ballooned to 21.7 million (roughly equivalent to 7% of the population of the U.S.). In April of 2023, FBI Director Christopher Wray told Congress that the FBI had been collecting around 90,000 profiles a month, “over ten times the historical sample volume.” He explained that his request to nearly double the budget from $56.7 million to $109.8 million for 2024 was to address the expectation they would be required to process on the order of 120,000 samples a month or around 1.5 million over the year.
Vera Eidelman, a staff attorney at the ACLU who specializes in genetic privacy, told The Intercept, “When surveillance technology gets cheaper, easier, and faster to use, it tends to get used more—and often in ways that were troubling.” And that is precisely what is happening. Modern DNA analysis is remarkably simple. With just a swab of the inner cheek (a buccal swab), the sample can be processed within two hours without human involvement. The FBI has leaned heavily into DNA, calling it “one of the most successful investigative tools available to U.S. law enforcement.”
To provide a sense of the scope of this issue, a 2020 study cited by the New York Times said that China announced a plan to acquire the DNA of between 5% and 10% of its male population. Meanwhile, the U.S. intelligence community has warned of China’s potential collection of the DNA of foreigners. And last year, the U.S. had amassed more than 21 million DNA samples in CODIS alone, a significant amount of it from immigrants who will likely be returned to their homeland. For now, it appears to be a sad race the U.S. is winning. “When we’re talking about rapid expansion like this, it’s getting us ever closer to a universal DNA database,” Eidelman warned, adding, “I think the civil liberties implications here are significant.”
The Proliferation of DNA Databases
The expansion of DNA databases within the federal system is impressive (and alarming), but the feds are not alone. States and even cities have created their own systems. Given the breadth of exposure of the federal CODIS, it would seem the only reason these systems exist would be to bypass CODIS requirements. In fact, cops’ hunger for DNA data that cannot be accessed through government systems is now being met by corporate entities as well. One corporation was quoted by Erin E. Murphy in Inside the Cell: The Dark Side of Forensic DNA as advertising:
“‘SmallPond may be for you:’ if you are currently using or considering the use of CODIS for your DNA databasing needs, but are concerned about: the long backlogs at public state and regional DNA labs with CODIS access[,] the profile entry restrictions imposed by CODIS[,] the lack of local criminal elements in the CODIS database[,] the costs and maintenance headaches of the required hardware/network infrastructure[, and] the costs and time of training and certification required[.]”
Bypassing “profile entry restrictions” in the above advertisement really just means submitting DNA from any source, regardless of from whom it’s acquired or how or the type of crime. Likewise, by bypassing the hardware and network requirements, they are minimizing the genetic privacy concerns of the people they are surveilling. The argument that CODIS lacks a criminal element is a misleading concern common among cops. Because CODIS historically focused only on people convicted of violent or sexual felonies (it no longer requires a conviction or necessarily a violent felony), cops worried CODIS only served to identify people already in prison and would therefore miss the people they were trying to convict. The implication, then, is that they want access to the DNA of people who have never been convicted and, as we will soon see, have never even been arrested. Finally, it should be apparent that having untrained cops submitting DNA samples from unproven sources over potentially exposed networks should worry anyone with some awareness of civil rights, yet that is exactly what this company is advertising.
The private databases represent an extreme end. California’s own DNA DataBank, which was developed in spite of the FBI’s extant system, is therefore somewhere in the center between the feds and private systems. Since 2009, every person arrested for a felony has been required to submit to a DNA profile.
Yet, that was not enough for the San Francisco Police Department. In February of 2022, it was discovered that the DNA extracted from a woman who had been raped years earlier was used to support a conviction. She was the suspect in a retail theft, and it was her DNA they wanted. The FBI’s CODIS carefully segregates the two, and victim DNA is not stored. As such, it could not be used. Neither would the state DNA database serve their purposes if she had never been arrested for a felony.
Nevertheless, when the story broke of the rape victim’s DNA being used to convict her, it was roundly condemned by the police chief, state and local politicians, the city’s district attorney, and advocacy groups. For the police chief, DA, and others in law enforcement, the condemnations must have rung hollow as it turns out the practice was considered by some in the DA’s office to be “standard operating procedure in the field.” It should have been clear to any prosecutor that prosecuting a rape victim from the DNA she cannot help but submit in a rape kit would give almost any woman pause in reporting sexual violence against her. Furthermore, the woman’s DNA was used to prosecute a retail theft case, not a violent felony like the sexual assault she suffered. A rape victim’s pause is especially prudent when the same rape kit sample might be used to convict her son, father, or second cousin. A woman in Louisiana who had also been raped provided her DNA. Only, rather than catch her rapist, the cops used her DNA to convict her brother of a string of crimes.
The Orange County District Attorney’s Office has built its own DNA database program. Known as “Spit and Acquit,” the program has offered pleas and dismissals since 2007 for people facing misdemeanors in exchange for a sample of their DNA. As of 2019, 150,000 people have agreed to “Spit and Acquit.” According to Andrea Roth, nearly every misdemeanor in Orange County is now conditioned on providing DNA. The idea of catching the criminal before a serious crime has been committed is not new. California’s cops have been tapping California’s biobank, a storage of DNA samples of every baby born in the state, essentially making every baby born in the state an unwitting investigative lead in crimes that have not yet been committed. It is important to note that “Spit and Acquit” is not stored in California’s DNA DataBank. California limits submissions to its DNA DataBank to people arrested for felonies and either violent sex-related or death-related misdemeanors.
Unlike the local and private databases, California’s DNA DataBank (CAL-DNA DataBank) under Proposition 69 does allow for expungement of records. Some of the arrestees will never be charged with an actual crime, and a few others will be acquitted. Consequently, there should be a path to remove the DNA record as there is with CODIS. But expungement comes with caveats that make success incredibly unlikely. No person with a previous felony may even apply. If a person is arrested but not charged, they can only apply for expungement after the legal time limit for charging the crime has expired. That legal time limit is at least three years for felonies. Some felonies have very long periods and some have no time limit at all. This results in a minimum of three years of exposure on the database before the process can even begin.
Once the request is filed, there is a mandatory 180-day waiting period. The court will then determine if there are any objections to the expungement by either the prosecutor or California Department of Justice. It is up to the court’s discretion whether or not the government can exercise its veto. At that point, decisions are final. Appeals are not available. Those procedural hurdles make it extremely unlikely a large number of people will even apply for expungement while guaranteeing substantially fewer succeed, resulting in a lifetime of surveillance where their DNA is checked for every crime where DNA is available.
The laws concerning data collection into state databases differ dramatically. In Alaska, someone arrested for any crime against a person, irrespective of whether it’s a felony or misdemeanor, will have a DNA profile taken. In South Carolina, peeping, eavesdropping, and stalking are eligible for DNA checks. South Dakota, Texas, and Vermont actually have laws preventing the use of DNA within state databases from being used for the prediction of underlying medical or genetic disorders.
Some cops further end-run restrictions put in place with CODIS and state-run databases by simply creating their own, just as Orange County did. A class action lawsuit alleges that in New York, cops have been secretly collecting DNA from suspects, including from children. The detectives have been offering people drinks during interrogations and pulling the DNA from the drinks, even if the person had declined to consent to provide a DNA sample. It’s alleged that those samples are placed in a “suspect index” for indefinite storage regardless of actual guilt or innocence.
Maryland v. King
The U.S. Supreme Court had an opportunity to slow down the process in Maryland v. King, 569 U.S. 435 (2013), but chose not to do so in a five to four decision. The case began when Alonzo King was arrested on April 10, 2009, and charged with menacing a group of people with a shotgun. As a part of the procedure during his arrest, his cheek was swabbed for a DNA profile. Months later, in August of 2009, the DNA profile was entered into CODIS, and a match was found. A sample taken from a 2003 rape was identified as matching his DNA. King attempted to suppress the DNA evidence, which had been confirmed by a second swabbing, on the basis that the Maryland DNA Collection Act that authorized the collection violates the Fourth Amendment.
Justices Kennedy, Roberts, Thomas, Breyer, and Alito found that the DNA extracted from King was for the purposes of identifying him, which they found to be a legitimate government interest. Furthermore, since the DNA buccal swab is painless, any privacy intrusion is a minimal one. To the majority, a DNA search is little different from fingerprinting.
Justices Scalia, Ginsburg, Sotomayor, and Kagan dissented, with Scalia providing the reasoning. Scalia argued that, rather than identification, the DNA extraction was in fact a search for Fourth Amendment purposes. He pointed out that had it genuinely been a form of identification used to determine, as the majority concluded, exactly who the jail held in its cells, it was too late for that. Moreover, when the DNA was entered into CODIS, no identifying information was included. Instead, what happened was a search for unsolved crimes.
Touch DNA
Since the King decision, cops have expanded the methods with which they acquire DNA samples. As in the New York City example, they have famously acquired DNA from cups, expelled gum, and discarded tissues in the trash. The techniques for extracting meaningful DNA from small amounts of material have reached remarkable levels. It is now possible to pull DNA from just the few cells found on anything a person has touched. Touch DNA is enabled through the chemical process of multiplying the number of available DNA copies to reach readable levels. Of course, the amplification of any target DNA runs the risk of also amplifying any other DNA that has been deposited or transferred onto the item, resulting in a form of contamination.
A prime example of Touch DNA and its potential ramifications is the Amanda Knox case, in which she was convicted for the murder of her British roommate Meredith Kercher in Perugia, Italy. Also accused were Rudy Guede and Raffaele Sollecito. The evidence against Guede was strong. Based on palm prints, fingerprints, and his DNA on the victim, he was found guilty. However, minuscule amounts of Sollecito’s DNA, Touch DNA, were found on the clasp of Kercher’s bra. Likewise, a knife in Sollecito’s kitchen drawer held DNA from Knox on the handle and Touch DNA from Kercher on the blade.
Boise State University (“BSU”) researcher Greg Hampikian took a look at the Italian crime scene analysis and found several problems. The first was that the bra clasp had been picked up, passed around by investigators, then returned to the floor for photographing. It was not collected for DNA testing for another 46 days. At any point in the process, it could have picked up the few traces of Sollecito’s DNA. And while the knife, which Knox used for cooking, had plenty of Knox’s DNA on the handle, the blade had very little of Kercher’s, less than half the amount needed for the FBI to consider it valid.
Hampikian’s students mimicked part of the Italian investigation by collecting five soda cans from BSU’s Dean of Arts and Sciences after lunch. They put each of the cans in individual evidence bags and—without changing gloves—went on to place five brand new knives into their own individual evidence bags. They limited their DNA review to levels below the FBI’s minimum and found one of the knives had DNA of one of the Dean’s staff even though he had never been in the same room with the knives, much less touched them, representing an innocent transfer of DNA that can be very difficult to impossible to explain away in court.
Another example of Touch DNA can be found in People v. Luna, 989 N.E.2d 655 (Ill. App. Ct. 2013), where Luna was convicted on the basis of DNA extracted from partially eaten chicken bones. At multiple points in the roughly five years between the evidence collection and the DNA tests, the bones passed between reviewers. One review happened at the Chicago Field Museum where the partially thawed chicken was handled by an ornithologist and colleague without gloves or masks. The two pulled back meat from some of the bones and removed it from others. All this was done on an unsterilized table in a semipublic area.
The DNA extracted unsurprisingly showed signs of mixing. At the time in 1998, labs were using only nine loci to determine identity. Some of the nine loci on the DNA showed more than two peaks. Recall that the two peaks at a locus each come from one of the individual’s parents. A third peak, then, indicates another contributor. With further amplification, the lab was able to find four more loci to fill out the nine necessary for identification. From the alleles of the “major contributor,” the labs determined that the found DNA would only occur in 1 in 2.8 trillion (2,800,000,000,000) Hispanics. The lab’s founder, Dr. Karl Reich, nevertheless pointed out CODIS returned 903 pairs of matches for those same nine loci within the database, “i.e., 1,806 persons with identical DNA at nine loci.”
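The "more than two peaks" signal can be expressed as a simple check. The sketch below is illustrative only: the locus names and allele values are invented, and real mixture interpretation also weighs peak heights and artifacts that a plain count ignores.

```python
def loci_suggesting_mixture(profile: dict[str, set[str]]) -> list[str]:
    """Return the loci with more than two distinct alleles (peaks)."""
    # One person contributes at most two alleles per locus, one from each parent.
    return sorted(locus for locus, alleles in profile.items() if len(alleles) > 2)


# Hypothetical observations for three loci.
observed = {
    "locus_A": {"5", "9.3"},     # two peaks: consistent with one person
    "locus_B": {"7", "8", "9"},  # three peaks: at least one more contributor
    "locus_C": {"6"},            # one peak: homozygous, or a missing peak
}

print(loci_suggesting_mixture(observed))  # ['locus_B']
```

An empty result does not prove a single source, since two contributors can share alleles at every tested locus.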
The above case presents two major questions with regard to DNA. The important one here (since the FBI now requires 13 loci and is moving toward 20) is dealing with mixed DNA from contamination. In 2013, Michael Coble, a geneticist at the National Institute of Standards and Technology (“NIST”), asked 108 labs to determine if the DNA of a specific robber was found on a ski mask bearing the DNA of several people. Seventy-three of the labs, a full two thirds (2/3), got it wrong, claiming that the DNA was there when it was not.
Professor Hampikian asked 17 analysts “at a reputable lab” to interpret DNA from a case where a man was convicted for possessing two matching alleles (as well as an implication from an actual participant) from a gang rape crime scene. The man, Kerry Robinson, was serving 20 years in a Georgia prison for the crime. Twelve of the 17 analysts excluded Robinson. Four analysts claimed to be unable to draw a conclusion. Only one attributed Robinson to the crime. Hampikian also concurrently submitted the DNA of four employees of a local TV news station. All four had two alleles in common with the mixed DNA. One, a woman, had three alleles in common.
Abandoned DNA
Humans release millions of cells a day, leaving open a tremendous number of chances to capture “abandoned” DNA. Cops across the country have leaned heavily on the abandonment doctrine for DNA collection. The abandonment doctrine is used by police when they search a vehicle found on the side of a road or an unattended suitcase in an airport. In this case, the courts have begun to see abandonment with the idea that Fourth Amendment protections extend only to areas where a person has a reasonable expectation of privacy, though the appropriateness of that standard has been called into question for DNA.
Judge Alex Kozinski, in his dissent for United States v. Kincade, 397 F.3d 813, 873 (9th Cir. 2004), observed, “We can’t go anywhere without leaving a bread-crumb trail of identifying DNA matter. If we have no legitimate expectation of privacy in such bodily material, what possible impediment can there be to having the government collect what we leave behind, extract its DNA signature, and enhance CODIS to include everyone?”
Our depositing of DNA is involuntary. It is impossible to physically contact an object with skin or saliva without leaving DNA behind unless one also sterilizes it. While it may be fairly simple to draw an analogy between the “abandonment” of physical items left in a trash bag on a curb, a chewed piece of gum, and a used cigarette butt tossed onto the street, the collection of DNA left on a soda can during a police interrogation, especially after refusing to give consent, is a very different issue.
In United States v. Thomas, 864 F.2d 843 (D.C. Cir. 1989), Chief Judge Wald states in the opinion, “To determine whether there is abandonment in the fourth amendment sense, the district court must focus on the intent of the person who is alleged to have abandoned the place or object.” In the case of the cup in New York City or even with a chewed piece of gum, there is no intent to discard DNA.
Rather than abandonment in the strict sense, discarded DNA may seem more analogous to fingerprinting. After all, we frequently leave behind fingerprints on items we touch, such as paper, tape, and soda cans. DNA, however, offers not just the person’s identity but also their phenotype (disease profile, eye and skin color, etc.) as well as any consanguinity. If each person has around 850 relevant family members (1.17 in 1,000) and the U.S. incarcerates seven out of every one thousand people, then it is possible that a simple piece of “abandoned” DNA could lead to several suspects who were never considered in the original acquisition. While that may not be possible with the FBI’s CODIS today, as it does not allow either phenotypic or familial searches, it is not beyond the efforts of some police as well as some armchair investigators using state, local, or private databases.
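The arithmetic behind that claim can be sketched in a few lines of Python. This is a rough, back-of-the-envelope illustration, not a published model. It uses only the two figures cited above (850 relatives and seven incarcerations per one thousand people). It also assumes, purely for illustration, that incarceration is spread evenly and independently across a person's relatives, which real families and real policing patterns do not guarantee. The function names are hypothetical.

```python
# Back-of-the-envelope sketch using the two figures cited in the article.
# ASSUMPTION (ours, not the article's): every relative has the same,
# independent chance of being incarcerated.

RELATIVES_PER_PERSON = 850        # genetically significant relatives (cited figure)
INCARCERATION_RATE = 7 / 1000     # seven per one thousand people (cited figure)


def expected_incarcerated_relatives(relatives: int, rate: float) -> float:
    """Average number of relatives who are incarcerated under the assumption."""
    return relatives * rate


def chance_of_at_least_one(relatives: int, rate: float) -> float:
    """Probability that at least one relative is incarcerated."""
    return 1 - (1 - rate) ** relatives


expected = expected_incarcerated_relatives(RELATIVES_PER_PERSON, INCARCERATION_RATE)
at_least_one = chance_of_at_least_one(RELATIVES_PER_PERSON, INCARCERATION_RATE)

print(f"Expected incarcerated relatives: {expected:.2f}")
print(f"Chance of at least one: {at_least_one:.4f}")
```

Under those assumptions, the average person's network contains roughly six incarcerated relatives, which is consistent with the "several suspects" described above. Being incarcerated is not the same as being in a searchable DNA database, so the sketch shows only the scale of the exposure.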
DNA Dragnets
Given the strength of DNA in associating people with crimes, cops are always looking for DNA to fill their databases. Cops have begun holding DNA drives where people are handed free kits to fill out and have their cheeks swabbed. Often, the DNA drives are used to help identify human remains. Part of CODIS stores the DNA of unidentified remains. If someone is missing a sister or cousin, then that person’s DNA, once entered into CODIS, is compared against any remains in the database at the time or added later. But CODIS does not check the family DNA against crime DNA in the database.
The DNA gathered in cop dragnets is not always limited to CODIS. It may be added to state and/or local databases where it can be compared against crimes. It may also be submitted to databases that compare familial relationships. The DNA drive could have ostensibly targeted identifying human remains and missing persons, but once in cop hands, the DNA may not be limited to that search. Albert Scherr, a University of New Hampshire law professor, points out, “Historically, when the police get a hold of someone’s DNA, they don’t let go of it.” A recent DNA drive in DeKalb County, Georgia, submitted its volunteered DNA to CODIS and private genetic genealogy databases.
In addition to the “Spit and Acquit” dragnets like in Orange County, Newsweek has reported a “Knock and Spit” program by the NYPD. Others have been asked to consent to DNA swabs during traffic stops. Jamie Williams reported for the Electronic Frontier Foundation in 2017 that the San Diego Police Department was bypassing the restrictions of California’s permissive DNA collection law by using local databases. “[San Diego Police Department’s (SDPD’s)] policy seems to intentionally sidestep the minimal protections the California legislature built into California’s DNA collection law.” The cops found their way around it in order to collect the DNA of Black children.
Actual consent is a thorny enough question without involving children. In Schneckloth v. Bustamonte, 412 U.S. 218 (1973), Justice Marshall dissented, saying, “[A]ll the police must do is conduct what will inevitably be a charade of asking for consent. If they display any firmness at all, a verbal expression of assent will undoubtedly be forthcoming.”
He has since been proven correct. In 2021, Adam Schwartz of the Electronic Frontier Foundation found that “statistics on all traffic stops in Illinois—for 2015, 2016, 2017, and 2018—show that about 85% of white drivers and about 88% of minority drivers grant consent.” People have a natural tendency to comply with authority, making it difficult to evaluate the actual voluntariness of the consent, especially when they are pressed by a uniformed, armed officer who insists that a warrant will be obtained anyway. Many people may not even be aware of their right to refuse. Even if they are made aware of that right, they may still worry that exercising it makes them appear guilty.
Andrea Roth wrote in the Harvard Journal of Law and Technology in “The Many Revolutions of Carpenter” about a “Spit and Acquit” participant who said, “‘I’m not a murderer,’ meaning she saw no issue with giving up her DNA if she does not plan to commit a DNA-solvable offense.” The waiver she signed did not mention the possibility that contamination, innocent presence, or DNA transfer (like the bra clasp in the Knox case) could implicate her in a crime. She also may not be aware that her DNA can be used to triangulate a distant cousin she never met for an abortion prosecution. The unintended and unforeseen consequences of providing DNA samples are many, can be quite serious, and can ensnare even the innocent in a waking nightmare from which one may never escape once the unforgiving machinery of the criminal justice system is unleashed upon them.
Genetic Genealogy
The Golden State Killer’s serial rapes and murders had infamously gone cold until DNA evidence from one of the murders was uploaded to several commercial genetic genealogy databases, though police initially credited only GEDmatch. Barbara Rae-Venter, a retired patent attorney with a Ph.D. in Biology, helped the police to capture the murderer, although she originally did not want her participation to be made public.
The big break in the case came when investigators uploaded DNA from a double murder to MyHeritage, according to the Los Angeles Times. They were able to tie the DNA to a distant cousin who they used to generate an extensive family tree. Additionally, they developed phenotypes from the genetic material to provide clues, including the fact the suspect had blue eyes, early baldness, and specific health risks. Selecting from circumstantial evidence like presence in California at the time of the murders helped to identify the killer.
After at least one consent-based DNA submission of a family member and a dig through the trash for a discarded tissue, they landed on Joseph DeAngelo, a 72-year-old retired cop. The success of the technique has opened the floodgates for genetic genealogy, spawning hobbyist and professional genetic genealogists alike. Despite the rapid growth of the field, and some interim guidelines by the Department of Justice in 2019, even prominent genetic genealogist CeCe Moore admitted in 2023 that the previous five years had been the “Wild West.”
Genetic genealogy works because people willingly submit their DNA to websites like Ancestry.com and others. Their eagerness to learn of their heritage, disease profiles, and more also allows them to find lost family connections and sometimes disconnections such as childbearing infidelities. The access to so much data will always be appealing to cops looking for new ways to surveil citizens.
Not all sites allow cops unrestricted access. Ancestry.com and 23andMe both restrict police access to those with court orders. Both companies also produce transparency reports that detail cop access. The company MyHeritage, one of the key companies in the Golden State Killer case, also restricts police access, but it does allow users to upload DNA results from other websites. That gives unethical cops and genetic genealogists the opportunity to spoof the system by uploading anyone’s DNA results, through which they can peruse the generated connections among MyHeritage’s seven million profiles.
In contrast, GEDmatch and FamilyTreeDNA allow police and genetic genealogists access but give users the opportunity to opt out. While GEDmatch automatically opts its users out unless there is an explicit effort to opt in, FamilyTreeDNA takes the opposite approach and does not make it obvious how to opt out, according to Leah Larkin, a privacy advocate who once worked as a genetic genealogist but no longer practices. Larkin estimates that 700,000 profiles are opted in for GEDmatch, and that number is probably much greater for FamilyTreeDNA. Neither company advertises the numbers.
The police and genetic genealogists may not always have explicitly been granted permission to access the websites, but that has not stopped them. When Rae-Venter accessed MyHeritage, she did not notify the company of her intentions. She told the Los Angeles Times that her use of the system was approved by the FBI’s Los Angeles division counsel at the time, Steve Kramer. “In his opinion, law enforcement is entitled to go where the public goes,” she said.
Department of Justice guidelines say that genetic genealogy should only be used for crimes like rape and murder and only when all other reasonable avenues of investigation have failed, but those are just suggestions. CNN reported in May 2023 on how a U.S. marshal used genetic genealogy to solve an ancient prison break. It was a case that, as Larkin points out on her blog, DNA Geek, should never have used genetic genealogy in the first place.
In June of 2023, another violation of a company’s terms of service was reported. The Riverside County Regional Cold Case Homicide Team, working with the FBI, failed to get a court order as they accessed the MyHeritage website. Neither MyHeritage, the Riverside DA or Sheriff’s Office, nor the FBI felt a need to explain their positions when The Intercept asked about it. As Larkin explains on her blog, “The case presents an example of ‘noble cause bias,’ in which the investigators seem to feel that their objective is so worthy that they can break rules in place to protect others.”
There have also been times when the companies appear to have been complicit in the violation of their own terms of service or at least turned a blind eye to it. Curtis Rogers, cofounder of GEDmatch, allowed Utah police to access the full website to solve an assault case. Rogers excused his unilateral decision to violate the trust of his users who had opted out by telling BuzzFeed that it was a tough call with a case that “was as close to a homicide as you can get.”
The Intercept reports being told that genetic genealogists have sent sample kits to the possible relatives of suspects. In some cases, if the relative already had a profile on Ancestry.com, they simply asked for access to the account, taking advantage of the naiveté of the relatives who have no idea what they are giving up as the genealogists cynically bypass the courts.
Cops, and the genetic genealogists working with them, prefer to keep the use of genetic genealogy quiet or at least not subject the practice to scrutiny by the courts. That is the key reason they avoid getting the court orders needed to access Ancestry.com with its 23 million profiles. They really do not want the cases subjected to the crucible of court review. In the self-mythology CeCe Moore fed to Rafil Kroll-Zaidi for a New York Times Magazine article, Kroll-Zaidi states, “Rather than commit everything to notes, she leans heavily on a capacious working memory.” That is convenient. Furthermore, police argue that genetic genealogy is no different from a tip by an informant, so it does not require disclosure anyway.
Bryan Kohberger is charged with the brutal murders of four university students in Idaho in 2022. The probable cause statement used for his arrest never even mentions the genetic genealogy used to find him. For months afterwards, law enforcement refused to disclose its use. The state has reportedly refused to hand over any records despite requests from Kohberger’s defense team. According to the state, not only does the defense not have a right to the data, the work was performed by the FBI and “few records were generated in the process.”
They also argue that, as one might expect, when plundering the DNA of potentially millions of unsuspecting—yet somehow suspected—people, the family tree generated was extensive, including “the names and personal information of … hundreds of innocent relatives.” The state’s claim, then, that the privacy of those people needs to be protected seems to directly contradict their stand that some form of court order, which would have generated records as it protected people from an unwarranted search, was not needed.
One of Kohberger’s attorneys wrote, “It would appear that the state is acknowledging that the companies are providing personal information to the state and that those companies and the government would suffer if the public were to realize it. ‘The statement by the government implies that the databases searched may be ones that law enforcement is specifically barred from and do not want to disclose their methods.’”
GEDmatch posted a warning in 2018 that cops had been posting suspect DNA profiles to search the database. In order to avoid further bad press, it changed its website, so users are opted out from police searches by default. CeCe Moore told CNN that the new policy meant, “People will die.”
The Intercept, however, found communications between Moore and other genetic genealogists exploiting a loophole in GEDmatch’s website. It turned out that one or more of the reports they had been using prior to the change continued to work without regard to whether or not the search targets had opted in. In the communications, one of the genetic genealogists, Margaret Press, even talked about how her organization had hidden the fact that it had used the loophole.
Tiffany Roy, a DNA expert and attorney, told The Intercept, “If we can’t trust these practitioners, we certainly cannot trust law enforcement.” She went on to add, “These investigations have serious consequences; they involve people who have never been suspected of a crime.” Warrants, therefore, should be required, and “[a]nything less is a serious violation of privacy.”
The GEDmatch loophole is not obvious or one that is triggered by mistake. According to The Intercept, which was shown how to trigger it, “Rather, it was a backdoor that required experience with the platform’s various tools to open.” GEDmatch apparently attempted to patch the backdoor, after a hack in 2020 that opted in every account, and asked genetic genealogist Joan Hanlon to see if it worked. Among the communications The Intercept received, Moore and Margaret Press—cofounder of the DNA Doe Project—discussed with Hanlon how to trigger the backdoor. However, The Intercept was unable to determine whether anyone ever bothered to notify the company regarding their miss.
Moore emphasized to Kroll-Zaidi how she and her team at Parabon play by the rules and respect safeguards. After all, she told the inaugural conference at Ramapo College, one of two institutions to offer instruction in genetic genealogy, “With this incredibly powerful tool comes immense responsibility.” People would likely be alarmed to know that the cops along with their teams of armchair detectives are rifling through their most intimate data even if they have opted out of the cops’ view. Moore added, “If we lose public trust, we will lose this tool.” Unless, of course, no one acknowledges their involvement, how they tracked someone down, or even which databases they used. To those cops and hobbyist detectives, the civil rights and privacy of the people they capture as well as all the completely innocent people whose DNA is stored on these databases are secondary to society’s interest in deterring crime. Once again, we see noble cause bias infecting this entire investigatory approach to solving crimes.
Fractured Rules
These uses of genetic data fail to consider potential harms to society. They often assume that if a person has nothing to hide, they have nothing to fear. But that, as we have seen, is far from the truth. The Louisiana rape victim might not have submitted to the exam if she had known the cops would use her DNA to incarcerate her brother. Neither would the San Francisco woman have likely submitted to her own exam. Any person who participates in a genetic genealogy site, DNA drive, or a “Spit and Acquit” program is at the least doing what economists call “Discounting the Future.” They have no idea if they may one day participate in a retail theft, innocently deposit their DNA at a crime scene, or, like the woman who had a miscarriage, discover that their DNA was used to track them down because of a miscarried fetus in the waste water. They may also do harm to a family member who lives in another state as there remains a possibility that states may try to prosecute women for seeking abortions.
Despite the ongoing debate, most people profoundly overestimate the rules by which law enforcement is forced to play. Susan M. Wolf, a professor of law and health policy, told an audience at the University of Minnesota, “We’ve got 50 states. We’ve got multiple federal agencies involved.” She asked rhetorically, “What is the law of genomics?” The light patchwork of regulation means that genetic privacy is almost never regulated.
Access to genetic data is regulated by how it is generated and who is accessing it. If, for some reason, a person submits their data to a researcher and that data is stolen or hacked, they are protected from their DNA information being used in court. However, if it is part of your health record, the Health Insurance Portability and Accountability Act (“HIPAA”) allows the police to access your DNA without a warrant. Furthermore, your DNA can be used against you for long-term care insurance, life insurance, and disability insurance but not for your health insurance (at least until you become sick from a genetic illness). It is legal in every state, except California, for your condo association to demand a genetic screening for Alzheimer’s as part of an application, which may also be subjected to a search by law enforcement under the third-party doctrine.
Genetic genealogy offers another novel legal challenge. DNA that has been “de-identified” may be sold to researchers, data brokers, and others by genetic genealogy websites. The problem is made plain by the mere existence of genetic genealogy. DNA is identity in a way nothing else is. Location data, which can identify someone with just a few datapoints, can tell things like personal habits, religion, and entertainment preferences. However, DNA can single out odds for early-onset dementia, ancestral origins, skin color, and diabetic risks.
Parabon NanoLabs, the company employing CeCe Moore, has developed a tool that attempts to predict how a person looks at a certain age based on a DNA sample and then generates an image cops have passed through facial recognition technology. Unlike location data, a single DNA source is adequate to identify the approximately 850 people tied to that person’s DNA network. If location data implicates a person’s privacy rights, then might DNA potentially violate the privacy rights of an entire network?
The state of Maryland took a stand for genetic privacy in 2021. Lawmakers passed a comprehensive law requiring police to obtain a warrant before they do a genetic genealogy search. They must also show that all other investigative avenues have failed and notify the court when they plan to acquire the suspect’s DNA. Cops in Maryland are barred from mining garbage for used tissues, tailing someone to acquire discarded gum, or even offering a drink during an interrogation just to surreptitiously obtain “abandoned” DNA. Maryland law enforcement must also obtain consent before using a third party’s DNA to solve a crime within someone’s DNA network. That rule would have prevented a notorious example in Florida where cops lied to a suspect’s parents, claiming to need to obtain a DNA sample to identify a potentially dead relative.
The law was a great idea and served as a model for a couple of other states, though they did not go quite as far as Maryland. Neither has Maryland, apparently. The Maryland Department of Health had “quietly stopped implementing key parts” of the law by the following year. Part of the law required the government to disclose police access to ancestry data, but they failed to do so. They also suspended a task force working to iron out new regulations. The department simply made the decision to stop its implementation.
It appears that for fiscal year 2023, which extended through June of that year, the Office of Health Care Quality, which was tasked with implementing the law, was not given the funding it needed. At one point, the lab director of the Maryland State Police complained, “They have gone silent, and I’ve tried every avenue available to me to get some resolution without success.”
Some people see the possibilities for genetic genealogy and DNA databases as just coming of age. CeCe Moore, the actor/director and genetic genealogist, believes that it is already a settled question as to whether we have a right to genetic privacy. She believes 30 million people made the decision already, and we do not. Going forward, she would like to see the use expand to a near pre-crime status as she believes it can eradicate crime via earlier use in the crime solving process. “We can stop criminals in their tracks,” she told Ramapo College. “I really believe we can stop serial killers from existing, stop serial rapists from existing.”
Moore told Ramapo College that by the time people started worrying about genetic genealogy policy it was too late. “[W]e were allowed to build this powerful tool without interference.” She went on, “We are an army. We can do this! So, repeat after me: No more serial killers!” However, others lament with equal passion, “No more privacy!”
Sources: Gemini, TheIntercept.com, FBI.gov, findlaw.com, ACLUNC.org, Science.org, NYTimes.com, Wired.com, wustlawreview.org, nij.ojp.gov, Biometrica.com, oag.ca.gov, dps.texas.gov, kslnenewsradio.com, EFF.org, wmar2news.com