Study: Bloodstain Pattern Analyses Display Alarming Lack of Accuracy
by Michael Fortino, Ph.D.
In the summer of 2000, after discovering his wife and two small children fatally shot in the family’s car inside their garage, former Indiana State Trooper David Camm was arrested and charged with the murders based entirely on the opinion of an expert that bloodstains on Camm’s shirt were consistent with those that would be found on whoever pulled the trigger. Despite a solid alibi and DNA evidence found at the scene that excluded him, he was convicted at trial.
But then Bloodstain Pattern Analysis (“BPA”), the 150-year-old forensic technique used in Camm’s case, suffered “a black eye,” as one BPA instructor put it, when the conviction was overturned four years later. The reversal came after prosecutorial misconduct was uncovered, which shed a more favorable light on conflicting testimony from BPA experts for the defense, who had argued that the bloodstains were transferred to Camm’s shirt when he moved the bodies of his children in hopes of saving at least one of them.
This problem of contradictory results from analysts trained in the same forensic technique now has a frequency assigned to it: it happens 7.8 percent of the time, thanks to the first-ever comprehensive study of BPA. Reported in the August 2021 volume of Forensic Science International, the study also documented an alarming error rate in this so-called science: BPA misidentified the true cause of a bloodstain in 11.2 percent of the cases where it was known.
BPA was first allowed by a U.S. court in State v. Knight, 43 Me. 11 (1857), when the Maine Supreme Court accepted testimony from a doctor about knife wounds found on Mary Knight that had been used to help convict her husband, George, of her fatal stabbing. BPA has since come to be recognized as the gold standard of scientific analysis in the evaluation of bullet or knife trajectory, struggle or confrontation settings, and, of course, bloodstain identifiers. It has decided the fate of many a suspect, some guilty, others innocent.
BPA is often relied upon to determine whether a death was a suicide or homicide, meaning an erroneous analysis could significantly mislead a jury. Analysts present conclusions with a high degree of certitude at a criminal trial, often with a life or death sentence for the accused at stake. But during the century and a half it has been in use, no large-scale study has rigorously assessed the accuracy and reproducibility of BPA conclusions—until now.
The new study was conducted by researchers from Noblis, a private firm based in Virginia, as well as Indiana University and the Kansas City Police Department’s crime lab. Participants were a diverse group of 75 practicing analysts trained in BPA, chosen randomly. Though they represented 14 nations, most—57 percent—were from the U.S. Generally, their BPA work is only one of their lab responsibilities. In practice, just under half—47 percent—reported performing fewer than five BPA cases per year. Yet 83 percent said they had testified in court as “expert witnesses” on BPA evidence.
Overall, participants identified a known cause correctly often enough to give BPA a predictive value of 83 percent, or 86.6 percent on the “most consequential” questions. That’s good, but not perfect, especially when dealing with life and death considerations.
Here’s what happened: Each participant was given samples to study from a pool of 192 bloodstain patterns, with instructions to assess how each one was caused. Called “pattern classification,” this process involves studying the direction, distance, and intensity of blood spatter. It was the only aspect of BPA fully considered in the study. In their regular work, analysts also rely on other factors, as well as facts relevant to each case, none of which were provided to the study’s participants. However, some other BPA aspects, such as reconstruction, were also involved in the study via prompts like “Was the decedent standing up when (the) bloodletting event occurred?”
Of the 192 samples, 123 were “controlled collection samples,” meaning the research team—though not the participants—knew how the pattern was actually produced. The remaining 69 samples were taken from actual casework, where the cause was not definitively known. The study could not determine the accuracy of these unknown samples, of course, only those for which a mechanism had previously been identified. But researchers were also concerned with how often and well different analysts reproduced one another’s results, and that analysis could be conducted on all sample types.
Before digging into the results, it should be noted that the vast majority of participants—72 percent—suggested that the samples they were given to evaluate were similar in the difficulty they posed to bloodstain samples encountered in their regular casework. Another 23 percent said the samples were harder to analyze than their typical casework.
All told, the 75 participants gave 33,005 multiple-choice and yes/no responses to a series of classification prompts and questions, which asked them to indicate whether a particular mechanism (e.g., stabbing) was either “definitive” or “excluded” for each sample. They also had the option to split the difference between those categorizations by saying the cause was neither definitive nor excluded but simply “included.” In addition, participants provided 1,760 “short-text” responses, which were more similar to their typical BPA work.
As to the question of accuracy, the research team found a high rate of error in the multiple-choice responses—11.2 percent. Even when researchers narrowed their analysis to those questions deemed “most consequential” if they were presented in an actual case, participants misclassified a pattern 9 percent of the time.
The team also found that of the short-text statements about a pattern for which there was a known cause—1,052 of the 1,760 total—participants “entirely contradicted” the known cause 4.8 percent of the time and “partially contradicted” it another 11.2 percent of the time.
Further, in assessing whether analysts’ observations and conclusions were adequately supported, researchers determined that 11.3 percent “had errors in reconstruction statements, observations, or unsupported conclusions.”
The research team concluded that participants exhibited “a continuum of performance errors,” with all of those who reviewed more than 50 samples making “multiple errors.” Of particular note were two analysts whose responses were anomalously incorrect—one alone was responsible for 5.7 percent of all the study’s errors—yet both regularly conduct BPA in their work and testify as BPA experts.
Results on the issue of reproducibility were even more disturbing, with participants providing identical answers to the same question or prompt just 54.6 percent of the time. This “Overall Agreement Rate” rose only to 56.3 percent when limited to the “most consequential” questions and prompts.
Not all participants studied the same samples, remember. But when the analysis was limited to those samples that everyone saw, the results were overwhelming and troubling: Participants reached different answers to the same questions and prompts 97 percent of the time.
Researchers also considered the “Overall Contradiction Rate,” which was the proportion of participants who had a diametrically opposed response—totaling 7.8 percent of all prompts, and 6.2 percent of those deemed “most consequential.”
This variance suggested a lack of consensus among participants that researchers found especially alarming, since two analysts each will often produce their own determination of the mechanism that created a particular blood pattern before court trials—much as the BPA experts for the prosecution and defense in David Camm’s murder trial came to opposite conclusions about the blood found on his shirt. As the study concluded, “[t]hese results suggest that if two BPA analysts both analyze a pattern (such as occurs operationally during technical review) they cannot always be expected to agree, and if they do agree they may both be wrong.”
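To make the two reproducibility measures concrete, here is a minimal Python sketch of one way such rates could be computed. It assumes a pairwise comparison, in which every pair of analysts who answered the same prompt is checked against each other; the study's actual method may differ. The function name, the response labels, and the toy data are hypothetical and are not drawn from the study.

```python
from itertools import combinations

# Hypothetical response labels, modeled on the options described in the article.
DEFINITIVE, INCLUDED, EXCLUDED = "definitive", "included", "excluded"


def pairwise_rates(responses_by_prompt):
    """Return (agreement_rate, contradiction_rate) over all analyst pairs.

    responses_by_prompt maps a prompt id to the list of responses that
    different analysts gave to that same prompt.
    """
    pairs = agree = contradict = 0
    for responses in responses_by_prompt.values():
        for a, b in combinations(responses, 2):
            pairs += 1
            if a == b:
                # Identical answers count as agreement.
                agree += 1
            elif {a, b} == {DEFINITIVE, EXCLUDED}:
                # Diametrically opposed answers count as contradiction.
                contradict += 1
    if pairs == 0:
        raise ValueError("need at least one prompt with two or more responses")
    return agree / pairs, contradict / pairs


# Made-up toy data for illustration only.
toy = {
    "sample1/stabbing": [DEFINITIVE, DEFINITIVE, EXCLUDED],
    "sample2/gunshot": [INCLUDED, EXCLUDED, EXCLUDED],
}
agreement, contradiction = pairwise_rates(toy)
print(f"agreement: {agreement:.1%}, contradiction: {contradiction:.1%}")
```

On this toy data, two of the six analyst pairs agree and two are diametrically opposed, so both rates come out to one third. Pairs where one analyst said “included” and the other said “excluded” count as neither, which is why the two rates need not add up to 100 percent.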
Not all errors were technical in nature, however. The research team qualified the results of the study by saying that many of the disagreements, and some of the errors, “may be attributed to semantic differences rather than contradictory interpretations.” For instance, there is “inadequate delineation” between “splash” and “drip” patterns, the team noted, and “some definitions are ambiguous.” How much blood is needed to call a pattern a “pool” rather than a “saturation stain,” for example? There is no generally accepted answer.
“Although some semantic disagreements would presumably be unlikely to have significant consequences in actual casework,” the researchers conceded, “their prevalence obscures the extent of serious disagreements.”
“This lack of agreement on the meaning and usage of BPA terminology and patterns,” the team warned, “illustrates the need for improved standards.”
Because of the limitations of their study’s methodology, researchers caution that their results “should not be taken to be precise measures of operational error rates.” To determine actual operational error rates would be nearly impossible as the vast majority of casework involves samples whose cause cannot be known with anything approaching certainty.
Unfortunately, many defendants’ lives depend on this crucial point in a criminal trial, when a jury is “enlightened” by this so-called science. The study, despite its limitations, casts serious doubt on the reliability of conclusions drawn by BPA experts, which courts use to determine guilt or innocence. In most of these cases, BPA evidence is an important factor—if not the only one—used to decipher the events surrounding a heinous violent crime.
Unlike other forensic disciplines such as DNA, fingerprinting, or even fiber analysis, BPA is used not to identify a suspect but rather to reconstruct the series of events that unfolded in a crime. Because of its relevance, BPA will remain an invaluable component of crime scene analysis, but it should not be the only component.
In fact, a 2009 National Academy of Sciences study cast doubt on the validity of using BPA exclusively in crime scene investigation, finding that “in general, the opinions of bloodstain pattern analysts are more subjective than scientific” and that “the uncertainties associated with [BPA] are enormous.”
A 2016 report by the President’s Council of Advisors on Science and Technology found that several commonly used forensic methods, including BPA, fail “an objective test of scientific validity” with “dismaying frequency.”
The conclusions drawn by BPA analysts should be viewed with intense suspicion, and the standards and protocols associated with this forensic science must be subjected to review and formidable revision using advanced technology. As an independent crime scene science, BPA by itself suffers the same lack of evidentiary certainty as “shaken-baby syndrome” and may end up relegated to the dustbin of history with other discredited, yet once glorified, forensic disciplines like bite-mark analysis.
Prosecutors zealous for a conviction sometimes end up letting the innocent be declared guilty, and that is the harm that can result when they are given free rein to use pseudo-science in court. In David Camm’s case, jurors called on to decide between conflicting BPA conclusions sided with those experts that prosecutors called to the stand. Unfortunately, the jury didn’t know that investigators had failed to search for a match to that DNA evidence found at the scene that excluded Camm. When the search was finally run, it returned a hit on a known criminal with a foot fetish (Mrs. Camm’s shoes were found removed from her feet and displayed on the car’s roof). He then cut a deal with prosecutors to finger Camm as the killer, sending Camm back to prison for another nine years before he was finally acquitted and freed in a third trial in 2013.
Simply put, we, as an advanced society, are better than this, and our defendants, regardless of society’s desire to prosecute, deserve evidentiary due process. After all, we are innocent until proven guilty and not the other way around.
Source: Accuracy and reproducibility of conclusions by forensic bloodstain pattern analysts, R.A. Hicklin, K.R. Winer, P.E. Kish, et al., Forensic Science International 325 (2021) 110856.
Additional sources: reason.com, sciencedirect.com, techdirt.com, investigatinginnocenceblog.com