Perils of Risk Assessment Tools in Criminal Justice
by Jayson Hawkins
Risk assessment tools have frequently been mentioned as useful aids in the push toward criminal justice reform, yet little research has been done concerning their accuracy, validity, or effectiveness in this area.
In response to legislation in several states that would make the use of such tools mandatory, the Partnership on AI (“PAI”) issued a report revealing the potential pitfalls of attempting to replace human judgment with an algorithm.
PAI is a nonprofit organization intended to facilitate open debate about the potential impact of artificial intelligence technologies on individuals and society as a whole. It consists of over 80 members who cover the spectrum, from AI research labs to civil society groups. Its goal is to make sure that such technology is utilized in the best way possible and in a manner that ultimately benefits humanity. Because risk assessment tools are considered basic forms of AI, PAI has an interest in how they will be employed.
The PAI report defines risk assessments as “statistical models used to predict the probability of a particular future outcome.” This is achieved by formulating a risk score based on an individual’s information (age, education, work and criminal history, etc.) which is measured against a database. The individual is then ranked in a bracket of risk either on a scale (1-10) or into broader categories like “high,” “medium,” or “low” risk.
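The scoring-and-bracketing step described above can be sketched in a few lines of code. This is a minimal illustration, not any actual tool’s method: the features, weights, and thresholds below are hypothetical, whereas real tools fit their weights to a reference database of past cases.

```python
def risk_score(age, prior_arrests, employed):
    # Hypothetical linear model: more priors raise the score,
    # while age and employment lower it.
    score = 5.0
    score -= 0.05 * (age - 18)          # risk declines with age
    score += 0.8 * prior_arrests        # each prior arrest adds risk
    score -= 1.0 if employed else 0.0   # employment lowers risk
    return max(1.0, min(10.0, score))   # clamp to a 1-10 scale


def risk_category(score):
    # The broader brackets mentioned in the report.
    if score >= 7:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


score = risk_score(age=25, prior_arrests=3, employed=False)
print(score, risk_category(score))
```

The point of the sketch is how mechanical the process is: once the weights are fixed, the same inputs always yield the same bracket, which is both the appeal (repeatability) and the risk (any flaw in the weights is applied uniformly to everyone scored).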
Recent legislation has focused on risk assessment applications in two primary areas of criminal justice decision-making. The first centers on pretrial bail decisions. The standard cash bail system has come under fire due to its outsized impact on disadvantaged and minority groups. Legislation like the California Bail Reform Act (S.B. 10) was enacted to eliminate cash bail in favor of using a risk assessment tool to determine pretrial release. It was this bill in particular that prompted PAI to issue its report, and the law has since been delayed pending the result of a 2020 ballot measure.
The other area is parole. The First Step Act of 2018 is a federal law that, among other things, requires the Attorney General to implement a new risk assessment system intended to keep the federal prison population at a manageable level. Other jurisdictions have also turned to risk assessments for parole determinations, though the results thus far have been a mixed bag.
The movement to reduce disproportionate rates of incarceration in the U.S. has turned to risk assessments for several reasons. Advocates for these tools have cited greater efficiency, lower costs, and more objective, repeatable results in the decision-making process as reasons to adopt the technology. It is hoped that ultimately risk assessments will create a more equitable system that imprisons fewer people.
Critics agree with the importance of reaching this goal, yet many remain skeptical about the potential of risk assessment tools in achieving it. Algorithms now in use have revealed problems with bias, accuracy, and validity; additionally, the tools are frequently applied to tasks for which they were not designed and seldom does transparency or adequate review exist. While steps toward improvement could be made to alleviate these concerns, many remain doubtful they could be resolved merely by building a better mousetrap.
PAI’s report details minimum requirements that must be met before its members would consider supporting the application of risk assessments to criminal justice decisions. These requirements hinge upon three factors—accuracy, validity, and bias—each of which has a specified meaning in the report.
Accuracy is defined as “the model’s performance compared to an accepted baseline or predefined correct answer based on the dataset available.” In other words, accuracy measures how close the risk assessment’s predictions are to what is expected, considering the information it is given. Although conventional wisdom holds that getting the answer right would be the most important factor, accuracy does not take into consideration whether the model is operating on the correct data or even whether the right question is being asked. An assessment’s prediction can be perfectly accurate yet fail to provide a relevant answer.
More crucial to the application of algorithmic tools is their validity, defined as their “fidelity to the real world”—that is, how closely their predictions answer the question actually being asked. A lack of validity may point to problems with either data or application. A tool designed to assess recidivism may not be valid in one location when operating on a nationwide dataset. Likewise, the same tool would not be valid if it were used to predict the chances of re-arrest in a pretrial context rather than following conviction and incarceration.
Bias, when related to risk assessments, means that a tool’s “predicted probabilities are systematically either too high or too low for specific subpopulations.” This factor most often becomes an issue when an assessment displays disparities in race, but it also applies to socio-economic classes, gender, age, and other demographics. It should be mentioned that the algorithms themselves are not capable of bias; however, they do reflect biases inherent in their programming, the data they are given to assess, or in how their predictions are interpreted.
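The report’s definition of bias—predicted probabilities that run systematically too high or too low for a subpopulation—suggests a simple calibration check: compare each group’s average predicted probability to its actual outcome rate. The sketch below illustrates the idea on fabricated data; the groups, scores, and outcomes are invented for illustration only.

```python
def calibration_gap(predictions, outcomes):
    """Mean predicted probability minus observed outcome rate.
    A positive gap means the tool's scores run too high for this group."""
    mean_pred = sum(predictions) / len(predictions)
    mean_obs = sum(outcomes) / len(outcomes)
    return mean_pred - mean_obs


# Hypothetical scored cases: (group, predicted probability, re-arrested? 1/0)
cases = [
    ("A", 0.7, 1), ("A", 0.6, 0), ("A", 0.8, 1), ("A", 0.5, 0),
    ("B", 0.7, 0), ("B", 0.6, 0), ("B", 0.8, 1), ("B", 0.5, 0),
]

for group in ("A", "B"):
    preds = [p for g, p, _ in cases if g == group]
    obs = [o for g, _, o in cases if g == group]
    print(group, round(calibration_gap(preds, obs), 2))
```

In this toy data both groups receive identical scores, yet group B is re-arrested far less often, so its gap is larger—exactly the kind of systematic over-prediction for a subpopulation that the report flags, and one that can arise from the training data rather than from any line of code.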
The picture that emerges from examining these factors is that risk assessments can only be as accurate, valid, and unbiased as the individuals who program them, select their data, and determine how they are to be used. Although the tools are intended to present a more fair and objective standard in making criminal justice decisions, human prejudice and fallibility are present at every stage of their development and implementation.
These shortcomings have sparked controversy over the use of risk assessments, especially when people’s freedom hangs in the balance. Such tools are unlikely to ever meet the ideal of absolute fairness, yet that does not mean they have no value. As a baseline for comparison, PAI suggests measuring the tools against existing systems, i.e., do risk assessments present “an improvement over current processes and human decision-makers?” Even if this is answered in the affirmative, members say a better question would be whether such tools mark “an improvement over other possible reforms to the criminal justice system.”
The conclusion to PAI’s report recommends the use of “maximal caution and humility in the deployment of statistical tools” in the criminal justice arena. While there is no consensus among its members as to whether risk assessments will one day be capable of making determinations where individual liberty is concerned, the membership is in complete agreement that the tools currently available are in no way, shape, or form ready to handle such a task.