
FBI’s Bias for Keywords

by Carlos Difundo

In September 2021, then-Assistant Director for Counterterrorism Jill Sanborn told the Senate that the FBI did not monitor publicly available social media conversations. “It’s not within our authorities,” she said, adding that the First Amendment barred the bureau from doing so. That statement was wrong, according to a report the Senate put together in June 2023.

Prior to 2021, the FBI used a tool from Dataminr to scan social media. According to the company, the tool searched for a predetermined set of keywords. It did not perform any sort of link analysis, nor did it attempt to discover whether the scanned accounts were related by location or group. The company added that its tool does not perform any kind of surveillance, apparently meaning that it does not monitor specific accounts. That, however, seems a bit disingenuous. It is not difficult to find accounts whose posts are dominated by specific keywords such as “#BlackLivesMatter.” A well-defined set of keywords can be highly selective even without including an account name. The FBI claims its goal “is not to ‘scrape’ or otherwise monitor individual social media activity” but that it “seeks to identify an immediate alerting capability to better enable the FBI to quickly respond to ongoing national security and public safety-related incidents.”

Keyword-based search tools can turn up some gems at times. However, they can also reflect the biases of the people who define the keyword sets, as well as the biases of the agents who pore through the data and select “hits.” Eliminating false positives is always difficult, especially for FBI agents trying to cast a wide net. According to The Intercept, false positives had Dataminr alerting the U.S. Marshals Service to peaceful abortion rights protests, jokes about Donald Trump’s weight, and criticisms of the Met Gala.
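To see why raw keyword matching produces false positives like these, consider a minimal sketch. The watch list, the sample posts, and the function name below are all hypothetical; this is not Dataminr's actual method, which is not public, only an illustration of matching posts against a predetermined set of words.

```python
import re

# Hypothetical watch list; the real keyword sets are not public.
KEYWORDS = {"protest", "march", "shoot"}

def keyword_hits(post, keywords=KEYWORDS):
    """Return the sorted watch-list words that appear in a post."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    return sorted(words & set(keywords))

# Hypothetical posts: none of them describes a threat.
posts = [
    "Peaceful protest for abortion rights at noon, bring water",
    "Photo shoot downtown went great today",
    "Lovely weather for a picnic",
]

# Every post with at least one matching word becomes a "hit".
flagged = [p for p in posts if keyword_hits(p)]
```

In this sketch the first two posts are flagged even though neither is threatening. The matcher sees words, not intent, so the choice of words on the list decides who ends up in front of an agent.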

The FBI decided to move away from Dataminr and focus on a tool from ZeroFox. This was despite lamentations from agents and ZeroFox’s track record with the FBI. One FBI email closed by stating, “Dataminr is user friendly and does not require an expertise in social media exploitation.” Yet without that expertise, agents become more reliant on the tools they use to separate the wheat from the chaff. Keyword searches will always be inherently “noisy.” More importantly, it will always be fairly simple for bots to poison the results. Part of Google’s process for producing its search results is a rigorous link analysis meant to catch bot-based “spamming” of its search engine, yet some bots still break into the results. If the FBI is doing raw keyword searches, bots will be able to send agents on wild goose chases, especially if the predefined keywords are known.
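The poisoning problem can be sketched in a few lines. Everything here is an assumption made for illustration: the feed, the account names, the alert threshold, and the idea that an alert fires on mention volume. Counting distinct accounts is shown only as a crude stand-in for the kind of relationship analysis described above, not as ZeroFox's or Google's method.

```python
def raw_mentions(posts, keyword):
    """Count posts containing the keyword, one count per post."""
    return sum(keyword in text.lower() for _, text in posts)

def distinct_accounts(posts, keyword):
    """Count distinct accounts that used the keyword at least once."""
    return len({account for account, text in posts if keyword in text.lower()})

# Hypothetical feed of (account, text) pairs.
organic = [
    ("user_a", "Heading to the convention next week"),
    ("user_b", "Convention traffic is going to be awful"),
]
# One hypothetical bot repeating the same message 40 times.
bots = [("bot_1", "Disrupt the convention!")] * 40

feed = organic + bots
THRESHOLD = 10  # hypothetical alert threshold

raw = raw_mentions(feed, "convention")           # volume-based signal
accounts = distinct_accounts(feed, "convention")  # account-based signal
```

Here a single bot pushes the raw mention count past the threshold, while the distinct-account count stays well below it. Even so, a botnet with many accounts would defeat this simple check too, which is why keyword volume alone is a weak signal.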

Again, ZeroFox and the FBI have history. In 2015, ZeroFox took the bot-bait and decided that DeRay Mckesson and Johnetta Elzie, Black Lives Matter protest leaders, needed “continuous monitoring.” After trolls impersonating Elzie claimed that she planned to attend and violently disrupt the 2016 Republican National Convention in Cleveland, even though she was in New Orleans at the time, the FBI paid her parents a visit to discourage her from attending. Since then, ZeroFox claims to have improved its processes to include human analysis before an alert is forwarded to the client agency, but that still does not eliminate the problem. It merely shifts the workload away from the agents who must take ownership of the task, and it introduces new opportunities for bias to creep in, so that unforeseeable corporate and agency biases converge and multiply.




