
The Potential Privacy Threat of Generative AI

by Michael Dean Thompson

Police are increasingly looking to corporations for help solving crimes. Google is accustomed to providing lists of users who happened to be physically near a crime. In addition, it has provided law enforcement with the identities of people who had the misfortune to search for certain keywords.

Google actively searches its customers’ files and emails to feed its ad generation machine. If that search turns up material that may be of interest to law enforcement, the material and its associated user are turned over as well. But Google isn’t the only tech company engaged in such activity. GM’s OnStar has not only been used to help solve crimes via geofence requests but has also actively participated in the capture of suspects. Uber, Lyft, Microsoft, and Apple are not immune to police requests either.

Meanwhile, corporations collect ever more data about their customers. Search histories, home and work addresses, entertainment preferences, the ads that were clicked through, driving habits, social media content, and more are all collected and stored by companies that would seem uninterested in that data. It may not seem obvious that a car manufacturer would download an automobile occupant’s texts and social media, yet many do. These collection activities expose billions of users to potential cyber-crimes, identity theft, and persistent surveillance.

Every new tool these corporations develop adds to the quality and quantity of data about the user so that the companies can build a better picture of who the person is. A search history alone can develop an outline of one’s interests, but it is far from complete. That’s where generative AI comes into play. Generative AI tools are capable of maintaining a complete conversation that develops a far more detailed picture. The conversation shows not just what interests the user but also how that user’s mind worked to create the conversation, including word choice, word frequency, and other measures used in textual criticism to identify authors. Furthermore, people in natural conversations are more likely to share personal data.
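To make the word-frequency idea concrete, here is a minimal sketch in Python of one simple measure of that kind: it compares two text samples by how often each uses a handful of common function words. It is an illustration only, not a description of what any company actually does. The names `FUNCTION_WORDS`, `profile`, and `cosine_similarity` are hypothetical, and the word list is an arbitrary assumption chosen for brevity.

```python
import math
import re
from collections import Counter

# Assumption: a tiny, arbitrary list of common function words.
# Real stylometric studies typically use far longer lists.
FUNCTION_WORDS = ("the", "and", "of", "to", "a", "in", "that", "it", "is", "i")


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Calling `cosine_similarity(profile(sample_a), profile(sample_b))` yields a score between 0 and 1, with higher values meaning the two samples use these words in more similar proportions. A long chat history gives such a comparison far more text to work with than a list of search terms does.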

Each of the generative AI products tracks how individual users use the tool and may access their precise location information. Microsoft Copilot states that it will collect log data (e.g., data that is generated through typical web traffic). A user interacting through the web, as they likely would with Bing or Google, should expect that of any site or app. The products will track any purchases made, files uploaded (for image generation, for example), regenerations, and the types of tasks performed.

OpenAI, Google, and Microsoft allow users to control some of the information that is kept. Google’s Bard maintains copies of conversations for a user-configurable 3 to 36 months, with a default of 18. While Google does appear to allow users to delete conversations, it should be noted that deleting Google’s location history has not proven to be a successful strategy due to a lag between when the deletion is requested and when it actually occurs. Google, at least, recommends that the user not include personal data in the conversations. OpenAI’s ChatGPT product allows users to turn off chat history, though it does say a copy will be kept for 30 days. Among the reasons given is to watch for abuse, implying that the 30-day copy is attached to a user account.

Not long after Google appeared, people began “googling” themselves. In the beginning, most of the information returned was fairly inane. Today, googling a name can return current and ancient phone numbers, addresses, criminal histories, and more. The things Google’s search engine returns can derail job prospects and make identity theft much more durable. Today, if Bard is queried with “Tell me about …” the typical person on the street, it declines with the claim that it does not know enough about that person. That bears the ring of a programmatic response, as Bard almost certainly knows at least as much about that person as the search engine does.

The question becomes, then, how long will it be before police begin demanding the ability to ask these questions? Would Google or Microsoft demand a warrant to tell the police the content of the conversations? Would a warrant be required for police to Bard you? When contractors get programmer access to generative AI, will they be able to ask these questions? These and many other questions must be asked and considered, with many of them destined for the courts to address.


Sources: Google Bard

As a digital subscriber to Criminal Legal News, you can access full text and downloads for this and other premium content.
