5 Data Privacy Risks with Generative AI

As AI tools such as ChatGPT and DALL-E, eBay's image processing feature, DeepMind's AlphaFold and Amazon Go's innovations become increasingly prevalent, AI's rapid growth raises the question: will this expansion be a looming data protection challenge or an opportunity to prioritise data protection?

Business leaders have been quick to adopt generative AI to improve efficiency and relieve workloads. But with several countries debating generative AI’s safety, and Italy going as far as temporarily banning ChatGPT, companies should examine these data privacy risks to ensure they remain compliant.

Here are 5 key data privacy risks associated with generative AI.

Specified purpose

Article 5(1)(b) of the GDPR states that personal data shall be “collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes”.


This raises a concern because AI tools and language models are typically trained on data that was collected for other purposes, and can then put that data to almost any use.

Expanding data set

The ICO guidance on data minimisation states that you should “only process the personal data you need for your purpose” - but how does a vast and expanding data set comply with this principle?
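One way a team might approach this in practice is to strip each record down to the fields a stated purpose requires before it reaches any AI tool. The sketch below is illustrative only: the purpose names, the field names and the `ALLOWED_FIELDS` mapping are assumptions made for this example and are not taken from the ICO guidance.

```python
# Hypothetical mapping from a processing purpose to the fields it needs.
ALLOWED_FIELDS = {
    "order_support": {"order_id", "product", "issue"},
    "delivery": {"order_id", "postcode"},
}


def minimise(record: dict, purpose: str) -> dict:
    """Return a copy of record holding only the fields allowed for purpose."""
    if purpose not in ALLOWED_FIELDS:
        # Refuse to process data for a purpose nobody has specified.
        raise ValueError(f"No specified purpose: {purpose!r}")
    allowed = ALLOWED_FIELDS[purpose]
    return {key: value for key, value in record.items() if key in allowed}


# Made-up record, for illustration only.
record = {
    "order_id": "A123",
    "product": "kettle",
    "issue": "arrived damaged",
    "email": "jane@example.com",
    "postcode": "AB1 2CD",
}
print(minimise(record, "order_support"))
```

A filter like this does not make a tool compliant by itself, but it does force someone to write down the purpose and the fields it needs.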

Data quality and accuracy

The quality and accuracy of data produced by generative AI tools have been called into question multiple times after large-scale blunders, including when a US lawyer was caught citing in court fake cases that the AI had generated.


While mistakes are inevitable, Google’s Bard has prioritised correcting data within its vast data set, working to make sure the AI can unlearn mistakes and relearn answers in a live environment.

Security and confidentiality

Another issue with the quantity of data used for AI tools is security. Does a bigger data set make a tool more vulnerable to attack? If it were attacked, would it be harder to discern what had been affected?


If the information were already publicly available on the internet, there would theoretically be little value in holding it to ransom. That said, the data could still be maliciously skewed or manipulated.

AI must make fair assumptions

According to the ICO’s guidance on AI and data protection, you must ensure “the system is sufficiently statistically accurate and avoids discrimination; and you consider the impact of individuals’ reasonable expectations.”


The potential for AI tools to be discriminatory and even racist is a key concern for developers - as AI systems trained on data that reflects human biases or historical inequalities will learn and implement those same patterns.
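As a rough illustration of what checking for such patterns might look like, the sketch below compares how often a system returns a positive outcome for each group. The group labels, the sample data and the function names are all hypothetical, and this ratio is only one of several possible fairness measures; the ICO guidance does not prescribe it.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """Map each group to the share of its members with a positive outcome."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, approved in outcomes:
        totals[group] += 1
        if approved:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group
    return min(rates.values()) / highest


# Made-up (group, approved) pairs, for illustration only.
sample = [("A", True), ("A", True), ("A", False), ("A", True),
          ("B", True), ("B", False), ("B", False), ("B", False)]
rates = selection_rates(sample)
print(rates, disparity_ratio(rates))
```

A low ratio does not prove discrimination by itself, but it can flag where a system's outputs deserve closer human review.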