AI and GDPR: A Data Privacy Nightmare

As AI reaches its tipping point with the explosion of ChatGPT, data protection experts warn that the technology could pose a serious threat to data privacy and the future of GDPR.

Generative AI chatbots have taken the world by storm over the last few months, with OpenAI’s ChatGPT alone reaching 100 million users since its launch, making it the fastest-growing consumer application ever.

Users are attracted to the AI-powered tool’s impressive use cases – from its ability to respond directly to almost any question or query, to its construction of articles and essays on almost any topic in seconds.

While many praise the technology’s capabilities, the rapidly advancing software is under serious scrutiny from data privacy experts, who worry that it raises ethical and legal concerns, including potential violations of data privacy laws.

AI systems systematically scrape hundreds of billions of words from the internet – whether this be from books, articles, websites or social media posts – including personal information obtained without consent.

According to the experts, this is an evident violation of privacy, since users cannot control or delete data that has been used to train the model.

“None of us were asked whether OpenAI could use our data,” Uri Gal, Professor of Business Information Systems at the University of Sydney, explained in The Conversation.

“This is a clear violation of privacy, especially when data are sensitive and can be used to identify us, our family members, or our location,” he added.

Several large corporations, such as JPMorgan Chase, Amazon and Accenture, have already restricted the use of ChatGPT over concerns about data privacy and the security of critical company assets.

Users share experts' concerns. In Cisco’s 2022 Consumer Privacy Survey, 60 per cent of consumers expressed concern about how organisations apply and use data in AI today, and 65 per cent said they had already lost trust in organisations over their AI practices.

A concern for GDPR regulators

The methods generative AI developers use to collect the data their chatbots are based on have yet to be publicly disclosed, but experts warn that the practice of simply trawling the internet for training data in itself goes against legal regulations.

In the EU, scraping data points from sites can be a breach of the GDPR (and UK GDPR), as well as the ePrivacy Directive and the EU Charter of Fundamental Rights.

With the GDPR we are building a European sovereignty on data. We have to do the same for the cloud, for AI, for innovation at large. #VivaTech

— Emmanuel Macron (@EmmanuelMacron) May 24, 2018

A recent example of this was Clearview AI, which used images scraped from the web to build its facial recognition software and was subsequently slapped with enforcement notices by data protection regulators at the end of last year.

Under GDPR law, the public also has the right to request that their personal data is removed from an organisation’s records entirely, through what is known as the “right to erasure.”

The problem with AI tools like ChatGPT and Microsoft’s Bing Chat is that they are based on language models made up of an array of personal and non-personal data, making it impossible to extract a single individual’s data from the model.

Language models also take and store data from a specific point in time, meaning that inaccurate or misleading information may be regurgitated in an AI chatbot’s responses even after it has been deleted from the internet.

This may breach the GDPR's "right to be forgotten", another name for the right to erasure, which allows people to have their personal data deleted if they wish.

It remains unclear if generative AI technologies officially break GDPR laws, but with the technology at its tipping point, experts believe legal action should be expected soon.

Are the GDPR and the DSA incompatible?

(Like the regulation mandating all AI decisions to be explainable and the rule making Automatic Emergency Braking Systems mandatory in all cars sold in the EU) https://t.co/YLIpduDnTO

— Yann LeCun (@ylecun) January 19, 2023

“GDPR would apply to all new AI technologies and platforms like ChatGPT, even if they were developed outside its territory. This is because it applies to any incidents that impact EU citizens,” Felipe Henao Brand, Senior Product Manager at Talend, told the publication.

“AI-driven innovation is closely linked to the Data Governance Act, the Open Data Directive and other initiatives under the EU strategy for data, which will establish trusted mechanisms and services for the re-use, sharing and pooling of data that are essential for the development of data-driven AI models of high quality,” he added.

AI’s copyright chaos

It’s not just data privacy concerns that have put generative AI in legal hot water with regulators over the past few months.

The technology has also recently come under fire over multiple allegations of copyright infringement, because its training data is scraped without the owners’ consent.

In many jurisdictions, using information without the owner's consent is permitted under certain circumstances, including news reporting, quoting, teaching, satire or research purposes.

While AI developers like OpenAI may be able to use this argument to defend collecting data without consent, problems arise when they monetise their products.

In January, Getty Images filed a lawsuit against Stability AI for allegedly stealing millions of copyright-protected images to train its AI image generator, Stable Diffusion.

Artists are standing up for their work. Getty Images is also doing a lawsuit.

Most artists are not against AI, but they are against having their work used without their permission & used for profit. It's about respect. Respect art and the artists. ❤️ https://t.co/fUyOS81UpC

— Agnes Garbowska (@AgnesGarbowska) January 17, 2023

OpenAI, the developer of ChatGPT, recently announced ChatGPT Plus, a paid subscription plan that will give customers ongoing access to the tool, faster response times and priority access to new features.

The plan is expected to contribute to projected revenue of $1 billion by 2024. Although AI technologies are new and laws specific to them have yet to be written, experts say that companies using AI-generated images for commercial purposes may be putting themselves at legal risk.

“Right now the legal minefield is still not packed with mines because legal has a tendency to follow after technological disruption. But the minefield is there, and it’s real,” Jonathan Løw, the co-founder of JumpStory, told SiliconRepublic.