Want to Learn a New Language? It Might Cost You Your Data


Millions of people turn to language-learning apps like Duolingo and Babbel to learn a new language, but many remain unaware of how data-hungry these apps actually are.

New research by the VPN provider Surfshark examined the data collection practices of ten of the most popular language-learning apps around the world, shedding light on how much sensitive data they gather.

It found that Duolingo is the most data-hungry of the popular learning apps, collecting 19 out of 32 potential data points, or nearly 60 per cent of all available user data.

This data includes information like email addresses, phone numbers, search history, and contacts, as well as many other notable data points. 

Following Duolingo is the language app Busuu, which collects 17 data points, and iHuman, which gathers 12.

Language apps data collection entry points. Source: Surfshark 

On the other end of the spectrum, some apps adopt a more privacy-centric approach. For example, EWA collects just 5 out of 32 data points, while HelloTalk and Mondly capture 7 and 8, respectively.

Despite its conservative data collection, however, HelloTalk surprisingly tracks users' precise locations – a feature not found in any of the other analyzed apps.

"As the new school and work season begins, many people are eager to learn a new language. However, in the pursuit of linguistic skills, it's easy to overlook a crucial aspect: the data-hungry nature of popular language-learning apps," said Agnieszka Sablovskaja, Surfshark's Lead Researcher.

"Knowing what type of data your beloved language learning app collects can lead to informed decisions regarding its usage," Sablovskaja added.

3rd-party data tracking

According to Surfshark, the problem with language-learning apps lies not only in the volume of data they collect but also in how that data is managed.

Many of these apps use collected data to track users, often by sharing it with third-party advertisers or even data brokers. Nine of the 10 apps examined use collected data for tracking, with an average of 3 data points used for this purpose.

Duolingo leads this category as well. It uses about two-thirds of the user data it collects (13 out of 19 data points) for tracking, which is roughly 4 times the average among the analyzed apps. The data points used for tracking include purchase history, location, and phone number.

The findings arrive in the wake of a recent data leak affecting over 2.6 million Duolingo users, including 967,000 users in the US alone.

The data was scraped from Duolingo through an application programming interface (API), revealing a mix of public and private information stored by the language app.


