AI extracts personal information from anonymous data

Technology News |
By Rich Pell

Researchers at the Illinois Institute of Technology say they have extracted personal information – specifically protected characteristics such as age and gender – from anonymous cell phone data using machine learning and artificial intelligence algorithms, raising questions about data security. Their work, the researchers say, suggests that powerful machine learning classification methods can target individuals for personalized marketing, even in the absence of personally identifiable information (PII).

The researchers used data from a Latin American cell phone company to estimate the gender and age of individual users from their private communications with relative ease. They developed a neural network model that estimates gender with 67 percent accuracy, outperforming established techniques such as decision tree, random forest, and gradient boosting models by a significant margin. Using the same model, they were also able to estimate the age of individual users with 78 percent accuracy.
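The researchers' data set is private, so the comparison above cannot be reproduced directly. The sketch below is only an illustration of the general approach, assuming scikit-learn and a synthetic stand-in for per-user telemetry features: a small neural network is trained alongside the tree-based baselines the article mentions, and each model's test accuracy is reported.

```python
# Illustrative sketch only: synthetic "telemetry-like" features stand in for
# the private cell phone data; accuracies here do not reflect the study's.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Stand-in for per-user network telemetry (call counts, durations, data volumes...)
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# A small neural network versus the tree-based baselines named in the article
models = {
    "neural_net": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                                random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: {acc:.2f}")
```

On real telemetry, the equivalent of the binary label here would be a protected attribute such as gender, which is the crux of the privacy concern: the features themselves contain no PII.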

“Age and gender information does seem innocuous, but this information is used in nefarious ways by people, many times with devastating consequences,” says Matthew Shapiro, professor of political science. “When someone with bad intentions targets young children for anything, ranging from sales to sexual predation, it violates a number of laws designed to protect minors, such as the Children’s Online Privacy Protection Act and HIPAA. At the other end of the age spectrum, seniors are targeted by sophisticated spam and phishing efforts given their susceptibility and their access to savings.”

This information was inferred using commonly available computing equipment. The team ran the neural network model on a Linux (Fedora) laptop with 16 GB of memory and a four-core Intel i5-6200U CPU.

“The laptop we used for this work is not exclusive at all,” says Vijay K. Gurbani, research associate professor of computer science. “To a well-resourced adversary, there will be much more powerful machines available, including access to cluster computing, where multiple computers are configured in a cluster to provide the computer power for the AI/ML models.”

The data set used to conduct the research is not publicly available, but the researchers say an adversary could collect a similar data set by capturing data through public Wi-Fi hotspots or by attacking service providers’ computing infrastructure. The aim of their work, say the researchers, is to start a dialogue that critically examines the impact that emerging machine learning and AI techniques have on privacy regulations.

There are no nationwide privacy regulations in the United States, so the researchers looked at how these techniques chip away at the European Union’s General Data Protection Regulation articles, which are designed to protect consumers from the imminent threat of privacy violations.

“Machine learning and automated decision making will be a mainstream of business processes, and there is no escaping that reality,” says Gurbani. “The issue at hand is how to protect individual privacy as well as societal and economic interests from fraud using the appropriate regulatory framework.”

One way to do that, say the researchers, is to give consumers an “opt-out option” to keep their personal information private when installing an app. Their recommendations include using synthetic data rather than user observations for machine learning models, having data holders work with machine learning specialists to develop best practices, building a regulatory framework that lets users opt out of data sharing to keep personal information private, and updating existing non-compliance protocols.

In other words, say the researchers, there is more work to be done to address the policy gaps as well as the ethics of AI. For more, see “Predicting age and gender from network telemetry: Implications for privacy and impact on policy.”
