Scientific description of the studies

We present our GDPR (General Data Protection Regulation) compliance auditing technology for mobile applications, based on research carried out within the AutoGDPR project, funded by the State Research Agency. The technology analyzes the behavior of Android applications and automatically processes their privacy policies using techniques based on Artificial Intelligence (AI). The scientific articles below describe the techniques used to assess compliance with three aspects of the GDPR, each concerning what the privacy policy discloses: the data controller, the third parties that receive personal data, and international transfers.

Method for identifying the data controller of an application

The article "ROI: a method for identifying organizations receiving personal data" develops and describes the ROI method, which identifies the organization responsible for the data processing behind a given domain. The method combines several techniques and identifies organizations receiving personal data with 95.71% accuracy. We evaluated 10,000 Android applications and found that almost 78% of them are not transparent about their data-sharing practices.

A key feature of ROI is its ability to identify the data controller named in a privacy policy. The method analyzes the policy with natural language processing (NLP) and named entity recognition (NER) techniques to extract the identity of the data controller. In simple terms, these techniques read and interpret the text of the privacy policy and automatically identify the organization responsible for the personal data. This process has been validated, achieving 93.34% accuracy in identifying the data controller.
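To make the idea concrete, the following is a minimal sketch of pulling a controller name out of policy text. It is not the ROI method: a crude regular expression stands in for the NLP and NER models, and the cue phrases, the function name `find_data_controller`, and the sample policy text are all hypothetical.

```python
import re

# Hypothetical cue phrase followed by a run of capitalized words.
# A trained NER model would replace this pattern in a real pipeline.
CONTROLLER_PATTERN = re.compile(
    r"[Dd]ata [Cc]ontroller\s+is\s+"
    r"([A-Z][\w&-]*(?:\s[A-Z][\w&-]*)*)"
)


def find_data_controller(policy_text):
    """Return the first organization name following the cue phrase, or None."""
    match = CONTROLLER_PATTERN.search(policy_text)
    if match is None:
        return None
    return match.group(1)


# Invented policy text, used only for illustration.
sample_policy = (
    "This policy explains how we use your data. "
    "The data controller is Example Apps Ltd. You can contact us by email."
)
controller = find_data_controller(sample_policy)
print(controller)
```

A pattern like this misses any policy that words the statement differently, which is precisely the limitation that motivates using learned NER models instead.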

Method for identifying third parties receiving personal data

The article "Sharing is Not Always Caring: Delving Into Personal Data Transfer Compliance in Android Apps" analyzes how transparent Android applications are about their personal data transfer practices. We applied an automated method to capture data-sharing practices and to assess whether they are adequately disclosed as the GDPR requires.

The method was validated against an annotated dataset, with reported F1 scores between 0.88 and 0.93. It uses the following techniques:

The study applied these techniques to 9,000 Android applications and found that more than 80% of the applications that transfer personal data off-device do not comply with GDPR transparency requirements. Additionally, 73.68% of undisclosed data transfers were initiated by third-party libraries, with Google, Unity, and Meta being the main recipients of the data.
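The core transparency check can be pictured as a comparison between the recipients observed in an application's traffic and the recipients named in its privacy policy. The sketch below illustrates that comparison only; the function name `undisclosed_recipients`, the case-insensitive matching rule, and the sample inputs are assumptions, not the published method.

```python
def undisclosed_recipients(observed, disclosed):
    """Return observed recipients that the policy does not name.

    Matching is case-insensitive on the trimmed name, which is a
    simplifying assumption made for this illustration.
    """
    disclosed_normalized = {name.strip().lower() for name in disclosed}
    return sorted(
        recipient
        for recipient in set(observed)
        if recipient.strip().lower() not in disclosed_normalized
    )


# Hypothetical inputs: recipients seen in traffic vs. named in the policy.
observed_in_traffic = ["Google", "Unity", "Meta", "Google"]
named_in_policy = ["google"]

missing = undisclosed_recipients(observed_in_traffic, named_in_policy)
print(missing)
```

An application would be flagged as non-transparent under this simplified rule whenever the returned list is not empty.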

Method for identifying international transfers

The article "Automated GDPR compliance assessment for cross-border personal data transfers in android applications" develops an automated method to assess whether Android applications comply with GDPR requirements for international transfers of personal data. The method was applied to 4,593 applications from the Google Play Store and found that nearly half of the applications that send personal data potentially do not comply with GDPR requirements.

The method combines different techniques:

These classifiers were evaluated on a held-out subset of the dataset that was not used during training. The effectiveness of each classifier was measured separately, with F1 scores ranging from 85.7% to 100%.
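For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from raw counts; the function name `f1_score` and the counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Compute F1 as the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0  # Precision or recall is undefined; report 0 by convention.
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one classifier on a held-out test set.
score = f1_score(true_positives=6, false_positives=1, false_negatives=1)
print(f"{score:.1%}")
```

Because F1 penalizes both missed cases and false alarms, it is a more informative summary than plain accuracy when the classes are imbalanced.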