Simplifying Data Protection with Artificial Intelligence

Compliance with the EU’s General Data Protection Regulation (GDPR) has become a source of concern for many companies. In fact, due to their nature and sensitivity, data protection policies need to be adapted – to a very high level of detail – in each individual case. However, being written in natural language, they often reflect some of its problems, such as ambiguity, incompleteness, and inconsistency. 

Checking policy compliance is a complex, time-consuming, and labour-intensive task, in which even the tiniest mistake can have far-reaching consequences. For businesses, that includes fines in the order of millions of euros. As companies worldwide struggle with GDPR compliance, there is an important need for cost-effective methods that can help them manage it more efficiently. 

The answer may lie in automation and artificial intelligence. Just as these technologies have revolutionised other industries over the last decade, their application to the data privacy economy could be the breakthrough companies need to stay on top of their data privacy compliance obligations, even as these obligations change and grow. The challenges, however, are significant. Automation based on natural language requires particularly sophisticated artificial intelligence, yet most automated tools currently applied to GDPR compliance are limited to simple keyword searches and make no attempt to understand or interpret the meaning of the document being analysed. This can result in serious oversights and relieves human end users of little more than the most preliminary work. And while breakthroughs in natural language processing, the branch of AI concerned with human language, are an excellent foundation for improving these tools, they are not yet up to the task of reliably automating data privacy policy compliance.

Team members Orlando Amaral Cejas, Sallam Abualhaija and Angelo Rizzi

A collaboration between SnT and global law firm Linklaters, launched in 2020 with the support of the Luxembourg National Research Fund (FNR), is advancing natural language processing research in order to bring sophisticated data privacy compliance automation tools to an industry that desperately needs them.

The industry-driven, FNR-funded project, entitled “Artificial Intelligence-enabled Automation for GDPR Compliance” (ARTAGO), is conducted under the supervision of Professor Lionel Briand, head of the Security Verification and Validation research group at SnT, alongside research scientist Dr. Sallam Abualhaija. The rest of the ARTAGO team comprises Orlando Amaral Cejas, Angelo Rizzi, and Dr. Muhammed Ilyas Azeem. In addition, a team from Linklaters, namely Katrien Baetens, Sylvie Forastier, and Catherine Freichel, contributed their legal expertise. Together they developed a brand-new tool, named CompAI, which checks the completeness of privacy policies in light of the GDPR. The tool works by parsing the content of the privacy policy, analysing the meaning of each piece of text, and sorting the information into categories. CompAI then combines the results of this analysis with additional information provided by the user through a questionnaire, and verifies the proposed privacy policy content against 23 GDPR requirements. The tool, which supports documents in .doc, .docx and PDF formats, produces a report listing which criteria are satisfied, which are violated, and which need correction, with a precision of 92.9%.
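
To make the workflow described above more concrete, here is a minimal Python sketch of a completeness check. It is not CompAI's implementation: the category names, the three sample requirements, the questionnaire key `transfers_outside_eu` and the cue-phrase matcher are all hypothetical stand-ins. In particular, the real tool classifies text with natural language processing and machine learning, while this toy version only looks for cue phrases so that the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable

# Toy stand-in for the classification step (assumption: the real tool uses
# learned models, not cue phrases). Maps a category to phrases that signal it.
CATEGORY_CUES = {
    "controller_contact": ("contact us", "data controller"),
    "retention_period": ("retain", "retention"),
    "transfer_safeguards": ("standard contractual clauses", "adequacy decision"),
}


@dataclass(frozen=True)
class Requirement:
    name: str        # human-readable label used in the report
    category: str    # category of policy text that satisfies the requirement
    applies: Callable[[dict], bool]  # decides applicability from questionnaire answers


# Three illustrative requirements only; the article mentions 23 but does not list them.
REQUIREMENTS = [
    Requirement("Controller contact details", "controller_contact", lambda a: True),
    Requirement("Retention period", "retention_period", lambda a: True),
    Requirement(
        "Safeguards for transfers outside the EU",
        "transfer_safeguards",
        lambda a: a.get("transfers_outside_eu", False),
    ),
]


def categorise(policy_text: str) -> set[str]:
    """Return the categories for which at least one sentence contains a cue phrase."""
    found = set()
    for sentence in policy_text.lower().split("."):
        for category, cues in CATEGORY_CUES.items():
            if any(cue in sentence for cue in cues):
                found.add(category)
    return found


def check_completeness(policy_text: str, answers: dict) -> dict[str, str]:
    """Build a report mapping each requirement name to a status string."""
    found = categorise(policy_text)
    report = {}
    for req in REQUIREMENTS:
        if not req.applies(answers):
            report[req.name] = "not applicable"
        elif req.category in found:
            report[req.name] = "satisfied"
        else:
            report[req.name] = "violated"
    return report


if __name__ == "__main__":
    policy = "You can contact us at privacy@example.com. We retain account data for two years."
    for name, status in check_completeness(policy, {"transfers_outside_eu": True}).items():
        print(f"{name}: {status}")
```

The questionnaire answers matter because some requirements only apply in certain situations: in this sketch, a policy that says nothing about transfer safeguards is flagged only when the user states that data leaves the EU.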

The team have developed a tool that simplifies GDPR compliance

“The solution we developed uses artificial intelligence and, in particular, a combination of natural language processing and machine learning. This means that we don’t represent words as textual entities, but use word embeddings. These are mathematical vectors that are generated with deep learning to represent syntactical and semantical characteristics of the sentence. This process provides the tool with a certain ‘understanding’ of the text,” says Dr. Abualhaija. It is this understanding that sets CompAI apart from keyword-based tools.
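
The difference between keyword matching and word embeddings can be illustrated with a small Python sketch. The three-dimensional vectors below are invented by hand purely for illustration; real embeddings are learned by a neural model and typically have hundreds of dimensions, and nothing here reflects the actual model used in CompAI.

```python
import math

# Hand-made toy vectors (assumption: illustrative values, not learned embeddings).
TOY_EMBEDDINGS = {
    "erase": (0.9, 0.1, 0.0),
    "delete": (0.8, 0.2, 0.1),
    "cookie": (0.0, 0.1, 0.9),
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: near 1 means similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def keyword_match(word_a, word_b):
    """What a plain keyword search sees: the strings are either identical or not."""
    return word_a == word_b


if __name__ == "__main__":
    for other in ("delete", "cookie"):
        sim = cosine_similarity(TOY_EMBEDDINGS["erase"], TOY_EMBEDDINGS[other])
        print(f"erase vs {other}: keyword match={keyword_match('erase', other)}, "
              f"cosine similarity={sim:.2f}")
```

A keyword search treats "erase" and "delete" as unrelated strings, whereas their vectors point in nearly the same direction. That is the sense in which embeddings give a tool a limited "understanding" of differently worded text.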


While CompAI is an impressive and much-needed addition to the toolbox of any company concerned about GDPR compliance, it is just the first of many tools the ARTAGO project has set out to develop. “We are now working on data processing agreements, which represent the next stage of our project,” said Abualhaija, the ARTAGO Project Lead. “It is inspiring to be working on solutions that will one day help to protect the privacy of millions of users,” she added. “We are proud to be at the forefront of this initiative together with SnT,” said Patrick Geortay, managing partner, Linklaters LLP Luxembourg, “and to take this step towards integrating AI in the future of legal practice.”

People & Partners in this Project

Lionel Briand
Sallam Abualhaija
Angelo Rizzi
Orlando Amaral Cejas
Muhammed Ilyas Azeem