Legal knowledge extraction in the data protection domain based on Ontology Design Patterns

Leone, Valentina (2021) Legal knowledge extraction in the data protection domain based on Ontology Design Patterns, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Law, science and technology, 33rd cycle. DOI 10.48676/unibo/amsdottorato/9747.
Full-text document available: PDF (English)
Available under license: Creative Commons Attribution Non-commercial ShareAlike 4.0 (CC BY-NC-SA 4.0).


In the European Union, the entry into force of the General Data Protection Regulation (GDPR) has brought the domain of data protection to the forefront, encouraging research in knowledge representation and natural language processing (NLP). On the one hand, several ontologies have adopted Semantic Web standards to provide a formal representation of the data protection framework set by the GDPR. On the other hand, different NLP techniques have been used to implement services that help individuals understand privacy policies, which are notoriously difficult to read. However, few efforts have been devoted to mapping the information extracted from privacy policies to the conceptual representations provided by the existing ontologies that model the data protection framework. In the first part of the thesis, I present a comparative analysis of existing ontologies developed to model different legal fields, and I place it in the context of the Semantic Web. In the second part, I focus on the data protection domain and present a methodology that aims to bridge the gap between the multitude of ontologies released to model the data protection framework and the disparate approaches proposed to automatically process the text of privacy policies. The methodology relies on the notion of Ontology Design Pattern (ODP), i.e. a modelling solution to a recurrent ontology design problem. By implementing a pipeline that exploits existing vocabularies and different NLP techniques, I show how the information disclosed in privacy policies can be extracted and modelled through some existing ODPs. The benefit of this approach is a methodology for processing privacy policy texts that does not depend on any particular ontological model. Instead, it uses ODPs as a semantic middle layer that different ontological models can refine and extend according to their own ontological commitments.

Document type: Doctoral thesis
Author: Leone, Valentina
PhD programme: Law, science and technology
Keywords: ontology design patterns; legal knowledge representation; information extraction; data protection
Defence date: 28 May 2021
