Automated Identification of National Implementations of European Union Directives with Multilingual Information Retrieval based on Semantic Textual Similarity

Nanda, Rohan (2019) Automated Identification of National Implementations of European Union Directives with Multilingual Information Retrieval based on Semantic Textual Similarity, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Law, science and technology, 31st cycle. DOI 10.6092/unibo/amsdottorato/8978.

Full-text documents available:

PDF document (English) - Requires a PDF reader such as Xpdf or Adobe Acrobat Reader
Available under licence: Subject to any broader authorisations granted by the author, the thesis may be freely consulted, and a copy may be saved and printed for strictly personal purposes of study, research and teaching, with an express prohibition on any directly or indirectly commercial use. All other rights to the material are reserved.

Abstract

The effective transposition of European Union (EU) directives into the national law of Member States is important for achieving the policy goals defined in the Treaties and secondary legislation. National Implementing Measures (NIMs) are the legal texts officially adopted by the Member States to transpose the provisions of an EU directive. The measures the Commission takes to monitor NIMs are time-consuming and expensive because they rely on manual conformity-checking studies and legal analysis. In this thesis, we developed a legal information retrieval system that uses semantic textual similarity techniques to automatically identify the transposition of EU directives into national law at a fine-grained provision level. We modeled and developed a range of text similarity approaches: lexical, semantic, knowledge-based, embeddings-based and concept-based methods. The text similarity systems used both textual features (tokens, N-grams, topic models, word and paragraph embeddings) and semantic knowledge from external knowledge bases (EuroVoc, IATE and Babelfy) to identify transpositions. To validate the text-similarity-based information retrieval system, we also built a multilingual corpus of 43 directives and their corresponding NIMs from Ireland (English legislation), Italy (Italian legislation) and Luxembourg (French legislation). Two legal researchers prepared a gold standard mapping between directive articles and NIM provisions, which was used to evaluate the text similarity models. The results show that the lexical and semantic text similarity techniques were more effective at identifying transpositions than the embeddings-based techniques. We also observed that the unsupervised text similarity techniques performed best on the Luxembourg Directive-NIM corpus.
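To make the retrieval idea in the abstract concrete, the following is a minimal sketch of one lexical text similarity approach: a directive article is compared with candidate NIM provisions using TF-IDF weighted token vectors and cosine similarity, and the provisions are ranked by score. This is an illustration under stated assumptions, not the implementation used in the thesis. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_provisions`), the smoothed IDF formula and the toy texts are all hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (a dict) per tokenized text."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) so that terms
        # present in every text keep a non-zero weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_provisions(article, provisions):
    """Return (provision_index, score) pairs, most similar provision first."""
    token_lists = [tokenize(article)] + [tokenize(p) for p in provisions]
    vectors = tfidf_vectors(token_lists)
    query, candidates = vectors[0], vectors[1:]
    scores = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Invented toy texts, not taken from the thesis corpus.
article = "Member States shall ensure that employers keep records of working time."
provisions = [
    "The Minister may prescribe fees for the registration of vessels.",
    "An employer shall keep records of the working time of each employee.",
    "This Act may be cited as the Fisheries Act.",
]
ranking = rank_provisions(article, provisions)
for index, score in ranking:
    print(index, round(score, 3))
```

In this toy example the second provision shares the most vocabulary with the article and is ranked first. A purely lexical measure like this works only within one language; comparing a directive with Italian or French NIMs would require the same-language version of the directive or additional semantic resources such as those named in the abstract.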

Document type

Doctoral thesis

Author

Nanda, Rohan

Supervisor

Boella, Guido ; Van Der Torre, Leon ; Palmirani, Monica

Co-supervisor

Di Caro, Luigi

PhD programme

Law, science and technology

Cycle