Nanda, Rohan
(2019)
Automated Identification of National Implementations of European Union Directives with Multilingual Information
Retrieval based on Semantic Textual Similarity, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
Dottorato di ricerca in
Law, science and technology, 31 Ciclo. DOI 10.6092/unibo/amsdottorato/8978.
Documenti full-text disponibili:
|
Documento PDF (English)
- Richiede un lettore di PDF come Xpdf o Adobe Acrobat Reader
Disponibile con Licenza: Salvo eventuali più ampie autorizzazioni dell'autore, la tesi può essere liberamente consultata e può essere effettuato il salvataggio e la stampa di una copia per fini strettamente personali di studio, di ricerca e di insegnamento, con espresso divieto di qualunque utilizzo direttamente o indirettamente commerciale. Ogni altro diritto sul materiale è riservato.
Download (2MB)
|
Abstract
The effective transposition of European Union (EU) directives into Member States is important to achieve the policy goals defined in the Treaties and secondary legislation. National Implementing Measures (NIMs) are the legal texts officially adopted by the Member States to transpose the provisions of an EU directive. The measures undertaken by the Commission to monitor NIMs are time-consuming and expensive, as they resort to manual conformity checking studies and legal analysis. In this thesis, we developed a legal information retrieval system using semantic textual similarity techniques to automatically identify the transposition of EU directives into the national law at a fine-grained provision level. We modeled and developed various text similarity approaches such as lexical, semantic, knowledge-based, embeddings-based and concept-based methods. The text similarity systems utilized both textual features (tokens, N-grams, topic models, word and paragraph embeddings) and semantic knowledge from external knowledge bases (EuroVoc, IATE and Babelfy) to identify transpositions. This thesis work also involved the development of a multilingual corpus of 43 directives and their corresponding NIMs from Ireland (English legislation), Italy (Italian legislation) and Luxembourg (French legislation) to validate the text similarity based information retrieval system. A gold standard mapping (prepared by two legal researchers) between directive articles and NIM provisions was prepared to evaluate the various text similarity models. The results show that the lexical and semantic text similarity techniques were more effective in identifying transpositions as compared to the embeddings-based techniques. We also observed that the unsupervised text similarity techniques had the best performance in case of the Luxembourg Directive-NIM corpus.
Abstract
The effective transposition of European Union (EU) directives into Member States is important to achieve the policy goals defined in the Treaties and secondary legislation. National Implementing Measures (NIMs) are the legal texts officially adopted by the Member States to transpose the provisions of an EU directive. The measures undertaken by the Commission to monitor NIMs are time-consuming and expensive, as they resort to manual conformity checking studies and legal analysis. In this thesis, we developed a legal information retrieval system using semantic textual similarity techniques to automatically identify the transposition of EU directives into the national law at a fine-grained provision level. We modeled and developed various text similarity approaches such as lexical, semantic, knowledge-based, embeddings-based and concept-based methods. The text similarity systems utilized both textual features (tokens, N-grams, topic models, word and paragraph embeddings) and semantic knowledge from external knowledge bases (EuroVoc, IATE and Babelfy) to identify transpositions. This thesis work also involved the development of a multilingual corpus of 43 directives and their corresponding NIMs from Ireland (English legislation), Italy (Italian legislation) and Luxembourg (French legislation) to validate the text similarity based information retrieval system. A gold standard mapping (prepared by two legal researchers) between directive articles and NIM provisions was prepared to evaluate the various text similarity models. The results show that the lexical and semantic text similarity techniques were more effective in identifying transpositions as compared to the embeddings-based techniques. We also observed that the unsupervised text similarity techniques had the best performance in case of the Luxembourg Directive-NIM corpus.
Tipologia del documento
Tesi di dottorato
Autore
Nanda, Rohan
Supervisore
Co-supervisore
Dottorato di ricerca
Ciclo
31
Coordinatore
Settore disciplinare
Settore concorsuale
Parole chiave
Text similarity, Transposition, European Law, Machine Learning, Concept Recognition
URN:NBN
DOI
10.6092/unibo/amsdottorato/8978
Data di discussione
28 Marzo 2019
URI
Altri metadati
Tipologia del documento
Tesi di dottorato
Autore
Nanda, Rohan
Supervisore
Co-supervisore
Dottorato di ricerca
Ciclo
31
Coordinatore
Settore disciplinare
Settore concorsuale
Parole chiave
Text similarity, Transposition, European Law, Machine Learning, Concept Recognition
URN:NBN
DOI
10.6092/unibo/amsdottorato/8978
Data di discussione
28 Marzo 2019
URI
Statistica sui download
Gestione del documento: