Adebayo, Kolawole John
(2018)
Multimodal Legal Information Retrieval, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
PhD programme in Law, science and technology, 30th cycle. DOI 10.6092/unibo/amsdottorato/8634.
Full-text documents available:
PDF document (English)
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly forbidden. All other rights to the material are reserved.
Abstract
The goal of this thesis is to present a multifaceted way of inducing semantic representations from legal documents and of accessing information in a precise and timely manner. The thesis explores approaches to semantic information retrieval (IR) in the legal context with a technique that maps specific parts of a text to the relevant concept. The technique segments the text using Latent Dirichlet Allocation (LDA), a topic modeling algorithm, expands each concept using Natural Language Processing techniques, and then associates the text segments with the concepts using a semi-supervised text similarity technique. This addresses two problems, user specificity in query formulation and information overload, because querying a large document collection with a set of concepts is more fine-grained: specific information, rather than full documents, is retrieved. The second part of the thesis describes our Neural Network Relevance Model for E-Discovery Information Retrieval. Our algorithm is essentially a feature-rich ensemble system in which different component neural networks extract different relevance signals. This model has been trained and evaluated on the TREC Legal Track 2010 data. The performance of our models across the board shows that they capture the semantics and relatedness between query and document, which is important to the Legal Information Retrieval domain.
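As a rough illustration of the concept-to-segment mapping described in the abstract, the sketch below assigns each text segment to the concept whose expanded term list it most resembles. It is not the method of the thesis: the thesis segments text with LDA and uses a semi-supervised similarity technique, whereas here the segments are given in advance, concept expansion is a hand-written term list, and similarity is plain cosine over word counts. All names (`CONCEPTS`, `bag_of_words`, `best_concept`) and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical concepts, each "expanded" by hand into related terms.
CONCEPTS = {
    "data protection": ["data", "protection", "privacy", "personal", "processing"],
    "taxation": ["tax", "taxation", "duty", "vat", "levy"],
}

def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_concept(segment, concepts=CONCEPTS):
    """Return (concept, score) for the concept that best matches the segment."""
    seg_vec = bag_of_words(segment)
    scored = {name: cosine(seg_vec, Counter(terms)) for name, terms in concepts.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]

# Segments are assumed to be already split; the thesis uses LDA for this step.
segments = [
    "The processing of personal data shall respect the privacy of individuals.",
    "Member States may apply a reduced rate of VAT or another levy.",
]
for seg in segments:
    print(best_concept(seg))
```

Querying with a concept then amounts to returning only the segments assigned to it, rather than whole documents.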
Document type
Doctoral thesis
Author
Adebayo, Kolawole John
Supervisor
Co-supervisor
PhD programme
Cycle
30
Coordinator
Academic discipline
Academic recruitment field
Keywords
CNN, Concept-based IR, E-Discovery, Eurovoc, EurLex, Information Retrieval, Semantic Annotation, Semantic Similarity, LDA, LSTM, NLP, Neural-Networks, Text-Segmentation, Topic-Modeling
URN:NBN
DOI
10.6092/unibo/amsdottorato/8634
Defence date
27 April 2018
URI