Linares Zaila, Yisleidy
(2017)
GEIR: a Full-Fledged Geographically Enhanced Information Retrieval Solution, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
PhD programme in Computer science and engineering, 29th cycle. DOI 10.6092/unibo/amsdottorato/8051.
Abstract
With the development of search engines (e.g. Google, Bing, Yahoo), people increasingly expect higher quality and improvements in current technologies. Bringing human intelligence features to these tools, such as the ability to find implicit information through semantics, is one of the most prominent research lines in Computer Science. Information semantics is a very wide concept, as wide as the human capability to interpret. In particular, the analysis of geographical semantics makes it possible to associate information with a place. It is estimated that more than 70% of all information in the world has some kind of geographic feature [Jones04]. In 2012, Ed Parsons, a GeoSpatial Technologist from Google, reported that between 30% and 40% of the user queries submitted to the Google search engine contain geographic references [Parsons12].
This thesis addresses the field of geographic information extraction and retrieval in unstructured texts. This process includes the identification of spatial features in textual documents, data indexing, the manipulation of the relevance of the identified geographic entities, and multi-criteria retrieval according to thematic and geographic information.
The main contributions of this work include a custom geographic knowledge base, built from the combination of GeoNames and WordNet; Natural Language Processing and knowledge-based heuristics for Toponym Recognition and Toponym Disambiguation; and a geographic relevance weighting model that supports non-spatial indexing and simple ranking combination approaches. The validity of each of these components is supported by practical experiments that show their effectiveness in different scenarios and their alignment with state-of-the-art solutions.
A further main contribution of this work is GEIR, a general-purpose GIR framework that includes implementations of the components described above and makes it possible to implement new ones and test their performance within an end-to-end GIR system.
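As an illustration of what a simple ranking combination of thematic and geographic relevance could look like, the following minimal Python sketch ranks documents by a weighted sum of the two scores. It is not taken from the thesis: the function name `combine_rankings`, the parameter `alpha`, and the example scores are hypothetical, and both score sets are assumed to be already normalized to the range [0, 1].

```python
from typing import Dict, List, Tuple


def combine_rankings(
    thematic: Dict[str, float],
    geographic: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank documents by a linear combination of two relevance scores.

    Hypothetical helper for illustration only. `alpha` is the weight of the
    thematic score; (1 - alpha) is the weight of the geographic score.
    Documents missing from one score set get 0.0 for that component.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    doc_ids = set(thematic) | set(geographic)
    combined = {
        doc_id: alpha * thematic.get(doc_id, 0.0)
        + (1.0 - alpha) * geographic.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Sort by descending combined score, breaking ties by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores for three documents (assumed normalized to [0, 1]).
thematic_scores = {"d1": 0.9, "d2": 0.4, "d3": 0.2}
geographic_scores = {"d1": 0.1, "d2": 0.8}

ranking = combine_rankings(thematic_scores, geographic_scores, alpha=0.5)
print([doc_id for doc_id, _ in ranking])
```

With equal weights, a document that is moderately relevant on both criteria can outrank one that is strong on the thematic criterion only. Setting `alpha` to 1.0 reduces the ranking to the purely thematic one.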
Document type
Doctoral thesis
Author
Linares Zaila, Yisleidy
Supervisor
PhD programme
Cycle
29
Coordinator
Disciplinary sector
Competition sector
Keywords
Geographic Information Retrieval
Toponym resolution
Geographic knowledge bases
GIR Framework
Geographic weighting models
URN:NBN
DOI
10.6092/unibo/amsdottorato/8051
Defence date
15 May 2017
URI