Profiti, Giuseppe (2015) Graph algorithms for bioinformatics, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Computer Science (Informatica), 27th cycle. DOI 10.6092/unibo/amsdottorato/6914.
Abstract
Biological data are inherently interconnected: protein sequences are connected to their annotations, the annotations are structured into ontologies, and so on. While protein-protein interactions are already represented as graphs, this work shows how a graph structure can also be used to enrich the annotation of protein sequences, using algorithms that analyze the graph topology. It also describes a novel solution that restricts the data generation needed to build such a graph by combining constraints on the data with dynamic programming. In the ideal case, the proposed algorithm improves the generation time by a factor of 5. The graph representation is then exploited to build a comprehensive database using the emerging technology of graph databases. While graph databases are widely used for other kinds of data, from Twitter tweets to recommendation systems, their application to bioinformatics is new. A graph database is proposed, with a structure that can be easily expanded and queried.
Document type
Doctoral thesis
Author
Profiti, Giuseppe
Supervisor
PhD programme
Doctoral school
Information Science and Engineering (Scienze e ingegneria dell'informazione)
Cycle
27
Coordinator
Disciplinary sector
Competition sector
Keywords
graph database, protein sequence annotation, community detection
URN:NBN
DOI
10.6092/unibo/amsdottorato/6914
Defence date
4 June 2015
URI