Learning with Kernels on Graphs: DAG-based kernels, data streams and RNA function prediction.

Navarin, Nicolò (2014) Learning with Kernels on Graphs: DAG-based kernels, data streams and RNA function prediction., [Dissertation thesis], Alma Mater Studiorum Università di Bologna. Dottorato di ricerca in Informatica, 26 Ciclo. DOI 10.6092/unibo/amsdottorato/6578.
Documenti full-text disponibili:
Documento PDF (English) - Richiede un lettore di PDF come Xpdf o Adobe Acrobat Reader
Download (2MB) | Anteprima


In many application domains data can be naturally represented as graphs. When the application of analytical solutions for a given problem is unfeasible, machine learning techniques could be a viable way to solve the problem. Classical machine learning techniques are defined for data represented in a vectorial form. Recently some of them have been extended to deal directly with structured data. Among those techniques, kernel methods have shown promising results both from the computational complexity and the predictive performance point of view. Kernel methods allow to avoid an explicit mapping in a vectorial form relying on kernel functions, which informally are functions calculating a similarity measure between two entities. However, the definition of good kernels for graphs is a challenging problem because of the difficulty to find a good tradeoff between computational complexity and expressiveness. Another problem we face is learning on data streams, where a potentially unbounded sequence of data is generated by some sources. There are three main contributions in this thesis. The first contribution is the definition of a new family of kernels for graphs based on Directed Acyclic Graphs (DAGs). We analyzed two kernels from this family, achieving state-of-the-art results from both the computational and the classification point of view on real-world datasets. The second contribution consists in making the application of learning algorithms for streams of graphs feasible. Moreover,we defined a principled way for the memory management. The third contribution is the application of machine learning techniques for structured data to non-coding RNA function prediction. In this setting, the secondary structure is thought to carry relevant information. However, existing methods considering the secondary structure have prohibitively high computational complexity. We propose to apply kernel methods on this domain, obtaining state-of-the-art results.

Tipologia del documento
Tesi di dottorato
Navarin, Nicolò
Dottorato di ricerca
Scuola di dottorato
Scienze e ingegneria dell'informazione
Settore disciplinare
Settore concorsuale
Parole chiave
machine learning, kernel methods, graph kernels, online learning, bioinformatics
Data di discussione
19 Maggio 2014

Altri metadati

Statistica sui download

Gestione del documento: Visualizza la tesi