Advanced statistical and machine learning techniques for multi-omics data integration

Derus, Nicolas Riccardo (2025) Advanced statistical and machine learning techniques for multi-omics data integration, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Health Sciences and Technologies, 37th cycle.
Full-text documents available:
PDF document (English), derus_nicolas_tesi.pdf (3MB) - Restricted access until 31 December 2026
Available under license: Creative Commons Attribution 4.0 (CC BY 4.0).

Abstract

Multi-omics integration promises to revolutionize precision medicine by combining diverse molecular data types to comprehensively characterize disease. However, these datasets present significant challenges due to their high dimensionality and inherent heterogeneity across both patient populations and molecular information, leading to biased predictions that fail to capture true biological signals. This dissertation develops a framework for heterogeneity-aware multi-omics integration that explicitly models these sources of variability to improve both predictive accuracy and biological interpretability. Our approach centers on a shared frailty survival model using multitask learning to discover latent patient subgroups and provide heterogeneity-adjusted survival estimates. Moreover, we provide an analytical framework based on mutual information decomposition that reveals how different fusion architectures preserve information across modalities, demonstrating that certain integration approaches better capture cross-modal relationships. To ensure robust evaluation, we develop a comprehensive benchmarking platform that incorporates realistic heterogeneous conditions through both real-world datasets and controlled simulations. We validate our methods through application to the myelodysplastic syndrome–acute myeloid leukemia continuum, where our heterogeneity-aware approach identifies previously hidden patient subgroups with distinct molecular mechanisms and treatment responses. This work demonstrates that accounting for population heterogeneity is essential for accurate prediction and biological interpretation in multi-omics analysis, providing both analytical foundations and practical tools that enable the translation of complex molecular data into actionable clinical insights.

Document type
Doctoral thesis
Author
Derus, Nicolas Riccardo
Supervisor
PhD programme
Cycle
37
Coordinator
Disciplinary sector
Competition sector
Keywords
Multi-omics; Data integration; Survival analysis; Machine learning; Frailty models; Heterogeneity-aware modeling; Oncohematology; Synthetic data; Benchmarking
Defence date
4 November 2025
URI
