Derus, Nicolas Riccardo
(2025)
Advanced statistical and machine learning techniques for multi-omics data integration, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
PhD programme in Scienze e tecnologie della salute (Health Sciences and Technologies), 37th cycle.
Abstract
Multi-omics integration promises to revolutionize precision medicine by combining diverse molecular data types to comprehensively characterize disease. However, these datasets present significant challenges due to their high dimensionality and inherent heterogeneity across both patient populations and molecular information, leading to biased predictions that fail to capture true biological signals. This dissertation develops a framework for heterogeneity-aware multi-omics integration that explicitly models these sources of variability to improve both predictive accuracy and biological interpretability. Our approach centers on a shared frailty survival model using multitask learning to discover latent patient subgroups and provide heterogeneity-adjusted survival estimates. Moreover, we provide an analytical framework based on mutual information decomposition that reveals how different fusion architectures preserve information across modalities, demonstrating that certain integration approaches better capture cross-modal relationships. To ensure robust evaluation, we develop a comprehensive benchmarking platform that incorporates realistic heterogeneous conditions through both real-world datasets and controlled simulations. We validate our methods through application to the myelodysplastic syndrome–acute myeloid leukemia continuum, where our heterogeneity-aware approach identifies previously hidden patient subgroups with distinct molecular mechanisms and treatment responses. This work demonstrates that accounting for population heterogeneity is essential for accurate prediction and biological interpretation in multi-omics analysis, providing both analytical foundations and practical tools that enable the translation of complex molecular data into actionable clinical insights.
Document type
Doctoral thesis
Author
Derus, Nicolas Riccardo
Supervisor
PhD programme
Cycle
37
Coordinator
Academic discipline
Competition sector
Keywords
Multi-omics; Data integration; Survival analysis; Machine learning; Frailty models; Heterogeneity-aware modeling; Oncohematology; Synthetic data; Benchmarking
Defence date
4 November 2025
URI