Stochastic Modeling and Statistical Properties of Biological Systems Inferred from Omics Data

Sala, Claudia (2017) Stochastic Modeling and Statistical Properties of Biological Systems Inferred from Omics Data, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Physics, 29th cycle. DOI 10.6092/unibo/amsdottorato/7810.
Full text: PDF document (English), 13MB.
License: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal study, research and teaching purposes; any direct or indirect commercial use is expressly forbidden. All other rights to the material are reserved.


In this thesis we aim to describe the dynamic processes that govern the evolution of two very different ecological systems. First, we consider the ensemble of bacteria that populate the intestine (Gut Microbiota, GM), which has been shown to have a great impact on human health and is associated with several metabolic and immunological diseases. Then, we deal with the set of protein domains encoded in the genomes of living organisms. In general, the neutrality hypothesis, which was proposed by Hubbell as the Ockham’s razor for ecology, is a respectable approximation for both the GM and the protein domain ecosystems. In the first case, a birth-death model that takes demographic noise into account is able to describe the population dynamics if we relax the neutrality assumption and consider two non-interacting niches, within each of which species equivalence holds. Interestingly, the biodiversity index derived from our modeling predicts healthy aging more accurately than common indices. When constructing the empirical Relative Species Abundance distribution (RSA) for the GM, a fundamental step is the clustering of particular DNA sequences (16S rRNA). This is a critical task that makes it possible to redefine the concept of species according to the phylogenetic tree. Here we introduce LOC-kNN, a parameter-free clustering algorithm recently developed by d’Errico et al., and adapt it for this purpose. LOC-kNN detects clusters as density peaks based on the dataset topography and, although it still has difficulty detecting small clusters, shows promising performance. Finally, as for the protein domain ecosystem, environmental noise should also be taken into account. This noise has a multiplicative effect and, together with the introduction of the Gompertzian death hypothesis, predicts a Poisson Log-Normal RSA. The model fits the protein domain RSA well and captures the dynamics of genome evolution, showing good agreement with the phylogenetic distances among bacteria.
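To make the Poisson Log-Normal RSA mentioned above more concrete, the following is a minimal sketch of how abundances with that distribution can be sampled: each species receives a log-normally distributed mean abundance, and its observed count is then Poisson distributed around that mean. This is not the thesis model itself (it does not simulate the Gompertzian death or the multiplicative environmental noise); the function names and the parameters `mu`, `sigma` and `seed` are illustrative assumptions.

```python
import random
from collections import Counter


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count by counting exponential arrivals in unit time."""
    count = 0
    elapsed = rng.expovariate(rate)
    while elapsed < 1.0:
        count += 1
        elapsed += rng.expovariate(rate)
    return count


def sample_pln_abundances(n_species, mu, sigma, seed=0):
    """Sample one abundance per species from a Poisson Log-Normal distribution.

    mu and sigma (hypothetical parameters) are the mean and standard
    deviation of the logarithm of each species' expected abundance.
    """
    rng = random.Random(seed)
    abundances = []
    for _ in range(n_species):
        rate = rng.lognormvariate(mu, sigma)  # log-normal mean abundance
        abundances.append(sample_poisson(rate, rng))  # Poisson sampling noise
    return abundances


def empirical_rsa(abundances):
    """Fraction of observed species (abundance > 0) at each abundance value."""
    observed = [n for n in abundances if n > 0]
    counts = Counter(observed)
    total = len(observed)
    return {n: c / total for n, c in sorted(counts.items())}
```

Species with zero counts are dropped in `empirical_rsa` because an unobserved species cannot appear in an empirical RSA; fitting such data would therefore normally use the zero-truncated form of the distribution.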

Document type: Doctoral thesis
Author: Sala, Claudia
PhD programme: Physics (29th cycle)
Keywords: Stochastic processes; Chemical Master Equation; Ecological Theory; Population Dynamics; Relative Species Abundance; Biodiversity; Neutrality; 16S rRNA; Operational Taxonomic Units; Clustering; Gut Microbiota; Protein Domains; Genome Evolution
Defence date: 22 March 2017
