Perrone, Gabriele
(2023)
Seemingly unrelated linear regression for contaminated data based on Gaussian mixtures, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
Dottorato di ricerca in
Scienze statistiche, 35 Ciclo. DOI 10.48676/unibo/amsdottorato/11003.
Documenti full-text disponibili:
|
Documento PDF (English)
- Richiede un lettore di PDF come Xpdf o Adobe Acrobat Reader
Disponibile con Licenza: Salvo eventuali più ampie autorizzazioni dell'autore, la tesi può essere liberamente consultata e può essere effettuato il salvataggio e la stampa di una copia per fini strettamente personali di studio, di ricerca e di insegnamento, con espresso divieto di qualunque utilizzo direttamente o indirettamente commerciale. Ogni altro diritto sul materiale è riservato.
Download (5MB)
|
Abstract
In this thesis, new classes of models for multivariate linear regression defined by finite mixtures of seemingly unrelated contaminated normal regression models and seemingly unrelated contaminated normal cluster-weighted models are illustrated. The main difference between such families is that the covariates are treated as fixed in the former class of models and as random in the latter. Thus, in cluster-weighted models the assignment of the data points to the unknown groups of observations depends also by the covariates. These classes provide an extension to mixture-based regression analysis for modelling multivariate and correlated responses in the presence of mild outliers that allows to specify a different vector of regressors for the prediction of each response. Expectation-conditional maximisation algorithms for the calculation of the maximum likelihood estimate of the model parameters have been derived. As the number of free parameters incresases quadratically with the number of responses and the covariates, analyses based on the proposed models can become unfeasible in practical applications. These problems have been overcome by introducing constraints on the elements of the covariance matrices according to an approach based on the eigen-decomposition of the covariance matrices. The performances of the new models have been studied by simulations and using real datasets in comparison with other models. In order to gain additional flexibility, mixtures of seemingly unrelated contaminated normal regressions models have also been specified so as to allow mixing proportions to be expressed as functions of concomitant covariates. An illustration of the new models with concomitant variables and a study on housing tension in the municipalities of the Emilia-Romagna region based on different types of multivariate linear regression models have been performed.
Abstract
In this thesis, new classes of models for multivariate linear regression defined by finite mixtures of seemingly unrelated contaminated normal regression models and seemingly unrelated contaminated normal cluster-weighted models are illustrated. The main difference between such families is that the covariates are treated as fixed in the former class of models and as random in the latter. Thus, in cluster-weighted models the assignment of the data points to the unknown groups of observations depends also by the covariates. These classes provide an extension to mixture-based regression analysis for modelling multivariate and correlated responses in the presence of mild outliers that allows to specify a different vector of regressors for the prediction of each response. Expectation-conditional maximisation algorithms for the calculation of the maximum likelihood estimate of the model parameters have been derived. As the number of free parameters incresases quadratically with the number of responses and the covariates, analyses based on the proposed models can become unfeasible in practical applications. These problems have been overcome by introducing constraints on the elements of the covariance matrices according to an approach based on the eigen-decomposition of the covariance matrices. The performances of the new models have been studied by simulations and using real datasets in comparison with other models. In order to gain additional flexibility, mixtures of seemingly unrelated contaminated normal regressions models have also been specified so as to allow mixing proportions to be expressed as functions of concomitant covariates. An illustration of the new models with concomitant variables and a study on housing tension in the municipalities of the Emilia-Romagna region based on different types of multivariate linear regression models have been performed.
Tipologia del documento
Tesi di dottorato
Autore
Perrone, Gabriele
Supervisore
Dottorato di ricerca
Ciclo
35
Coordinatore
Settore disciplinare
Settore concorsuale
Parole chiave
Contaminated Gaussian distribution, ECM algorithm, Mild outlier, Mixture of regression models, Model-based cluster analysis, Parsimonious model, Seemingly unrelated regression
URN:NBN
DOI
10.48676/unibo/amsdottorato/11003
Data di discussione
15 Giugno 2023
URI
Altri metadati
Tipologia del documento
Tesi di dottorato
Autore
Perrone, Gabriele
Supervisore
Dottorato di ricerca
Ciclo
35
Coordinatore
Settore disciplinare
Settore concorsuale
Parole chiave
Contaminated Gaussian distribution, ECM algorithm, Mild outlier, Mixture of regression models, Model-based cluster analysis, Parsimonious model, Seemingly unrelated regression
URN:NBN
DOI
10.48676/unibo/amsdottorato/11003
Data di discussione
15 Giugno 2023
URI
Statistica sui download
Gestione del documento: