D'Alberto, Riccardo (2017) Statistical Matching Imputation among Different Farm Data Sources, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Statistical Sciences (Scienze statistiche), 29th cycle. DOI 10.6092/unibo/amsdottorato/7788.
Full-text documents available:
PDF document (English)
Available under license: Unless the author grants broader permissions, the thesis may be freely consulted, and a copy may be saved and printed strictly for personal purposes of study, research and teaching, with an express prohibition of any directly or indirectly commercial use. All other rights to the material are reserved.
Abstract
This work addresses the challenge of integrating different data sources, covering both the statistical methodology and a practical application to farm data. It reviews the existing literature on Statistical Matching (SM) imputation, focusing on non-parametric micro “hot deck” techniques, which reduce the bias generated by model-based integration approaches. Implementing new combinations of these techniques with distance functions that are not commonly applied, we propose a strategy for validating the goodness of imputation (which is missing from the SM imputation literature) and corroborate the few common prescriptions found in the literature. Both the combinations of “hot deck” techniques and the validation strategy are applied to three different farm data sources covering a sample of farms in the Emilia-Romagna Region. Considering the issues that arise when integrating different farm data sources, we also propose a reference framework for harmonizing them. Then, on the basis of the new synthetic dataset generated through imputation, we run a Propensity Score Matching (PSM) analysis, showing the usefulness of applying SM imputation followed by PSM in the context of observational studies. The main research finding is significant evidence that the common prescription of the SM literature (i.e. that the largest donor-recipient dimensionality ratio always gives the best imputation results) can be relaxed when the matching variable(s) in the donor dataset have a “proper” variability. Indeed, even a narrower dimensionality ratio can produce optimal estimates, provided that the variance of the matching variable(s) in the recipient dataset is lower than in the donor dataset. Both the imputation goodness validation strategy and the reference framework for farm data harmonization constitute relevant research contributions.
With respect to the PSM application, we discuss the significant effect of farms' uptake of Agri-Environmental Schemes on the land rented in, in light of the agricultural economics literature.
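As a rough illustration of the distance “hot deck” idea mentioned in the abstract, the sketch below fills a variable that is missing from a recipient dataset with the value observed on the nearest donor record, where nearness is measured on the shared matching variables. It is a minimal sketch under stated assumptions and not the procedure used in the thesis: the function names, the variable names (`uaa_ha`, `labour_awu`, `rented_ha`), the toy figures and the choice of Manhattan and Euclidean distances are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, float]


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def distance_hot_deck(
    recipients: List[Record],
    donors: List[Record],
    match_vars: List[str],
    target_var: str,
    distance: Callable[[Sequence[float], Sequence[float]], float] = manhattan,
) -> List[Record]:
    """Return copies of the recipient records with target_var imputed
    from the nearest donor (unconstrained: a donor may be reused)."""
    if not donors:
        raise ValueError("donor dataset is empty")
    completed = []
    for rec in recipients:
        rec_key = [rec[v] for v in match_vars]
        # Pick the donor closest to this recipient on the matching variables.
        nearest = min(
            donors,
            key=lambda d: distance(rec_key, [d[v] for v in match_vars]),
        )
        filled = dict(rec)  # copy, so the input records are not modified
        filled[target_var] = nearest[target_var]
        completed.append(filled)
    return completed


# Hypothetical toy data: the donors observe rented land, the recipients do not.
donors = [
    {"uaa_ha": 10.0, "labour_awu": 1.0, "rented_ha": 2.0},
    {"uaa_ha": 50.0, "labour_awu": 2.5, "rented_ha": 20.0},
    {"uaa_ha": 120.0, "labour_awu": 4.0, "rented_ha": 60.0},
]
recipients = [
    {"uaa_ha": 12.0, "labour_awu": 1.2},
    {"uaa_ha": 100.0, "labour_awu": 3.5},
]

synthetic = distance_hot_deck(
    recipients, donors, ["uaa_ha", "labour_awu"], "rented_ha"
)
```

In practice the matching variables would usually be rescaled before computing distances, since otherwise the variable with the largest numeric range dominates the match; the sketch omits this step for brevity.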
Document type
Doctoral thesis
Author
D'Alberto, Riccardo
Supervisor
Co-supervisor
PhD programme
Cycle
29
Coordinator
Scientific disciplinary sector
Competition sector
Keywords
statistical matching, hot deck, imputation, propensity score, FADN
URN:NBN
DOI
10.6092/unibo/amsdottorato/7788
Defence date
15 February 2017
URI