Falcone, Roberta (2020) Supervised Classification with Matrix Sketching, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Statistical Sciences, 32nd cycle. DOI 10.6092/unibo/amsdottorato/9348.
Abstract
Matrix sketching is a recently developed data compression technique. An input matrix A is efficiently approximated by a smaller matrix B that preserves most of the properties of A up to a guaranteed approximation ratio. As a result, numerical operations on large data sets become faster. Sketching algorithms generally use random projections to compress the original data set, and this stochastic generation process makes them amenable to statistical analysis. The statistical properties of sketched regression algorithms have been widely studied. We study the performance of sketching algorithms in the supervised classification context, in terms of both misclassification rate and boundary approximation, as the degree of sketching increases. We also use sketching to address the issue of unbalanced classes, which hampers most common classification methods.
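As a rough illustration of the idea described in the abstract, the following sketch compresses a tall matrix A into a smaller matrix B with a random projection and checks how well B preserves the Gram matrix of A. It is only a minimal sketch under stated assumptions: the Gaussian projection is one common choice and is not necessarily the scheme used in the thesis, and the function name `gaussian_sketch`, the matrix sizes, and the seeds are hypothetical.

```python
import numpy as np


def gaussian_sketch(A, k, seed=0):
    """Return B = S @ A for a k x n Gaussian sketching matrix S.

    Assumption: S has i.i.d. N(0, 1/k) entries, so that E[S.T @ S] = I
    and B.T @ B is an unbiased estimate of A.T @ A.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    S = rng.standard_normal((k, n)) / np.sqrt(k)
    return S @ A


# Hypothetical data: 2000 observations on 5 variables.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 5))

# Compress the 2000 rows down to 400 sketched rows.
B = gaussian_sketch(A, k=400)

# Relative error (Frobenius norm) between the sketched and the original Gram matrix.
rel_err = np.linalg.norm(B.T @ B - A.T @ A) / np.linalg.norm(A.T @ A)
print(B.shape, rel_err)
```

A smaller k gives stronger compression but a larger expected error, which is the trade-off the abstract refers to as the degree of sketching.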
Document type
Doctoral thesis
Author
Falcone, Roberta
Supervisor
PhD programme
Cycle
32
Coordinator
Disciplinary sector
Academic recruitment field
Keywords
Multivariate Analysis; Supervised Classification; Matrix Sketching; Data Compression
URN:NBN
DOI
10.6092/unibo/amsdottorato/9348
Defence date
2 April 2020
URI