Enhancing big data analysis in two-wheeled vehicles through machine learning techniques

Pennino, Federico (2026) Enhancing big data analysis in two-wheeled vehicles through machine learning techniques, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Computer science and engineering, 38th cycle. DOI 10.48676/unibo/amsdottorato/12582.
Full-text documents available:
tesi_finale.pdf: PDF document (English)
Licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

Modern motorcycles generate vast, high-dimensional data streams that hold immense potential but whose raw, complex, and largely unlabeled nature presents a significant barrier to extracting meaningful intelligence. This thesis introduces a suite of self-supervised learning techniques to systematically transform this data into structured and interpretable knowledge. The research begins by addressing rider identification, employing a triplet loss framework on weakly supervised data to create discriminative “behavioral fingerprints” and impose a first layer of semantic structure. Building on this, a contextual contrastive learning framework is developed to understand the dynamic nature of a ride by modeling the temporal relationships between adjacent segments, thereby capturing sequential riding patterns. The analysis then models the direct interplay between rider and vehicle using a novel dual-encoder architecture that learns a shared latent space for rider inputs and vehicle states, explicitly capturing their relationship. To address the practical challenge of efficient data retrieval at scale, the thesis introduces Trajectory-Embedded Matryoshka Representation Learning, which creates nested, multi-scale embeddings to accelerate similarity searches without compromising performance. Finally, the work challenges the “more data is better” paradigm with a data-centric optimization framework. By leveraging a foundation model to curate smaller, higher-quality “training diets,” this approach demonstrates superior performance on a downstream time-series forecasting task. Collectively, these contributions demonstrate a methodological progression, from structuring raw signals and learning complex behaviors to optimizing data retrieval and engineering the training set itself, providing a comprehensive framework for turning unstructured sensor data from two-wheeled vehicles into actionable intelligence.
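
As a rough illustration of two ideas named in the abstract, the sketch below shows a generic hinge-style triplet loss and a two-stage similarity search over nested embeddings, where a short prefix of each vector builds a shortlist and the full vector reranks it. This is not the thesis's implementation: the function names (`triplet_loss`, `cosine`, `nested_search`), the margin value, the use of squared Euclidean distance and cosine similarity, and the toy vectors are all assumptions made for illustration, and the trajectory-specific parts of the proposed method are not modeled.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic hinge triplet loss on squared Euclidean distances.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed hyperparameter).
    """
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def cosine(a, b, dims=None):
    """Cosine similarity using only the first `dims` coordinates.

    With dims=None the full vectors are compared.
    """
    a, b = a[:dims], b[:dims]
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def nested_search(query, corpus, coarse_dims, shortlist, top_k):
    """Two-stage retrieval over nested (prefix-usable) embeddings.

    Stage 1 ranks every item using only the first `coarse_dims`
    coordinates and keeps `shortlist` candidates; stage 2 reranks
    those candidates with the full vectors and returns `top_k` indices.
    """
    coarse = sorted(
        range(len(corpus)),
        key=lambda i: cosine(query, corpus[i], coarse_dims),
        reverse=True,
    )[:shortlist]
    fine = sorted(coarse, key=lambda i: cosine(query, corpus[i]), reverse=True)
    return fine[:top_k]


if __name__ == "__main__":
    # Toy 4-dimensional embeddings (hypothetical data).
    corpus = [
        [1.0, 0.0, 0.0, 0.0],
        [0.9, 0.1, 0.5, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 1.0],
    ]
    query = [1.0, 0.0, 0.0, 0.0]
    print(nested_search(query, corpus, coarse_dims=2, shortlist=3, top_k=2))
    print(triplet_loss([0.0, 0.0], [1.0, 0.0], [0.0, 0.0]))
```

In this sketch the coarse stage scores every item on 2 of the 4 dimensions and only the shortlisted items are scored on the full vector, which is the general reason nested embeddings can reduce search cost. Whether the shortlist preserves the final ranking depends on how well the prefix dimensions were trained to stand on their own.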

Document type
Doctoral thesis
Author
Pennino, Federico
Supervisor
Co-supervisor
PhD programme
Cycle
38
Coordinator
Scientific-disciplinary sector
Competition sector
Keywords
self-supervised learning, contrastive learning, rider identification, time-series sensor data, representation learning, similarity search, data-centric optimization
DOI
10.48676/unibo/amsdottorato/12582
Defence date
25 March 2026
URI
