Lorusso, Marco
(2025)
Optimization of ML-Based BSM triggering with knowledge distillation for FPGA implementation in the CMS Level-1 trigger, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
PhD programme in Physics, 37th cycle. DOI 10.48676/unibo/amsdottorato/11891.
Abstract
The High Luminosity LHC (HL-LHC) Project, launched in 2010, aims to increase the luminosity of the Large Hadron Collider (LHC) at CERN in Geneva tenfold, extending its discovery reach and enabling more precise measurements. The higher collision rate and pileup will increase particle multiplicity and radiation, so the Trigger system must be improved to sustain its performance.
In this context, applications of Machine Learning, and of Artificial Neural Network (ANN) algorithms in particular, have grown rapidly because of their potential to make data processing in this experimental setting more efficient and more effective. However, a key challenge in ANN deployment is optimizing data processing for online applications, especially the selection of rare events at the trigger level, such as Beyond Standard Model (BSM) events. This study explores Autoencoders (AEs), which detect anomalies without theoretical priors. The stringent latency and energy constraints of the Level-1 Trigger of the Compact Muon Solenoid (CMS) experiment at CERN nevertheless require tailored software and hardware strategies, here focused on Field Programmable Gate Arrays (FPGAs). To address this, Knowledge Distillation (KD) is investigated: a well-trained AE "teacher" is used to train a compact "student" model suitable for FPGA implementation.
This distillation process can be optimized by refining student architecture, weight quantization, and hyperparameters to balance accuracy, latency, and hardware footprint.
The Offline Response-Based KD strategy for the teacher model is presented, including the performance differences observed when quantization is applied before or after selecting the best student architecture. The process of converting a Python-based model into FPGA firmware using hls4ml and proprietary FPGA software is also detailed. Additionally, Online Response-Based KD was explored, and preliminary results are provided.
Finally, because KD makes it possible to implement advanced algorithms on efficient hardware, a new teacher model based on a Graph Convolutional Neural Network AE was tested for anomaly detection.
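The offline response-based distillation idea summarized above can be illustrated with a deliberately small, self-contained sketch. It is not the thesis's implementation: the "teacher" here is a hypothetical fixed linear autoencoder on 2-D inputs, the "student" is a hypothetical three-weight model on quadratic features, and all names (`teacher_score`, `train_student`, and so on) are assumptions made for illustration. What it shows is only the general pattern: the frozen teacher's anomaly score (its reconstruction error) is computed once, and the student is then trained to reproduce that response.

```python
import random

# Hypothetical frozen "teacher": a linear autoencoder that keeps only the
# component of x along the unit vector U. Its anomaly score is the squared
# reconstruction error.
U = (0.6, 0.8)

def teacher_score(x):
    z = U[0] * x[0] + U[1] * x[1]          # encode to a 1-D latent value
    recon = (U[0] * z, U[1] * z)           # decode back to 2-D
    return (x[0] - recon[0]) ** 2 + (x[1] - recon[1]) ** 2

def features(x):
    # Quadratic features used by the hypothetical student.
    return (x[0] * x[0], x[0] * x[1], x[1] * x[1])

def student_score(w, x):
    return sum(wi * fi for wi, fi in zip(w, features(x)))

def distill_loss(w, data):
    # Mean squared difference between student and teacher responses.
    return sum((student_score(w, x) - teacher_score(x)) ** 2 for x in data) / len(data)

def train_student(data, epochs=300, lr=1.0):
    # Offline: the teacher is evaluated once and never updated.
    targets = [teacher_score(x) for x in data]
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, t in zip(data, targets):
            err = student_score(w, x) - t
            for i, f in enumerate(features(x)):
                grad[i] += 2.0 * err * f / len(data)
        # Plain full-batch gradient descent on the distillation loss.
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

rng = random.Random(0)
data = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(200)]

initial_loss = distill_loss([0.0, 0.0, 0.0], data)
weights = train_student(data)
final_loss = distill_loss(weights, data)
```

In this toy setting the teacher's score is exactly quadratic in the inputs, so the student can match it closely; with a realistic deep teacher the student only approximates the response, and the trade-off between accuracy, latency, and hardware footprint mentioned in the abstract becomes the quantity of interest.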
Document type
Doctoral thesis
Author
Lorusso, Marco
Supervisor
PhD programme
Cycle
37
Coordinator
Scientific-disciplinary sector
Competition sector
Keywords
CMS, LHC, CERN, Beyond Standard Model, Machine Learning, Artificial Neural Networks, Autoencoders, Knowledge Distillation, Field Programmable Gate Arrays, Anomaly Detection
DOI
10.48676/unibo/amsdottorato/11891
Defence date
28 March 2025
URI