Optimization of ML-Based BSM triggering with knowledge distillation for FPGA implementation in the CMS Level-1 trigger

Lorusso, Marco (2025) Optimization of ML-Based BSM triggering with knowledge distillation for FPGA implementation in the CMS Level-1 trigger, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Physics, 37th cycle. DOI 10.48676/unibo/amsdottorato/11891.

Full-text documents available:

PDF document (English) - Requires a PDF reader such as Xpdf or Adobe Acrobat Reader
Available under licence: Creative Commons: Attribution - NonCommercial - ShareAlike 4.0 (CC BY-NC-SA 4.0).
Download (14MB)

Abstract

The High Luminosity LHC (HL-LHC) Project, launched in 2010, aims to boost the luminosity of the Large Hadron Collider (LHC) at CERN in Geneva tenfold to enhance discoveries and precision measurements. The higher collision rate and pileup will increase particle multiplicity and radiation, requiring improvements in the Trigger system to sustain performance. In this context, applications of Machine Learning, particularly Artificial Neural Network (ANN) algorithms, have expanded rapidly because of their potential to make data processing in this experimental setting more efficient and effective. However, a key challenge in ANN deployment is optimizing data processing for online applications, especially when selecting rare events at the trigger level, such as Beyond Standard Model (BSM) events. This study explores Autoencoders (AEs), which detect anomalies without theoretical priors. Yet the stringent latency and energy constraints of the Level-1 Trigger of the Compact Muon Solenoid (CMS) experiment at CERN require tailored software and hardware strategies, focusing on Field Programmable Gate Arrays (FPGAs). To address this, Knowledge Distillation (KD) is investigated, using a well-trained AE “teacher” to train a compact “student” model for FPGA implementation. This distillation process can be optimized by refining the student architecture, weight quantization, and hyperparameters to balance accuracy, latency, and hardware footprint. The Offline Response-Based KD strategy for the teacher model is presented, including the performance differences observed when quantization is applied before or after selecting the best student architecture. The process of converting a Python-based model into FPGA firmware using hls4ml and proprietary FPGA software is also detailed. Additionally, Online Response-Based KD was explored, and preliminary results are provided.
Finally, a new teacher model using a Graph Convolutional Neural Network-based AE was tested for anomaly detection, due to the possibilities opened up by KD to implement advanced algorithms on efficient hardware.
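
The offline response-based distillation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the "teacher" is a fixed toy function standing in for a trained autoencoder, the anomaly score is the mean squared reconstruction error, and the "student" is a tiny linear model on squared inputs that is trained to regress the teacher's score. The function names (teacher_reconstruct, anomaly_score, student_predict, train_student) are hypothetical.

```python
import random


def teacher_reconstruct(x):
    # Hypothetical frozen teacher: a stand-in for a trained autoencoder.
    return [0.5 * v for v in x]


def anomaly_score(x, x_hat):
    # Mean squared reconstruction error, used here as the anomaly score.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def student_predict(w, b, x):
    # Toy student: predicts the teacher's score directly from squared inputs.
    return sum(wi * v * v for wi, v in zip(w, x)) + b


def distillation_loss(w, b, data, targets):
    # Response-based loss: mean squared error between student and teacher scores.
    total = sum((student_predict(w, b, x) - t) ** 2 for x, t in zip(data, targets))
    return total / len(data)


def train_student(data, targets, epochs=200, lr=0.05):
    # Plain full-batch gradient descent on the distillation loss.
    dim, n = len(data[0]), len(data)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(data, targets):
            err = student_predict(w, b, x) - t
            for i, v in enumerate(x):
                grad_w[i] += 2.0 * err * v * v / n
            grad_b += 2.0 * err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


random.seed(0)
data = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(64)]
# "Offline": the teacher is frozen and its scores are computed once, up front.
teacher_scores = [anomaly_score(x, teacher_reconstruct(x)) for x in data]

loss_before = distillation_loss([0.0] * 4, 0.0, data, teacher_scores)
w, b = train_student(data, teacher_scores)
loss_after = distillation_loss(w, b, data, teacher_scores)
```

In this sketch the student never reconstructs the input; it only learns to reproduce the teacher's output, which is what makes a much smaller model plausible on constrained hardware. An online variant would update the teacher and the student together instead of precomputing teacher_scores.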

Document type

Doctoral thesis

Author

Lorusso, Marco

Supervisor

Bonacorsi, Daniele

PhD programme

Physics

Cycle

37

Coordinator

Gabrielli, Alessandro

Scientific-disciplinary sector

Area 02 - Physical sciences > FIS/01 Experimental physics

Competition sector

Area 02 - Physical sciences > 02/A - Physics of fundamental interactions > 02/A1 Experimental physics of fundamental interactions

Keywords

CMS, LHC, CERN, Beyond Standard Model, Machine Learning, Artificial Neural Networks, Autoencoders, Knowledge Distillation, Field Programmable Gate Arrays, Anomaly Detection

DOI

10.48676/unibo/amsdottorato/11891

Defence date

28 March 2025

URI

https://amsdottorato.unibo.it/id/eprint/11891