Power and Thermal Management Runtimes for HPC Applications in the Era of Exascale Computing

Cesarini, Daniele (2019) Power and Thermal Management Runtimes for HPC Applications in the Era of Exascale Computing, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Electronics, Telecommunications and Information Technologies Engineering, 31st cycle. DOI 10.6092/unibo/amsdottorato/8983.
Full-text documents available:
PDF document (English)
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

In technical and scientific computing, the rush towards larger simulations has so far been assisted by a steady downsizing of micro-processing units, which has made it possible to increase the compute capacity of general-purpose architectures at constant power. As a side effect of the end of Dennard scaling, this process is now hitting its ultimate power limits and is about to come to an end. The continuous growth of power consumption in supercomputers requires a well-defined power budget at design time, which must account for worst-case power consumption to avoid outages. However, supercomputers rarely reach worst-case power consumption during their lifetime, so a budget sized for the worst case limits the performance achievable under normal conditions. Another drawback of the end of Dennard scaling is that power density increases at every technology step, leading to overheating and thermal gradients. As a result, thermally bound machines show performance degradation and heterogeneity, which limit the peak performance of the system. Moreover, it is well known that in large application runs the time the application spends in communication is not negligible and affects the power consumption of the system. This thesis presents software strategies to tackle the main bottlenecks induced by the power and thermal issues that affect next-generation supercomputers. The thesis targets scientific applications, which are the principal candidates “suffering” from the power and thermal constraints of supercomputers. To respond to the above challenges, this work shows that propagating workload requirements from the application to the runtime and operating system levels is the key to providing efficiency. This is possible only if the proposed software methodologies cause little or no overhead in terms of application performance. The experimental results show a significant step forward with respect to current state-of-the-art solutions in power and thermal control of HPC systems.

Document type
Doctoral thesis
Author
Cesarini, Daniele
Supervisor
Co-supervisor
PhD programme
Cycle
31
Coordinator
Disciplinary sector
Competition sector
Keywords
HPC, MPI, profiling, power management, thermal management, DVFS, hardware performance counters, energy saving, power saving, ILP, optimal control, power capping, RAPL, P-state, C-state, idleness
URN:NBN
DOI
10.6092/unibo/amsdottorato/8983
Defence date
8 April 2019
URI
