Integrated Programmable-Array accelerator to design heterogeneous ultra-low power manycore architectures

Prasad, Rohit (2022) Integrated Programmable-Array accelerator to design heterogeneous ultra-low power manycore architectures, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Electronics, Telecommunications, and Information Technologies Engineering, 35th cycle. DOI 10.48676/unibo/amsdottorato/9983.
Full-text documents available:
PDF document (English), 3 MB
License: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed for strictly personal purposes of study, research, and teaching. Any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

Demand for energy efficiency (EE) in rapidly evolving Internet-of-Things end nodes keeps increasing. This pushes researchers and engineers to develop solutions that combine Application-Specific Integrated Circuit-like EE with Field-Programmable Gate Array-like flexibility. One such solution is the Coarse Grain Reconfigurable Array (CGRA). Over the past decades, CGRAs have evolved and are competing to become mainstream hardware accelerators, especially for Digital Signal Processing (DSP) applications. Due to the over-specialization of computing architectures, the focus is shifting towards fitting an extensive data representation range into fewer bits; for example, a 32-bit word can represent a wider range of values in floating-point (FP) representation than in integer representation. However, computation using FP representation requires numerous encodings and leads to complex circuits for the FP operators, decreasing the EE of the entire system. This thesis presents the design of an energy-efficient, ultra-low-power CGRA with native support for FP computation, leveraging an emerging paradigm of approximate computing called transprecision computing. We also present contributions to the compilation toolchain and to the system-level integration of the CGRA in a System-on-Chip, to position the proposed CGRA as an energy-efficient hardware accelerator. Finally, an extensive set of experiments using real-world algorithms employed in near-sensor processing applications is performed, and the results are compared with state-of-the-art (SoA) architectures. The experiments show empirically that the proposed CGRA provides better results than SoA architectures in terms of power, performance, and area.
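
As an illustration of the range claim in the abstract, the following minimal sketch compares the largest value of a signed 32-bit integer with the largest finite value of an IEEE 754 binary32 float. It is not taken from the thesis: the names `INT32_MAX`, `FLOAT32_MAX`, and `roundtrips_as_float32` are hypothetical, and the choice of IEEE 754 binary32 as the 32-bit FP format is an assumption made here for the example.

```python
import struct

# Largest signed 32-bit integer (two's complement).
INT32_MAX = 2**31 - 1

# Largest finite IEEE 754 binary32 value: (2 - 2^-23) * 2^127.
# Assumption: binary32 stands in for "32-bit FP" in this illustration.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127


def roundtrips_as_float32(x: float) -> bool:
    """Return True if x is unchanged after being stored in 4 bytes as binary32."""
    return struct.unpack("<f", struct.pack("<f", x))[0] == x


print(INT32_MAX)    # about 2.1e9
print(FLOAT32_MAX)  # about 3.4e38
print(roundtrips_as_float32(FLOAT32_MAX))       # exactly representable
print(roundtrips_as_float32(float(INT32_MAX)))  # rounded when stored as binary32
```

The same 32 bits cover a far wider range in FP, at the cost of precision: binary32 keeps a 24-bit significand, so `INT32_MAX` itself is rounded when stored as a float.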

Document type
Doctoral thesis
Author
Prasad, Rohit
Supervisor
Co-supervisor
PhD programme
Cycle
35
Coordinator
Disciplinary sector
Competition sector
Keywords
Coarse Grain Reconfigurable Architecture, Transprecision Computing, Digital Signal Processor, Ultra-Low-Power, Energy-Efficient, Heterogeneous Cluster, Floating-Point
URN:NBN
DOI
10.48676/unibo/amsdottorato/9983
Defence date
20 February 2022
URI
