Abstract
Depth is a crucial piece of information in many practical applications, such as obstacle avoidance and environment mapping. This information can be provided either by active sensors, such as LiDARs, or by passive devices like cameras. A popular passive device is the binocular rig, which allows triangulating the depth of the scene through two synchronized and aligned cameras. However, many devices already deployed in existing infrastructure, such as most surveillance cameras, are monocular passive sensors. The intrinsic ambiguity of recovering depth from a single image makes monocular depth estimation a challenging task. Nevertheless, recent progress in deep learning is paving the way towards a new class of algorithms able to handle this complexity.
This work addresses many relevant topics related to the monocular depth estimation problem. It presents networks capable of predicting accurate depth values even on embedded devices and without the need for expensive ground-truth labels at training time. Moreover, it introduces strategies to estimate the uncertainty of these models, and it shows that monocular networks can easily generate training labels for different tasks at scale. Finally, it evaluates off-the-shelf monocular depth predictors for the relevant use case of social distance monitoring, and shows how this technology overcomes the limitations of existing strategies.
Document type
Doctoral thesis
Author
Aleotti, Filippo
Supervisor
Co-supervisor
PhD programme
Cycle
34
Coordinator
Disciplinary sector
Competition sector
Keywords
Monocular depth estimation, Depth estimation, Computer Vision, Deep Learning
URN:NBN
DOI
10.48676/unibo/amsdottorato/10228
Defence date
14 June 2022
URI