Tosi, Fabio
(2021)
Deep-learning for 3D reconstruction, [Dissertation thesis], Alma Mater Studiorum Università di Bologna.
PhD programme in Computer science and engineering, 33rd cycle. DOI 10.48676/unibo/amsdottorato/9816.
Full-text documents available:
PDF document (English)
License: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal study, research, and teaching purposes; any direct or indirect commercial use is expressly forbidden. All other rights to the material are reserved.
Abstract
Depth perception is paramount for many computer vision applications, such as autonomous
driving and augmented reality. Although active sensors (e.g., LiDAR, Time-of-Flight,
structured light) are widespread, they have severe shortcomings that image-based sensors
could potentially address. For this latter category, deep learning has enabled
ground-breaking results on well-known issues that affect the accuracy of systems inferring
depth from one or more images in specific circumstances (e.g., low-textured regions,
depth discontinuities). It has also introduced additional concerns: the domain shift
between training and target environments, and the need for proper ground-truth depth
labels to serve as training signals.
Moreover, despite the copious literature on confidence estimation for depth from a
stereo setup, inferring depth uncertainty with deep networks remains a major challenge
and an almost unexplored research area, especially in a monocular setup. Finally,
computational complexity is another crucial aspect for most practical applications: it is
desirable not only to infer reliable depth data, but to do so in real time and with low
power requirements, even on standard embedded devices or smartphones.
Therefore, focusing on stereo and monocular setups, this thesis tackles major issues
affecting methods for inferring depth from images and aims at developing accurate and
efficient frameworks for 3D reconstruction in challenging environments.
Document type
Doctoral thesis
Author
Tosi, Fabio
Supervisor
PhD programme
Cycle
33
Coordinator
Scientific-disciplinary sector
Academic recruitment field
Keywords
Deep Learning, Machine Learning, Computer Vision, 3D reconstruction, Stereo Matching, Monocular Depth Estimation, Embedded Vision
URN:NBN
DOI
10.48676/unibo/amsdottorato/9816
Defence date
27 May 2021
URI