Compact and effective models for depth prediction

Fan, Rizhao (2024) Compact and effective models for depth prediction, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Computer science and engineering, 36th cycle. DOI 10.48676/unibo/amsdottorato/11127.

Full-text documents available:

PDF document (English, 17 MB)
License: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

Depth prediction is at the core of several computer vision applications, such as autonomous driving, augmented reality, and robotics. Approaches for depth prediction can be divided into two main classes: active and passive sensing. Active depth sensing is the de facto standard for applications requiring depth sensing, owing to its excellent accuracy and low latency in varied environments. Passive depth sensing using cameras requires a large baseline and careful calibration to obtain accurate depth results. Deep learning has significantly facilitated the development of dense depth prediction, improving the accuracy of models that infer depth from images or multi-modal data. However, despite the extensive literature on depth prediction, open problems remain. Most works adopt computationally expensive models, posing a significant challenge for devices with limited computational resources. Furthermore, existing work seldom studies the inherent characteristics of depth, which still hold significant potential. This thesis addresses some of these issues. It proposes compact and effective models that recover dense depth across diverse settings, in both supervised and self-supervised regimes, from color camera images and sparse LiDAR measurements. Additionally, by analyzing the characteristics of the depth map, contrastive learning techniques are introduced to improve the depth prediction network’s learning ability and unlock further potential. All experiments are conducted on commonly used datasets, including KITTI and NYU Depth v2, following the standard metrics to compare the proposed models with representative state-of-the-art works.

Document type

Doctoral thesis

Author

Fan, Rizhao

Supervisor

Mattoccia, Stefano

Co-supervisor

Poggi, Matteo

PhD programme

Computer science and engineering

Cycle