Spezialetti, Riccardo (2020) Learning to understand the world in 3D, [Dissertation thesis], Alma Mater Studiorum Università di Bologna. PhD programme in Computer science and engineering, 32nd cycle. DOI 10.6092/unibo/amsdottorato/9513.
  
 
  
  
        
        
        
  
  
  
  
  
  
  
    
  
    
      
    
  
  
    
      Abstract
      3D computer vision is a research topic attracting ever-increasing attention thanks to the growing availability of off-the-shelf depth sensors and large-scale 3D datasets. The main purpose of 3D computer vision is to understand the geometry of objects in order to interact with them. Recently, the success of deep neural networks in image processing has fostered a data-driven approach to 3D vision problems. Inspired by the potential of this field, this thesis addresses two main problems: (a) how to leverage machine and deep learning techniques to build a robust and effective pipeline for establishing correspondences between surfaces, and (b) how to obtain, by means of deep neural networks, a reliable 3D reconstruction of an object from RGB images sparsely acquired from different points of view. At the heart of many 3D computer vision applications lies surface matching, an effective paradigm aimed at finding correspondences between points belonging to different shapes. To this end, it is essential first to identify the characteristic points of an object and then to create an adequate representation of them. We refer to these two steps as keypoint detection and keypoint description, respectively. As a first contribution (a) of this Ph.D. thesis, we propose data-driven solutions to the problems of keypoint detection and description. As a further direction of research, we investigate the problem of 3D object reconstruction from RGB data only (b). While in the past this task was addressed by SLAM and Structure from Motion (SfM) techniques, the picture has changed radically in recent years with the advent of deep learning. Following this trend, we introduce a novel approach that combines traditional computer vision techniques with deep learning to perform viewpoint-variant 3D object reconstruction from non-overlapping RGB views.
     
    
     
  
  
    
    
      Document type
      Doctoral thesis
      
      
      
      
        
      
        
          Author
          Spezialetti, Riccardo
          
        
      
        
          Supervisor
          
          
        
      
        
      
        
          PhD programme
          Computer science and engineering
          
        
      
        
      
        
          Cycle
          32
          
        
      
        
          Coordinator
          
          
        
      
        
          Scientific-disciplinary sector
          
          
        
      
        
          Competition sector
          
          
        
      
        
          Keywords
          3D computer vision; deep learning; surface matching; 3D keypoints detection; 3D keypoints description; canonical orientation; local reference frame; multiview reconstruction; surface registration; relative pose estimation; point cloud; deformable matching
          
        
      
        
          URN:NBN
          
          
        
      
        
          DOI
          10.6092/unibo/amsdottorato/9513
          
        
      
        
          Defence date
          6 November 2020
          
        
      
      URI
      
      
     
   
  
  
  
  
  
    