The role of object instance re-identification in 3D object localization and semantic 3D reconstruction.

BANSAL, VAIBHAV
2020-02-28

Abstract

For an autonomous system to fully understand a particular scene, a 3D reconstruction of the world is required that contains both geometric information, such as camera pose, and semantic information, such as the label associated with each object (tree, chair, dog, etc.), mapped within the 3D reconstruction. In this thesis, we will study the problem of object-centric 3D reconstruction of a scene, in contrast with most previous work in the literature, which focuses on building a 3D point cloud that captures only the structure and lacks any semantic information. We will study how crucial 3D object localization is for this problem and will discuss the limitations faced by previous related methods. We will present an approach for 3D object localization that uses only 2D detections observed in multiple views, combined with 3D object shape priors. Since our first approach relies on associating 2D detections across multiple views, we will also study the problem of re-identifying multiple instances of an object in rigid scenes, and will propose a novel method that jointly learns the foreground and background of an object instance with a triplet-based network in order to identify multiple instances of the same object in multiple views. We will also propose an Augmented Reality application based on Google's Tango that integrates both of the proposed approaches. Finally, we will conclude with some open problems that might benefit from the suggested future work.
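
The abstract mentions a triplet-based network that jointly learns foreground and background appearance for instance re-identification. As a rough, self-contained illustration of that general idea only, and not the method actually developed in the thesis, the sketch below shows a PyTorch-style triplet margin loss computed over concatenated foreground/background embeddings. All names and parameters here (FgBgEmbedder, the placeholder linear backbones, the margin value, crop sizes) are hypothetical assumptions for the example.

    # Minimal sketch (NOT the thesis implementation): triplet loss over joint
    # foreground/background embeddings. Assumes PyTorch; all module and
    # variable names are hypothetical.
    import torch
    import torch.nn as nn

    class FgBgEmbedder(nn.Module):
        """Embeds an object crop by concatenating a foreground and a background descriptor."""
        def __init__(self, dim=128):
            super().__init__()
            # Placeholder backbones; a real system would use CNN streams instead.
            self.fg_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim))
            self.bg_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(dim))

        def forward(self, fg_crop, bg_crop):
            # Joint descriptor: object (foreground) appearance plus surrounding
            # (background) context, so rigid-scene context helps tell apart
            # identical-looking instances.
            return torch.cat([self.fg_net(fg_crop), self.bg_net(bg_crop)], dim=1)

    embedder = FgBgEmbedder()
    triplet_loss = nn.TripletMarginLoss(margin=0.2)

    # Dummy anchor / positive / negative crops (batch of 4 RGB patches, 3x64x64).
    fg_a, bg_a = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)
    fg_p, bg_p = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)
    fg_n, bg_n = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)

    loss = triplet_loss(embedder(fg_a, bg_a), embedder(fg_p, bg_p), embedder(fg_n, bg_n))
    loss.backward()

The sketch only illustrates why a joint foreground/background descriptor can discriminate between multiple instances of the same object class in a rigid scene; the thesis should be consulted for the actual architecture and training procedure.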
Files in this item:

phdunige_4317173_1.pdf
Access: open access
Description: The first part of the thesis, including the abstract, acknowledgements, table of contents, Chapter 1 and Chapter 2.
Type: Doctoral thesis
Size: 8.16 MB
Format: Adobe PDF

phdunige_4317173_2.pdf
Access: open access
Description: The second part of the thesis, including Chapter 3 and Chapter 4.
Type: Doctoral thesis
Size: 19.28 MB
Format: Adobe PDF

phdunige_4317173_3.pdf
Access: open access
Description: The third part of the thesis, including Chapter 5 and the Conclusions & Future work.
Type: Doctoral thesis
Size: 4.73 MB
Format: Adobe PDF

phdunige_4317173_4.pdf
Access: open access
Description: The fourth part of the thesis, including the References.
Type: Doctoral thesis
Size: 150.77 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/997920