Spatial Reasoning with Graph Neural Networks

GIULIARI, FRANCESCO
2024-03-29

Abstract

Spatial reasoning, a core aspect of human cognition, is essential for interpreting and interacting with a three-dimensional environment. It is equally important for automated systems, particularly those operating in dynamic, real-world scenarios where spatial interaction is a fundamental requirement. This thesis focuses on integrating spatial reasoning into computational systems. It starts with the explicit incorporation of spatial reasoning through manually encoded rules for robot navigation, where the emphasis is on maintaining complete control over the system’s actions and ensuring predictable, transparent behavior. Learning-based solutions are then analyzed for cases where the complexity of the scenarios makes manual encoding of rules impractical. Specifically, the use of Graph Neural Networks (GNNs) is motivated by their ability to handle data whose components vary in number and type, and their effectiveness is shown in tasks such as object localization and object reassembly. We demonstrate that GNNs combined with attention mechanisms and a graph representation of the 3D environment can represent complex spatial relationships, resulting in state-of-the-art performance in Object Localization in Partial 3D Scenes. Finally, we enhance the spatial reasoning capabilities of GNNs by integrating them with Diffusion Probabilistic Models (DPMs). This approach uses DPMs to iteratively refine a solution starting from random noise, moving beyond single-step predictions. Overall, we show that GNNs are a versatile and effective solution for spatial problems that can be represented as graphs. They solve complex tasks, such as Object Localization in Partial Scenes and Object Reassembly, and outperform non-GNN-based approaches.
Keywords: Graph Neural Networks; Spatial Reasoning; Deep Learning
Files in this item:

phdunige_4965853_1.pdf
Access: open access
Description: Chapters 1-3
Type: Doctoral thesis
Size: 2.11 MB
Format: Adobe PDF

phdunige_4965853_2.pdf
Access: open access
Description: Chapters 4-6
Type: Doctoral thesis
Size: 18.13 MB
Format: Adobe PDF

phdunige_4965853_3.pdf
Access: open access
Description: Appendix
Type: Doctoral thesis
Size: 12.8 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1168696