Grounded representations through deep variational inference and dynamic programming

Olier, Juan Sebastian; Marcenaro, Lucio; Regazzoni, Carlo
2017-01-01

Abstract

In this work, we present a method for building grounded representations by structuring the sensorimotor data of an agent. The aim is to encode sensory inputs into internal states that describe action-environment couplings, i.e., relations that connect elements in a scene to action concepts. The environment is thus represented in terms of the sensorimotor integrations required to interact with its elements. Such representations are acquired for a particular task, in which they serve to infer relevant states and to generate control commands accordingly. Representations are learned in an unsupervised process and are assembled into probability distributions to capture uncertainty. This is achieved through variational methods based on deep learning and dynamic programming, with an architecture partially inspired by active inference. During an interaction, future representational states and sensorimotor data are actively predicted, while incoming sensory information is incorporated through prediction error. Results in a navigation task show that situated representations emerge as sensorimotor relations that are interpretable as action concepts and that allow relevant elements in the environment to be identified so that satisfactory actions can be generated. We argue that the acquisition of such capabilities stems from the prediction processes enabled by two mechanisms: first, state-transition models that depend explicitly on the generation of control commands; and second, a system trained through dynamic programming that generates further predictions relating sensory data to expected state changes.
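As an illustration only, the following is a minimal, hypothetical PyTorch sketch of one ingredient the abstract describes: an action-conditioned variational transition model in which sensory observations are encoded into latent states, the next state is predicted from the current state and the control command, and incoming observations are incorporated through prediction error. All names, network sizes, the Gaussian parameterisation, and the loss weighting are assumptions made for the sketch, not the paper's actual architecture.

```python
# Hypothetical sketch: action-conditioned variational transition model.
# Not the paper's implementation; dimensions and loss terms are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActionConditionedVAE(nn.Module):
    def __init__(self, obs_dim, act_dim, latent_dim=8, hidden=64):
        super().__init__()
        # q(z_t | o_t): encode an observation into a Gaussian latent state
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 2 * latent_dim))
        # p(z_{t+1} | z_t, a_t): state transition explicitly conditioned on the action
        self.transition = nn.Sequential(nn.Linear(latent_dim + act_dim, hidden), nn.ReLU(),
                                        nn.Linear(hidden, 2 * latent_dim))
        # p(o_t | z_t): predict (reconstruct) sensory data from the latent state
        self.decoder = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, obs_dim))

    @staticmethod
    def reparameterise(mu, logvar):
        # Sample from N(mu, exp(logvar)) with the reparameterisation trick
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def loss(self, obs_t, act_t, obs_next):
        # Encode the current and next observations into latent Gaussians
        mu_t, logvar_t = self.encoder(obs_t).chunk(2, dim=-1)
        z_t = self.reparameterise(mu_t, logvar_t)
        mu_q, logvar_q = self.encoder(obs_next).chunk(2, dim=-1)
        # Predict the next latent state from the current state and the control command
        mu_p, logvar_p = self.transition(torch.cat([z_t, act_t], dim=-1)).chunk(2, dim=-1)
        # Reconstruction of the next sensory input from the encoded next state
        z_next = self.reparameterise(mu_q, logvar_q)
        recon = F.mse_loss(self.decoder(z_next), obs_next)
        # Prediction error: KL divergence between the encoded next state (posterior)
        # and the action-conditioned prediction (prior), as in a variational bound
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp() - 1).sum(-1).mean()
        return recon + kl

# Toy usage with random data (batch of 16, 4-D observations, 2-D actions)
model = ActionConditionedVAE(obs_dim=4, act_dim=2)
obs_t, act_t, obs_next = torch.randn(16, 4), torch.randn(16, 2), torch.randn(16, 4)
print(model.loss(obs_t, act_t, obs_next))
```

The dynamic-programming training and the active-inference-style prediction of future states described in the abstract are not shown here; the sketch only illustrates how prediction error can couple action-conditioned state transitions to sensory data.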
Year: 2017
ISBN: 9781538637159


Use this identifier to cite or link to this document: https://hdl.handle.net/11567/914777
Citations
  • Scopus: 2
  • Web of Science (ISI): 2