Autonomous learning of disparity-vergence behaviour through distributed coding and population reward: Basic mechanisms and real-world conditioning on a robot stereo head

IRIS

A robotic system implementation that exhibits autonomous learning capabilities of effective control for vergence eye movements is presented. The system, directly relying on a distributed (i.e. neural) representation of binocular disparity, shows a large tolerance to the inaccuracies of real stereo heads and to the changeable environment. The proposed approach combines early binocular vision mechanisms with basic learning processes, such as synaptic plasticity and reward modulation. The computational substrate consists of a network of modeled V1 complex cells that act as oriented binocular disparity detectors. The resulting population response, besides implicit binocular depth cues about the environment, also provides a global signal (i.e. the overall activity of the population itself) to describe the state of the system and thus its deviation from the desired vergence position. The proposed network, by taking into account the modification of its internal state as a consequence of the action performed, evolves following a differential Hebbian rule. The overall activity of the population is exploited to derive an intrinsic signal that drives the weights update. Exploiting this signal implies a maximization of the population activity itself, thus providing an highly effective reward for the developing of a stable and accurate vergence behaviour. The role of the different orientations in the learning process is evaluated separately against the whole population, evidencing that the interplay among the differently oriented channels allows a faster learning capability and a more accurate control. The efficacy of the proposed intrinsic reward signal is thus comparatively assessed against the ground-truth signal (the actual disparity) providing equivalent results, and thus validating the approach. Trained in a simulated environment, the proposed network, is able to cope with vergent geometry and thus to learn effective vergence movements for static and moving visual targets. Experimental tests with real robot stereo pairs demonstrate the capability of the architecture not just to directly learn from the environment, but to adapt the control to the stimulus characteristics.

Autonomous learning of disparity-vergence behaviour through distributed coding and population reward: Basic mechanisms and real-world conditioning on a robot stereo head

GIBALDI, AGOSTINO;CANESSA, ANDREA;SOLARI, FABIO;SABATINI, SILVIO PAOLO

2015-01-01

Abstract

A robotic system implementation that exhibits autonomous learning capabilities of effective control for vergence eye movements is presented. The system, directly relying on a distributed (i.e. neural) representation of binocular disparity, shows a large tolerance to the inaccuracies of real stereo heads and to the changeable environment. The proposed approach combines early binocular vision mechanisms with basic learning processes, such as synaptic plasticity and reward modulation. The computational substrate consists of a network of modeled V1 complex cells that act as oriented binocular disparity detectors. The resulting population response, besides implicit binocular depth cues about the environment, also provides a global signal (i.e. the overall activity of the population itself) to describe the state of the system and thus its deviation from the desired vergence position. The proposed network, by taking into account the modification of its internal state as a consequence of the action performed, evolves following a differential Hebbian rule. The overall activity of the population is exploited to derive an intrinsic signal that drives the weights update. Exploiting this signal implies a maximization of the population activity itself, thus providing an highly effective reward for the developing of a stable and accurate vergence behaviour. The role of the different orientations in the learning process is evaluated separately against the whole population, evidencing that the interplay among the differently oriented channels allows a faster learning capability and a more accurate control. The efficacy of the proposed intrinsic reward signal is thus comparatively assessed against the ground-truth signal (the actual disparity) providing equivalent results, and thus validating the approach. Trained in a simulated environment, the proposed network, is able to cope with vergent geometry and thus to learn effective vergence movements for static and moving visual targets. Experimental tests with real robot stereo pairs demonstrate the capability of the architecture not just to directly learn from the environment, but to adapt the control to the stimulus characteristics.

Scheda breve

Scheda completa

Scheda completa (DC)

Anno

2015

Appare nelle tipologie:

01.01 - Articolo su rivista

File in questo prodotto:

File	Dimensione	Formato
1-s2.0-S0921889015000093-main (1).pdf accesso chiuso Descrizione: Articolo principale Tipologia: Documento in versione editoriale Dimensione 1.12 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	1.12 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11567/778000

Citazioni

ND

9

8

social impact