Learning a compositional hierarchy of disparity descriptors for 3D orientation estimation in an active fixation setting
Gibaldi, Agostino; Canessa, Andrea; Sabatini, Silvio P.
2017-01-01
Abstract
Interaction with everyday objects requires the active visual system to perform a fast and invariant reconstruction of their local shape layout, through a series of rapid binocular fixation movements that change the gaze direction over the 3-dimensional surface of the object. Active binocular viewing results in complex disparity fields that, although informative about orientation in depth (e.g., the slant and tilt), depend strongly on the relative position of the eyes. By learning the statistical relationships between the differential properties of the disparity vector fields and the gaze directions, we expect to obtain more convenient, gaze-invariant visual descriptors. In this work, local approximations of disparity vector field differentials are combined in a hierarchical neural network that is trained to represent slant and tilt from the disparity vector fields. Each gaze-related cell's activation in the intermediate representation is recurrently merged with the other cells' activations to gain the desired gaze-invariant selectivity. Although the representation has been tested on a limited set of combinations of slant and tilt, the resulting high classification rate validates the generalization capability of the approach.
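The abstract's core idea, that the differential (first-order) structure of the disparity field carries the slant and tilt of a surface patch, can be illustrated with a toy sketch. This is not the paper's method (the paper uses a hierarchical, recurrently merged network over active-fixation disparity fields); it is a minimal illustration under a simplifying assumption: for a planar patch under near-parallel viewing geometry, horizontal disparity is locally linear, d(x, y) ≈ d0 + gx·x + gy·y, and the disparity gradient (gx, gy) points in the tilt direction while its magnitude grows with slant. The function and variable names below are hypothetical.

```python
import numpy as np

def fit_disparity_gradient(x, y, d):
    """Least-squares fit of a first-order (linear) disparity model.

    Returns the constant disparity d0, the tilt estimate (direction of the
    disparity gradient, in degrees), and the gradient magnitude, which is
    monotonically related to surface slant in this toy geometry.
    """
    A = np.stack([np.ones_like(x), x, y], axis=1)
    (d0, gx, gy), *_ = np.linalg.lstsq(A, d, rcond=None)
    tilt = np.degrees(np.arctan2(gy, gx))  # direction of the depth gradient
    slant_proxy = np.hypot(gx, gy)         # grows with surface slant
    return d0, tilt, slant_proxy

# Synthetic planar patch: ground-truth tilt of 60 degrees, mild noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)
y = rng.uniform(-1.0, 1.0, 500)
true_tilt = 60.0
g = 0.3 * np.array([np.cos(np.radians(true_tilt)), np.sin(np.radians(true_tilt))])
d = 0.1 + g[0] * x + g[1] * y + rng.normal(0.0, 0.005, x.size)

d0, tilt, slant_proxy = fit_disparity_gradient(x, y, d)
print(round(tilt, 1))  # close to 60.0
```

The fit recovers tilt from the gradient direction regardless of the constant disparity offset d0, which is one intuition behind seeking descriptors that factor out the gaze-dependent part of the field.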
File | Access | Type | Size | Format
---|---|---|---|---
2017_Book_ArtificialNeuralNetworksAndMac-217-224.pdf | Closed access | Publisher's version | 644.71 kB | Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.