A Computational Model for the Neural Representation and Estimation of the Binocular Vector Disparity from Convergent Stereo Image Pairs

Chessa, Manuela; Solari, Fabio
2019

Abstract

The depth cue is a fundamental piece of information for artificial and living agents that interact with the surrounding environment in order to handle objects and avoid obstacles. In such situations, the disparity patterns that arise when agents fixate objects are vector fields. We propose a biologically inspired computational model that estimates dense horizontal and vertical disparity maps by exploiting the cortical paradigms of the primate visual system; in particular, we aim to model the disparity sensitivity of the V1-MT visual pathway. The first processing stage of the model consists of a bank of spatial band-pass filters followed by a static nonlinearity, mimicking complex binocular cells. Subsequent pooling stages and decoding strategies then allow the model to estimate the vector disparity, after having represented it as a population of MT-like units. We assess the model on standard benchmark stereo images (the Middlebury dataset) and on stereo images that have both horizontal and vertical disparities, which characterize the stimuli produced by active vision systems. Moreover, we systematically analyze how the different processing stages affect model performance, and we discuss their implications for neural modeling.
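
The filter-then-nonlinearity-then-decoding idea in the abstract can be illustrated with a small sketch. This is not the authors' V1-MT model: it assumes a one-dimensional signal, a single spatial scale, horizontal disparity only, and a textbook phase-shift binocular energy unit with winner-take-all decoding. The function names (`gabor_1d`, `population_energy`, `decode_disparity`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def gabor_1d(sigma, omega):
    """Complex 1D Gabor kernel: Gaussian envelope times a complex carrier."""
    half = int(3 * sigma)
    x = np.arange(-half, half + 1)
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.exp(1j * omega * x)


def population_energy(left, right, candidates, sigma=8.0, omega=2 * np.pi / 16):
    """Pooled responses of phase-shift binocular energy units.

    One unit per candidate disparity (assumed tuning: phase shift = omega * d).
    """
    g = gabor_1d(sigma, omega)
    # Band-pass filtering of each eye's signal (complex response = quadrature pair).
    q_left = np.convolve(left, g, mode="same")
    q_right = np.convolve(right, g, mode="same")
    energies = []
    for d in candidates:
        # Phase-shift the right response, sum binocularly, then square
        # (the static nonlinearity of the energy unit).
        energy = np.abs(q_left + np.exp(1j * omega * d) * q_right) ** 2
        # Spatial pooling: average the unit's response over all positions.
        energies.append(energy.mean())
    return np.array(energies)


def decode_disparity(energies, candidates):
    """Winner-take-all decoding: disparity of the most active unit."""
    return candidates[int(np.argmax(energies))]


# Synthetic stereo pair: the right signal is the left one shifted by 3 samples.
rng = np.random.default_rng(0)
left = rng.standard_normal(2048)
true_disparity = 3
right = np.roll(left, true_disparity)

# Candidates stay within half a carrier period, so the phase is unambiguous.
candidates = np.arange(-6, 7)
energies = population_energy(left, right, candidates)
estimate = decode_disparity(energies, candidates)
print("estimated disparity:", estimate)
```

With a carrier period of 16 samples, a single-scale unit of this kind can only distinguish disparities within about half a period of zero, which is one reason multi-scale filter banks and further pooling stages are commonly used. Extending the sketch to vector disparity would require 2D oriented filters, which are omitted here.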
Files in this item:
File: ChessaSolari19.pdf
Access: closed
Type: Publisher's version
Size: 1.81 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11567/918980
Citations
  • PubMed Central (PMC) 1
  • Scopus 5
  • Web of Science (ISI) 5