A Computational Model for the Neural Representation and Estimation of the Binocular Vector Disparity from Convergent Stereo Image Pairs
Chessa, Manuela;Solari, Fabio
2019-01-01
Abstract
Depth is a fundamental cue for artificial and living agents that interact with the surrounding environment to handle objects and avoid obstacles. In such situations, the disparity patterns that arise when an agent fixates an object are vector fields. We propose a biologically inspired computational model that estimates dense horizontal and vertical disparity maps by exploiting the cortical paradigms of the primate visual system; in particular, we aim to model the disparity sensitivity of the V1-MT visual pathway. The first processing stage consists of a bank of spatial band-pass filters followed by a static nonlinearity, mimicking binocular complex cells. Subsequent pooling stages and decoding strategies then allow the model to estimate the vector disparity after representing it as a population of MT-like units. We assess the model on standard benchmark stereo images (the Middlebury dataset) and on stereo images with both horizontal and vertical disparities, which characterize the stimuli produced by active vision systems. Moreover, we systematically analyze how the different processing stages affect model performance, and we discuss the implications for neural modeling.

| File | Size | Format |
|---|---|---|
| ChessaSolari19.pdf (closed access; type: publisher's version) | 1.81 MB | Adobe PDF |
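The abstract describes a pipeline of band-pass filtering, a static nonlinearity, pooling, and population decoding. As a rough illustration only, the sketch below applies the textbook phase-shift binocular energy model to 1D signals and recovers horizontal disparity alone. It is not the authors' implementation: the function names (`gabor_kernel`, `binocular_energy`, `decode_disparity`), the Gabor parameters, the box-filter pooling, and the winner-take-all decoding are all assumptions made for this example, and the paper's vector (horizontal plus vertical) estimate and MT-like stage are not reproduced.

```python
import numpy as np

def gabor_kernel(k0, sigma, half_width):
    """Complex 1D Gabor: a Gaussian envelope times a carrier at radian frequency k0."""
    x = np.arange(-half_width, half_width + 1)
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.exp(1j * k0 * x)

def binocular_energy(left, right, k0, sigma, phase_shifts):
    """Energy of a population of phase-shift binocular units (assumed model).

    Returns an array of shape (len(phase_shifts), len(left)).
    """
    g = gabor_kernel(k0, sigma, int(3 * sigma))
    q_left = np.convolve(left, g, mode="same")    # band-pass filtering, left eye
    q_right = np.convolve(right, g, mode="same")  # band-pass filtering, right eye
    shifts = np.exp(1j * np.asarray(phase_shifts))[:, None]
    # Squared magnitude plays the role of the static nonlinearity.
    return np.abs(q_left[None, :] + shifts * q_right[None, :]) ** 2

def decode_disparity(energy, phase_shifts, k0, pool=9):
    """Pool each unit's energy over space, then pick the most active unit."""
    box = np.ones(pool) / pool
    pooled = np.stack([np.convolve(e, box, mode="same") for e in energy])
    best = np.argmax(pooled, axis=0)
    # A phase shift psi corresponds to a preferred disparity of psi / k0.
    return np.asarray(phase_shifts)[best] / k0

# Hypothetical usage: a random texture and a copy shifted by 2 samples.
rng = np.random.default_rng(0)
k0 = 2 * np.pi / 16                      # peak frequency: 16-sample period
phase_shifts = k0 * np.arange(-6, 7)     # units tuned to disparities -6..6
left = rng.standard_normal(256)
right = np.roll(left, 2)                 # right[n] = left[n - 2]
energy = binocular_energy(left, right, k0, sigma=8.0, phase_shifts=phase_shifts)
estimate = decode_disparity(energy, phase_shifts, k0)
print("median estimated disparity:", np.median(estimate[64:192]))
```

With the sign convention used here, a positive estimate means the right signal is a copy of the left one shifted toward larger indices. Because each unit is tuned through a phase shift, the estimate is only unambiguous within half a carrier period (8 samples in this example), and on broadband textures individual positions can be off by a sample or so.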