Saliency based sensor fusion of broadband sound localizer for humanoids
MOSADEGHZAD, MOHAMAD; REA, FRANCESCO; SANDINI, GIULIO
2015-01-01
Abstract
This work presents a biologically inspired solution to problems that arise in multisensory attention, with a specific application to binaural humanoid robotics. The focus was on using only two microphones, as an analogy to the mammalian auditory system. The goal was to localize a salient sound source and to fuse this information with a visual-salience feature selection system. We describe a method to select task-relevant sounds in the auditory scene using interaural time difference (ITD), inspired by the work of Jeffress and Konishi in the avian auditory system. We modelled the well-studied coincidence detectors of the barn owl as banks of frequency-tuned delay-and-sum beamformers, with frequency decomposition based on a Gammatone model of the human cochlea. Furthermore, we developed a metric of auditory salience that emphasizes onsets of spectrally complex sounds. Finally, we developed an interface between this auditory-salience-based attention-orienting system and an existing visual-salience-based attention system on the iCub humanoid robot. We demonstrate that the iCub is capable of behaviours that would be impossible without fusion of auditory and visual attention.
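To make the delay-and-sum idea concrete, the following is a minimal sketch of ITD estimation for a single broadband channel pair. It is not the authors' implementation: the function name `delay_and_sum_itd`, the 0.7 ms maximum ITD, the 48 kHz sample rate, and the synthetic white-noise test signal are all assumptions made for illustration, and the Gammatone frequency decomposition described in the abstract is omitted (in a fuller version, the same search would presumably run once per frequency band).

```python
import numpy as np


def delay_and_sum_itd(left, right, fs, max_itd_s=0.0007):
    """Estimate ITD by scanning integer-sample delays (hypothetical helper).

    For each candidate lag, the right channel is advanced by that lag and
    summed with the left channel. The lag whose summed output has the
    highest mean power is returned, converted to seconds.
    A positive ITD means the right channel lags the left.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    max_lag = int(round(max_itd_s * fs))  # assumed physical limit on the ITD
    if n <= 2 * max_lag:
        raise ValueError("signal too short for the requested ITD range")

    lags = np.arange(-max_lag, max_lag + 1)
    core = slice(max_lag, n - max_lag)  # samples that are valid for every lag
    powers = np.empty(len(lags))
    for i, lag in enumerate(lags):
        # Advance the right channel by `lag` samples, then sum with the left.
        shifted = right[max_lag + lag : n - max_lag + lag]
        beam = left[core] + shifted
        powers[i] = np.mean(beam ** 2)

    best_lag = lags[np.argmax(powers)]
    return best_lag / fs, lags / fs, powers


# Synthetic check: white noise that reaches the right microphone 10 samples late.
rng = np.random.default_rng(0)
fs = 48_000
true_lag = 10
source = rng.standard_normal(4800)
left = source
right = np.concatenate([np.zeros(true_lag), source[:-true_lag]])

itd, candidate_itds, powers = delay_and_sum_itd(left, right, fs)
print(f"estimated ITD: {itd * 1e6:.1f} microseconds")
```

The sketch uses integer-sample delays for simplicity, so its resolution is limited to one sample period. Mapping the estimated ITD to an azimuth would additionally require the microphone spacing, which is not given here.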