Audio visual scene analysis using spherical arrays and cameras.
Title | Audio visual scene analysis using spherical arrays and cameras. |
Publication Type | Journal Articles |
Year of Publication | 2010 |
Authors | O'Donovan A, Duraiswami R, Zotkin DN, Gumerov NA |
Journal | The Journal of the Acoustical Society of America |
Volume | 127 |
Issue | 3 |
Pagination | 1979 - 1979 |
Date Published | 2010 |
Abstract | While audition and vision are used together by living beings to make sense of the world, the observation of the world using machines in applications such as surveillance and robotics has proceeded largely separately. We describe the use of spherical microphone arrays as “audio cameras” and of a spherical array of video cameras as tools to perform multi‐modal scene analysis that attempts to answer questions such as “Who?,” “What?,” “Where?,” and “Why?” Signal processing algorithms that use multi‐modal processing to identify the number of people and their identities, and to isolate and dereverberate their speech, will be described. Graphics‐processor‐based signal processing allows for real‐time implementation of these algorithms. [Work supported by ONR.] |
URL | http://link.aip.org/link/?JAS/127/1979/3 |
DOI | 10.1121/1.3385079 |