Language Models for Semantic Extraction and Filtering in Video Action Recognition

Title: Language Models for Semantic Extraction and Filtering in Video Action Recognition
Publication Type: Conference Papers
Year of Publication: 2011
Authors: Tzoukermann E, Neumann J, Kosecka J, Fermüller C, Perera I, Ferraro F, Sapp B, Chaudhry R, Singh G
Conference Name: Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence
Date Published: 2011/08/24
Abstract:

The paper addresses two questions: (a) how can semantic information from natural language be represented so that a vision model can use it, and (b) how can the salient textual information relevant to vision be extracted? For a given domain, we present a new model of semantic extraction that takes into account word relatedness as well as word disambiguation, so that its output can be applied to a vision model. We automatically process the text transcripts and perform syntactic analysis to extract dependency relations. We then perform semantic extraction on the output to filter semantic entities related to actions. The resulting data are used to populate a co-occurrence matrix that is used by the vision processing modules. Results show that explicitly modeling the co-occurrence of actions and tools significantly improved performance.
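
As a rough illustration of the last steps in the abstract, the sketch below filters (action, tool) pairs and counts them into a co-occurrence matrix. It is not the authors' implementation: the example pairs, the vocabularies `ACTIONS` and `TOOLS`, and the function names `cooccurrence_matrix` and `row_normalize` are all hypothetical, and the pairs are assumed to have already been extracted from dependency relations by a parser that is not shown.

```python
from collections import Counter

# Hypothetical (verb, object) pairs standing in for dependency relations
# extracted from transcripts; this is made-up data, not the paper's.
pairs = [
    ("cut", "knife"),
    ("cut", "knife"),
    ("stir", "spoon"),
    ("cut", "scissors"),
    ("stir", "spoon"),
    ("pour", "cup"),
    ("say", "thing"),  # not in either vocabulary, so it is filtered out
]

# Hypothetical domain vocabularies used as the semantic filter.
ACTIONS = ["cut", "pour", "stir"]
TOOLS = ["cup", "knife", "scissors", "spoon"]


def cooccurrence_matrix(pairs, actions, tools):
    """Count how often each action co-occurs with each tool.

    Rows follow the order of `actions`, columns the order of `tools`.
    Pairs whose verb or object is outside the vocabularies are ignored.
    """
    counts = Counter((a, t) for a, t in pairs if a in actions and t in tools)
    return [[counts[(a, t)] for t in tools] for a in actions]


def row_normalize(matrix):
    """Convert each row of counts to relative frequencies.

    Rows that sum to zero are left as all zeros.
    """
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


matrix = cooccurrence_matrix(pairs, ACTIONS, TOOLS)
probs = row_normalize(matrix)
```

Row normalization is one possible way to hand the counts to a downstream model as per-action tool frequencies; how the vision modules in the paper actually consume the matrix is not stated in the abstract.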

URL: https://www.aaai.org/ocs/index.php/WS/AAAIW11/paper/viewPaper/3919