Improving text classification for oral history archives with temporal domain knowledge
Title | Improving text classification for oral history archives with temporal domain knowledge |
Publication Type | Conference Papers |
Year of Publication | 2007 |
Authors | Olsson SJ, Oard D |
Conference Name | Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval |
Date Published | 2007 |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 978-1-59593-597-7 |
Keywords | automatic topic classification, classifying with domain knowledge, spoken document classification |
Abstract | This paper describes two new techniques for increasing the accuracy of topic label assignment to conversational speech from oral history interviews using supervised machine learning in conjunction with automatic speech recognition. The first, time-shifted classification, leverages local sequence information from the order in which the story is told. The second, temporal label weighting, takes the complementary perspective by using the position within an interview to bias label assignment probabilities. These methods, when used in combination, yield between 6% and 15% relative improvements in classification accuracy using a clipped R-precision measure that models the utility of label sets as segment summaries in interactive speech retrieval applications. |
URL | http://doi.acm.org/10.1145/1277741.1277848 |
DOI | 10.1145/1277741.1277848 |
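
The abstract describes temporal label weighting only as using position within an interview to bias label assignment probabilities. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes each label has a list of per-position-bin weights and that a classifier score is simply multiplied by the weight for the segment's bin. The function names, the binning scheme, the multiplicative combination, and all data values are hypothetical.

```python
from typing import Dict, List


def position_bin(segment_index: int, num_segments: int, num_bins: int) -> int:
    """Map a segment's position in the interview to one of num_bins equal-width bins."""
    if num_segments <= 0 or not 0 <= segment_index < num_segments:
        raise ValueError("segment_index must lie in [0, num_segments)")
    return min(num_bins - 1, segment_index * num_bins // num_segments)


def temporally_weighted_scores(
    scores: Dict[str, float],
    segment_index: int,
    num_segments: int,
    position_priors: Dict[str, List[float]],
) -> Dict[str, float]:
    """Rescale per-label classifier scores by a position-dependent weight.

    Assumption: the bias is multiplicative. Labels with no weights listed
    are left unchanged.
    """
    weighted = {}
    for label, score in scores.items():
        prior = position_priors.get(label)
        if prior is None:
            weighted[label] = score
        else:
            b = position_bin(segment_index, num_segments, len(prior))
            weighted[label] = score * prior[b]
    return weighted


# Hypothetical weights: one label favoured early in an interview, one late.
priors = {"childhood": [1.5, 1.0, 0.5], "liberation": [0.5, 1.0, 1.5]}
# Hypothetical classifier scores, identical for an early and a late segment.
scores = {"childhood": 0.4, "liberation": 0.4, "family": 0.3}

early = temporally_weighted_scores(scores, 0, 30, priors)
late = temporally_weighted_scores(scores, 29, 30, priors)
```

With these made-up weights, the two tied labels are ranked differently depending on whether the segment falls near the start or the end of the interview, while the label with no weights keeps its original score.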