SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations

Title: SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations
Publication Type: Journal Articles
Year of Publication: 2012
Authors: Nguyen V-A, Boyd-Graber J, Resnik P
Journal: Association for Computational Linguistics
Date Published: 2012
Abstract

One of the key tasks for analyzing conversational data is segmenting it into coherent topic segments. However, most models of topic segmentation ignore the social aspect of conversations, focusing only on the words used. We introduce a hierarchical Bayesian nonparametric model, Speaker Identity for Topic Segmentation (SITS), that discovers (1) the topics used in a conversation, (2) how these topics are shared across conversations, (3) when these topics shift, and (4) a person-specific tendency to introduce new topics. We evaluate against current unsupervised segmentation models to show that including person-specific information improves segmentation performance on meeting corpora and on political debates. Moreover, we provide evidence that SITS captures an individual’s tendency to introduce new topics in political contexts, via analysis of the 2008 US presidential debates and the television program Crossfire.