SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations
Title | SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations |
Publication Type | Journal Articles |
Year of Publication | 2012 |
Authors | Nguyen V-A, Boyd-Graber J, Resnik P |
Journal | Association for Computational Linguistics |
Date Published | 2012 |
Abstract | One of the key tasks for analyzing conversational data is segmenting it into coherent topic segments. However, most models of topic segmentation ignore the social aspect of conversations, focusing only on the words used. We introduce a hierarchical Bayesian nonparametric model, Speaker Identity for Topic Segmentation (SITS), that discovers (1) the topics used in a conversation, (2) how these topics are shared across conversations, (3) when these topics shift, and (4) a person-specific tendency to introduce new topics. We evaluate against current unsupervised segmentation models to show that including person-specific information improves segmentation performance on meeting corpora and on political debates. Moreover, we provide evidence that SITS captures an individual’s tendency to introduce new topics in political contexts, via analysis of the 2008 US presidential debates and the television program Crossfire. |