Holistic sentiment analysis across languages: multilingual supervised latent Dirichlet allocation
Title | Holistic sentiment analysis across languages: multilingual supervised latent Dirichlet allocation |
Publication Type | Conference Papers |
Year of Publication | 2010 |
Authors | Boyd-Graber J, Resnik P |
Conference Name | Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing |
Date Published | 2010 |
Publisher | Association for Computational Linguistics |
Conference Location | Stroudsburg, PA, USA |
Abstract | In this paper, we develop multilingual supervised latent Dirichlet allocation (MlSLDA), a probabilistic generative model that allows insights gleaned from one language's data to inform how the model captures properties of other languages. MlSLDA accomplishes this by jointly modeling two aspects of text: how multilingual concepts are clustered into thematically coherent topics and how topics associated with text connect to an observed regression variable (such as ratings on a sentiment scale). Concepts are represented in a general hierarchical framework that is flexible enough to express semantic ontologies, dictionaries, clustering constraints, and, as a special, degenerate case, conventional topic models. Both the topics and the regression are discovered via posterior inference from corpora. We show MlSLDA can build topics that are consistent across languages, discover sensible bilingual lexical correspondences, and leverage multilingual corpora to better predict sentiment. |
URL | http://dl.acm.org/citation.cfm?id=1870658.1870663 |