Single-document and multi-document summarization techniques for email threads using sentence compression
Title | Single-document and multi-document summarization techniques for email threads using sentence compression |
Publication Type | Journal Articles |
Year of Publication | 2008 |
Authors | Zajic D, Dorr BJ, Lin J |
Journal | Information Processing & Management |
Volume | 44 |
Issue | 4 |
Pagination | 1600 - 1610 |
Date Published | 2008/07 |
ISSN | 0306-4573 |
Keywords | Email summarization, Enron, Informal media, Sentence compression, Trimming |
Abstract | We present two approaches to email thread summarization: collective message summarization (CMS) applies a multi-document summarization approach, while individual message summarization (IMS) treats the problem as a sequence of single-document summarization tasks. Both approaches are implemented in our general framework driven by sentence compression. Instead of a purely extractive approach, we employ linguistic and statistical methods to generate multiple compressions, and then select from those candidates to produce a final summary. We demonstrate these ideas on the Enron email collection – a very challenging corpus because of the highly technical language. Experimental results point to two findings: that CMS represents a better approach to email thread summarization, and that current sentence compression techniques do not improve summarization performance in this genre. |
URL | http://www.sciencedirect.com/science/article/pii/S0306457307001768 |
DOI | 10.1016/j.ipm.2007.09.007 |