Cross-language headline generation for Hindi
Title | Cross-language headline generation for Hindi |
Publication Type | Journal Articles |
Year of Publication | 2003 |
Authors | Dorr BJ, Zajic D, Schwartz R |
Journal | ACM Transactions on Asian Language Information Processing (TALIP) |
Volume | 2 |
Issue | 3 |
Pagination | 270 - 289 |
Date Published | 2003/09 |
ISSN | 1530-0226 |
Abstract | This paper presents new approaches to headline generation for English newspaper texts, with an eye toward the production of document surrogates for document selection in cross-language information retrieval. This task is difficult because the user must make decisions about relevance based on (often poor) translations of retrieved documents. To facilitate the decision-making process we need translations that can be assessed rapidly and accurately; our approach is to provide an English headline for the non-English document. We describe two approaches to headline generation and their application to the recent DARPA TIDES-2003 Surprise Language Exercise for Hindi. For comparison, we also implemented an alternative method for surrogate generation: a system that produces topic lists for (Hindi) articles. We present the results of a series of experiments comparing each of these approaches. We demonstrate in both automatic and human evaluations that our linguistically motivated approach outperforms two other surrogate-generation methods: a statistical system and a topic discovery system. |
URL | http://doi.acm.org/10.1145/979872.979878 |
DOI | 10.1145/979872.979878 |