Script-Independent Text Line Segmentation in Freestyle Handwritten Documents
Title | Script-Independent Text Line Segmentation in Freestyle Handwritten Documents |
Publication Type | Reports |
Year of Publication | 2006 |
Authors | Li Y, Zheng Y, Doermann D, Jaeger S |
Date Published | 2006/11 |
Institution | University of Maryland, College Park |
Abstract | Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine-printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike most connected-component-based methods [1, 2], the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [3, 1, 2]. Further experiments show the proposed algorithm is robust to scale change, rotation, and noise. |
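
The probability-map stage described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the density estimator, so this sketch assumes a horizontally elongated Gaussian smoothing of a binarized image, followed by a fixed threshold to obtain an initial estimate. The function names (`gaussian_kernel`, `smooth_1d`, `text_density_map`, `initial_estimate`) and the parameters `sigma_x`, `sigma_y`, and `threshold` are hypothetical, and the level set evolution that refines the line boundaries is not implemented here.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, treating out-of-range samples as 0."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * values[j]
        out.append(acc)
    return out


def text_density_map(binary, sigma_x=4.0, sigma_y=1.0):
    """Smooth a binary ink image (1 = ink) into a text-line density map in [0, 1].

    Assumption: sigma_x > sigma_y, so ink is spread mainly along the writing
    direction and gaps between characters on the same line are bridged.
    """
    kx = gaussian_kernel(sigma_x)
    ky = gaussian_kernel(sigma_y)
    rows = [smooth_1d(row, kx) for row in binary]            # horizontal pass
    cols = [smooth_1d(list(col), ky) for col in zip(*rows)]  # vertical pass
    density = [list(r) for r in zip(*cols)]
    peak = max(max(r) for r in density)
    if peak == 0:
        return density
    return [[v / peak for v in r] for r in density]


def initial_estimate(density, threshold=0.5):
    """Threshold the density map to get a rough text-line mask."""
    return [[1 if v >= threshold else 0 for v in r] for r in density]


# Toy page: two "text lines" (rows 2 and 6) with ink on every other column.
page = [[0] * 20 for _ in range(9)]
for r in (2, 6):
    for c in range(0, 20, 2):
        page[r][c] = 1

density = text_density_map(page)
mask = initial_estimate(density)
```

In this toy example the gaps between ink pixels on the same row are filled in the mask, while the rows midway between the two lines stay below the threshold. A boundary-evolution step such as the level set method would then start from a mask like this one.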