Image based typographic analysis of documents
Title | Image based typographic analysis of documents |
Publication Type | Conference Papers |
Year of Publication | 1993 |
Authors | Doermann D, Furuta R |
Conference Name | Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on |
Date Published | 1993/10 |
Keywords | 2D, analysis;, attributes;, based, character, commands;, component, data, description, document, DVI, extraction;, feature, figure, file;, formatting, hierarchical, image, language;, languages;, layout;, line, margins;, page, placement;, processing;, read-order;, relationships;, representation;, spacing;, spatial, structures;, syntax;, synthesis;, typographic, understanding; |
Abstract | An approach to image based typographic analysis of documents is provided. The problem requires a spatial understanding of the document layout as well as knowledge of the proper syntax. The system performs a page synthesis from the stream of formatting commands defined in a DVI file. Since the two-dimensional relationships between document components are not explicit in the page language, the authors develop a representation which preserves the two-dimensional layout, the read-order and the attributes of document components. From this hierarchical representation of the page layout we extract and analyze relevant typographic features such as margins, line and character spacing, and figure placement. |
DOI | 10.1109/ICDAR.1993.395624 |
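
The abstract describes a hierarchical page representation that keeps layout, read-order, and component attributes, from which features such as margins and spacing are measured. The following is a minimal illustrative sketch of that idea in Python, not the authors' implementation. The names Box, Line, Page, margins, line_spacings, and char_gaps are hypothetical, and the sketch assumes character boxes have already been synthesized (for example from a DVI file) with the origin at the top-left of the page.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Box:
    """Axis-aligned box; origin assumed at the top-left of the page."""
    x: int
    y: int
    w: int
    h: int

    @property
    def right(self) -> int:
        return self.x + self.w

    @property
    def bottom(self) -> int:
        return self.y + self.h


@dataclass
class Line:
    """A text line holding its character boxes in read-order."""
    chars: List[Box]

    def bbox(self) -> Box:
        left = min(c.x for c in self.chars)
        top = min(c.y for c in self.chars)
        right = max(c.right for c in self.chars)
        bottom = max(c.bottom for c in self.chars)
        return Box(left, top, right - left, bottom - top)


@dataclass
class Page:
    """Root of the hierarchy: page size plus lines in read-order."""
    width: int
    height: int
    lines: List[Line]


def margins(page: Page) -> Dict[str, int]:
    """Distance from each page edge to the nearest text."""
    boxes = [ln.bbox() for ln in page.lines]
    return {
        "left": min(b.x for b in boxes),
        "right": page.width - max(b.right for b in boxes),
        "top": min(b.y for b in boxes),
        "bottom": page.height - max(b.bottom for b in boxes),
    }


def line_spacings(page: Page) -> List[int]:
    """Top-to-top distance between consecutive lines."""
    tops = [ln.bbox().y for ln in page.lines]
    return [b - a for a, b in zip(tops, tops[1:])]


def char_gaps(line: Line) -> List[int]:
    """Horizontal gap between consecutive characters on a line."""
    return [nxt.x - cur.right for cur, nxt in zip(line.chars, line.chars[1:])]


# Made-up example data: a 100 x 200 page with two short lines.
page = Page(
    width=100,
    height=200,
    lines=[
        Line([Box(10, 20, 5, 10), Box(17, 20, 5, 10)]),
        Line([Box(10, 35, 5, 10), Box(16, 35, 5, 10)]),
    ],
)
page_margins = margins(page)
spacings = line_spacings(page)
gaps = char_gaps(page.lines[0])
```

Keeping the lines and characters in read-order inside the hierarchy is what lets spacing be computed by comparing neighbors directly, rather than by searching the page for the nearest box.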