Dimosthenis Karatzas


Refereed Papers

Semantics-Based Content Extraction in Typewritten Historical Documents

A. Antonacopoulos, D. Karatzas

Proceedings of the 8th International Conference on Document Analysis and Recognition (ICDAR2005), Seoul, Korea, August 29 - September 1, 2005, IEEE, pp. 48-53


This paper presents a flexible approach to extracting content from scanned historical documents using semantic information. The final electronic document is the result of a "digital historical document lifecycle" process, in which the expert knowledge of the historian or archivist user is incorporated at different stages. Results show that a conversion strategy that is aided by expert, user-specified semantic information and that processes individual parts of the document in a specialised way produces results that are superior, in a variety of significant ways, to those of document analysis and understanding techniques devised for contemporary documents.



