Open Call for European P-SPHERE project researcher (Post Doc): Information Extraction from historical document images<![CDATA[TOPIC DESCRIPTION
Digital humanities is an emerging field. The automatic extraction of information from scanned documents of any modality (printed or handwritten text, graphics) stored in archives and libraries is a challenge. The use of context is very important for improving recognition, especially of handwritten text. The structure of objects and language models, for both textual and graphical content, provide a kind of syntactic context. There is also a semantic context, provided by users, which encompasses new modalities of user interaction (sketching interfaces, collaborative annotation platforms, etc.). To tackle these challenges, the current research objectives of the hosting group are:
- To develop models for indexing and cross-linking visual terms in large-scale document collections, building common representation spaces for different data modalities.
- To include syntactic and semantic context in recognition and retrieval architectures (context-aware recognition).
- To construct an advanced collaborative and tangible interface architecture and validate it with end users.