Visual Language Processing (VLP) of Ancient Manuscripts: Converting Collections to Windows on the Past

Place: Large Lecture Room

Affiliation: Synchromedia and Livia Labs. École de Technologie Supérieure. Univ. du Québec, Canada.

Ancient manuscripts, as carriers of heritage, are being intensively digitized around the globe to better preserve them. This brings a great opportunity to ultimately transliterate manuscripts and access their content in live-text form. Legible visualization of manuscripts has also been a challenge, and it is needed whether they are read manually or by recognition systems. Combined with the huge volumes of digitized manuscripts in collections, these capabilities can boost and challenge several disciplines in the natural sciences as well as in the humanities, ultimately driving human effort toward a dialogue of civilizations and a flourishing future for society.

Manuscript image analysis and understanding face various challenges, including physical degradation, variations in writing styles and typefaces, and the management of countless relations among physical objects (patches, images and manuscripts) and, at the same time, among conceptual objects (words, phrases and manuscripts). In addition, the large amount of data and the high cost and scarcity of human expert feedback and reference data add to the complexity of the problem.

These problems are addressed at three main levels. At the image level, we tackle automatic, data-driven enhancement and restoration of document images using spatial and spectral relations, and sparse and graph-based representations among visual objects. At the second level, transliteration, directed graphical models, HMMs, undirected random fields and spatial relations are pursued to extract the live text of manuscript images, while actively leveraging the huge number of samples present in the collections to reduce dependency on human experts. Finally, at the highest level, the focus will be on spatially and temporally aware analysis of huge networks of relations among objects (from patches and words to manuscripts and writers), to discover “social” networks among objects and to identify interactions among societies across time and location. The impact of this Visual Language Processing paradigm is to unlock a vast number of lifeless digitized manuscripts so that documented human heritage can be better visualized and understood. The multi-level structure of the proposed approach enables researchers and scholars to collaboratively dig into, model and interpret the manuscripts.