The 3rd IWRR workshop was held in Perth, Australia, in conjunction with ACCV2018.

Perth Convention and Exhibition Centre (PCEC) Level 2 Meeting Room 3


Best Paper Award

The 3rd IWRR presented its Best Paper Award to Michal Busta, Yash Patel and Jiri Matas for their paper “E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text”, recognising its originality, writing quality and presentation.

Award sponsored by:

[Sponsor logos: MadaCode, OnlyYou, NAVER LABS]

Scope and Motivation

The workshop aims to bring together computer vision researchers and practitioners with an interest in reading systems that operate on images acquired in unconstrained conditions, such as scene images and video sequences, born-digital images, wearable camera and lifelog feeds, social media images, etc. The particular focus of the workshop is on the automatic extraction and interpretation of textual content in images, and on applications that use textual information obtained automatically by such methods.

Text appears in roughly 50% of the images in large-scale datasets such as MS Common Objects in Context (MSCOCO), and the percentage rises sharply in urban environments. Ensuring that scene text is properly accounted for in holistic scene interpretation models is therefore not a marginal research problem, but a central one for computer vision.

Interpreting written communication (textual or symbolic) is one of the most important human activities. At the same time, it is a difficult task for computers, especially when the textual content is captured in unconstrained conditions. Beyond the scientific interest, a key motivation comes from the plethora of potential applications enabled by automated text reading systems, such as assisted reading for the visually impaired, automatic translation, robot navigation, and industrial automation, to mention just a few.

Moreover, textual content in human environments conveys important high-level semantic information that is not available in any other form in the scene. In this sense, the interplay between textual and visual context can be leveraged to improve scene understanding models for image classification, tagging, retrieval, captioning, or visual question answering.

The workshop aims to offer a forum for researchers to share experiences and the latest results in the area, as well as to discuss current trends and the evolution of the Robust Reading Competition.

Call for Papers

IWRR2018 invites the submission of original, previously unpublished work, and welcomes re-submissions of improved versions of papers that were rejected in the ACCV2018 conference reviewing process.

Accepted papers will be published alongside the main conference proceedings by Springer in the Lecture Notes in Computer Science (LNCS) series.

The topics of interest include, among others:

  • Word spotting and end-to-end reading systems

  • Scene-text-based image retrieval

  • Joint modelling of textual and visual information

  • Text localisation, segmentation, and recognition in scene and born-digital images

  • Reading and tracking scene and/or overlaid text in video sequences

  • Robust reading applications (e.g. translation, reading text for the visually impaired, etc.)

  • Performance evaluation and metrics

  • Restoration of camera captured documents (dewarping, deblurring, etc.)

  • Quality estimation and degradation modelling of camera-captured text