Images are frequently used in electronic documents (Web and email) to embed textual information. The use of images as text carriers stems from a number of needs. For example, images are used to beautify content (e.g. titles, headings), to attract attention (e.g. advertisements), to hide information (e.g. images in spam emails used to evade text-based filtering), or even to tell a human apart from a computer (CAPTCHA tests).
Automatically extracting text from born-digital images is therefore an interesting prospect, as it would provide the enabling technology for a number of applications such as improved indexing and retrieval of Web content, enhanced content accessibility, and content filtering (e.g. of advertisements or spam emails).
While born-digital text images are on the surface very similar to real scene text images (both feature text in complex colour settings), they are at the same time distinctly different. Born-digital images are inherently low-resolution (made to be transmitted online and displayed on a screen) and their text is digitally created on the image; scene text images, on the other hand, are high-resolution, camera-captured ones. While born-digital images might suffer from compression artefacts and severe anti-aliasing, they do not share the illumination and geometric problems of real-scene images. It is therefore not necessarily true that methods developed for one domain would work in the other.
The ICDAR 2011 Competition
In 2011 we set out to establish the state of the art in text extraction in both domains (born-digital images and real scenes) through a comprehensive Robust Reading Competition that contained challenges on both. We strongly encouraged interested authors to submit results to both challenges, with a view to enabling a meaningful comparison afterwards. For this reason both challenges were structured in a similar way. The Web page of our ICDAR 2011 sister-challenge on Real Scenes can be found here.
The results from the ICDAR competition can be found in the ICDAR proceedings. You can also have a look at the final report here, and the presentation we gave during the conference here. Please cite the ICDAR paper if you refer to this competition.
The ICDAR challenge received submissions from six participants. The Text Localization task was the most popular with six methods, followed by Text Segmentation with three submissions and Text Recognition with a single submission. In addition to the submitted methods, we included baseline implementations based on commercial software for the Text Localization and Text Recognition tasks.
Overall, the results show that the submitted methods rank very close to the baseline ones, indicating that there was no significant breakthrough in the results of ICDAR 2011. Please see the full report and the presentation for details.
Beyond ICDAR 2011
This challenge on "Reading Text in Born-Digital Images (Web and Email)" was converted to continuous mode after ICDAR 2011 came to an end. This means that you can register at this Web page at any time, download the datasets, and upload new results as you have them. No deadlines!
Once you upload new results for any of the tasks, the performance evaluation metrics are automatically calculated on them. You can see the comparison with the ICDAR 2011 results, and you can either keep your results "private" (only you can see them) or make them "public" (other users can see them and compare against them). In addition to the overall comparison tables, you can inspect results image by image and see in which particular cases your algorithm fails. You can register now and download the datasets.
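To give an idea of how localization results are typically scored, the sketch below computes precision, recall, and f-score by greedily matching detected bounding boxes to ground-truth boxes using intersection-over-union (IoU). This is a simplified, hypothetical illustration only: the box format `(x1, y1, x2, y2)`, the 0.5 threshold, and the one-to-one matching are assumptions, and the competition's actual evaluation protocol is more elaborate (handling, for instance, split and merged detections).

```python
# Simplified sketch of a bounding-box evaluation for text localization.
# Assumes axis-aligned boxes as (x1, y1, x2, y2) tuples; the actual
# competition protocol is more elaborate than this one-to-one matching.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def evaluate(detections, ground_truth, threshold=0.5):
    """Greedy one-to-one matching; returns (precision, recall, f-score)."""
    matched_gt = set()
    tp = 0
    for det in detections:
        best, best_iou = None, threshold
        for i, gt in enumerate(ground_truth):
            if i in matched_gt:
                continue
            score = iou(det, gt)
            if score >= best_iou:
                best, best_iou = i, score
        if best is not None:
            matched_gt.add(best)
            tp += 1
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall > 0 else 0.0)
    return precision, recall, f
```

For example, one correct detection and one false alarm against two ground-truth boxes yields a precision and recall of 0.5 each.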
- D. Karatzas, S. Robles Mestre, J. Mas, F. Nourbakhsh, P. Pratim Roy, "ICDAR 2011 Robust Reading Competition - Challenge 1: Reading Text in Born-Digital Images (Web and Email)", in Proc. 11th International Conference on Document Analysis and Recognition, 2011, IEEE CPS, pp. 1485-1490. [pdf] [presentation]