Affiliation: Senior Staff Software Engineer, Google Inc., USA
Place: Large Lecture Room
Google extracts information from Street View images to build maps. The effort started with street address numbers and is now expanding to more types of information, all extracted with machine learning. A recent success has been the extraction of street names directly from signs, without traditional textline finding. As a result, we now present the French Street Name Signs (FSNS) dataset and a TensorFlow model that directly extracts the canonical street name. The FSNS dataset is unique in providing up to four views of the same physical sign, and it has more than a million training examples. It provides an interesting opportunity to explore the design trade-offs between an engineered solution, with separate approaches to sub-problems such as textline finding, and an “end-to-end” solution.
Bio: Ray spent 8 years at HP Labs Bristol developing the Tesseract OCR engine, including classifier technology, textline finding, visualization tools, a distributed test system, and OCR for compression. The next 3 years were spent developing HP PrecisionScan, HP’s premier document-oriented scanning software. This was followed by 7 years at Caere Corporation/ScanSoft, rearchitecting OmniPage to make better use of multiple OCR engines in combination and making substantial accuracy improvements between version 10 and version 15. Ray has spent the last 10 years at Google, once again working on Tesseract, to make it a truly multilingual OCR system covering more than 100 languages, and most recently adding LSTM technology to it. Ray has published several papers and patents on topics related to OCR, won the Best Industrial Paper Award at ICDAR 2011, and presented a keynote talk on Tesseract at DRR 2013.