Document Analysis


Document Analysis is a discipline that combines image analysis and pattern recognition techniques to process and extract information from documents of different origins. Sources include raster images, obtained by scanning paper documents, and electronic formats such as PostScript, HTML, and PDF. Research subfields include paper layout analysis, optical character recognition, and graphics recognition.

The CVC's Document Analysis group has research and development experience in the following areas:

  • Symbol recognition
  • Indexing and browsing by graphical content
  • Sketchy interfaces and diagrammatic reasoning
  • Visual languages for graphic documents
  • Graphics recognition architectures
  • Reading systems for forms and structured documents
  • Camera-based OCR
  • Fingerprint recognition