With the advancement of Transformers, especially in computer vision, we are starting to apply end-to-end neural networks, without OCR or other pre-/postprocessing techniques, to the challenges of document understanding and information extraction. I will present developments in this area and discuss potential problems from both theoretical (handling longer sequences) and practical (applying end-to-end models in business-oriented systems) perspectives.
The discussion will be set against the longer timeline of results obtained in the EU-financed projects “Robotization of text-based business processes using artificial intelligence methods and deep neural networks”, “A universal platform for robotic automation of processes requiring text comprehension, with a unique level of implementation and service automation”, “Disruptive adoption of Neural Language Modelling for automation of text-intensive work”, and “Hiper-OCR — an innovative solution for information extraction from scanned documents”.
Filip Graliński is a professor at the Department of Artificial Intelligence, Adam Mickiewicz University in Poznań, Poland, and Chief Data Scientist at Applica.ai. His main research interest is computational linguistics, in particular creating and processing diachronic corpora, natural language processing, syntactic analysis, lexicography, machine translation, named entity recognition, and extracting linguistic data from the Internet.