Postdoc Position in Multimodal Foundation Models for Document Understanding
Closing date: Until position is filled
We are seeking a postdoc to join the Vision, Language and Reading group at the Computer Vision Center (CVC), in Barcelona, Spain.
The position is initially for 3 years and is linked to “European Large Open Multi-Modal Foundation Models For Robust Generalization On Arbitrary Data Streams” (ELLIOT), a European project funded by Horizon Europe and backed by the ELLIS network of excellence. The project targets the development of the next generation of open Multimodal Foundation Models and their adaptation to specific downstream tasks.
The successful candidate is expected to participate in large-scale training efforts, research on fine-tuning methods, and applications to the specific use case of Document Understanding.
PROJECT PI & HOSTING GROUP
The position will be directly supervised by Dr Dimosthenis Karatzas and Dr Ernest Valveny, members of the Vision, Language and Reading research group at the CVC. For more information visit: http://vlr.cvc.uab.es/
CANDIDATE’S PROFILE
The candidate should hold a PhD in machine learning or computer vision and have a strong publication record. We are looking for candidates with publications at top conferences such as ICDAR, CVPR, ECCV, ICCV, AAAI and NeurIPS.
The candidate should have a strong background in Large Language Models and experience in the field of document image analysis. Applicants are expected to be fluent in spoken and written English, and to work well in a team while demonstrating initiative and independence. The candidate is also expected to co-supervise PhD students.
THE COMPUTER VISION CENTER
The selected candidate will work at the Computer Vision Center (CVC), Barcelona, a research institute comprising more than 150 researchers and support staff, dedicated to computer vision research and knowledge transfer. With a strong international projection and links to industry, the Computer Vision Center offers an exciting environment for scientific career development. The Computer Vision Center plans to expand its permanent research staff base and has received the “HR Excellence in Research” award as a provider and supporter of a stimulating and favourable working environment.
Barcelona is a vibrant city and an important Artificial Intelligence hub. It combines a high quality of life with an open, international character, and is very well connected by air, sea and land. The region of Catalonia has its own AI strategy, in which the CVC is a key player.
APPLICATION PROCESS
All applications must be sent through the online form indicating the following offer code: 20251002_ELLIOT
The application process will remain open until a suitable candidate is selected.
OTM-R principles for selection processes
The CVC is committed to Open, Transparent and Merit-based Recruitment (OTM-R) for all candidates in all our processes. In 2015 we received the Human Resources Strategy for Researchers (HRS4R) award. Through an extensive and continuous process, we improve the conditions and opportunities at the CVC. With these actions, the CVC is committed to the principles of the European Charter for Researchers, as well as the Code of Conduct for the Recruitment of Researchers. For more information follow this link.
RESEARCH CONTACT
If you are interested in the position, please contact Dr Dimosthenis Karatzas (dimos@cvc.uab.es) or Dr Ernest Valveny (ernest@cvc.uab.es) for more information.
WEB SITES
Vision, Language and Reading group: http://vlr.cvc.uab.es/
“This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101214398”.








