WORKSHOP | Multimodal Foundation Models: from Research to Innovation

On 1 July 2026, the European AI research community will gather in Barcelona for the Workshop on Multimodal Foundation Models: From Research to Innovation. This half-day workshop will explore the latest advances in multimodal artificial intelligence, with a focus on how cutting-edge research can be translated into real-world innovation. Bringing together leading researchers, industry representatives … Read more

Job Opening: Communication and Marketing Officer for the RDI-IA Network (Xarxa RDI-IA) at the Computer Vision Center

Opening date: 26/06/2026  Ref: 20260626_XARXA  Communication and Marketing Officer  The RDI Network in Artificial Intelligence (Xarxa d’RDI en Intel·ligència Artificial) was set up at the end of 2022 with funding from AGAUR (Generalitat de Catalunya) and will represent the Artificial Intelligence (AI) innovation sector in Catalonia, with the aim of consolidating the Catalan AI ecosystem and positioning the territory as … Read more

POSTDOC POSITION IN MULTIMODAL FOUNDATION MODELS FOR DOCUMENT UNDERSTANDING

Call reference: 20260619_MDU  Closing date: 28/06/2026  We are seeking a postdoc to join the Vision, Language and Reading group at the Computer Vision Center (CVC) in Barcelona, Spain.  The position is initially for 1 year and is linked to the project “Multimodal LLMs for Document Understanding” (MuDocU), financed by the Spanish Ministry of Science.  Project PID2023-146426NB-100, funded by MICIU/AEI/10.13039/501100011033 and by FEDER, EU.  The successful candidate is expected to participate in large-scale training efforts, research on multimodal pre-training and finetuning methods, and … Read more

RESEARCH ASSISTANT POSITION FOR THE VISION, LANGUAGE AND READING GROUP 

Call reference: 20260601_ELLIOT  Closing date: 15/06/2026  About CVC  The Computer Vision Center (CVC) is a non-profit research center established in 1995 by the Generalitat de Catalunya and the Universitat Autònoma de Barcelona (UAB). Its mission is to carry out cutting-edge research with the highest international impact in the field of computer vision. It also promotes the transfer of knowledge to industry and society.   Computer vision is an … Read more

6th IAPR TC10/TC11 Summer School. Next-Gen Document Understanding: RAG, VLMs, and Structured Knowledge Extraction

From May 25th to 29th, the 6th IAPR TC10/TC11 Summer School (Next-Gen Document Understanding: RAG, VLMs, and Structured Knowledge Extraction) took place in Vall de Núria, Catalonia, bringing together PhD students, researchers, and professionals working in document analysis. Hosted by the Computer Vision Center, the event followed a Challenge-Based Learning framework, where participants progressed from foundational lectures to hands-on practice sessions … Read more

What makes an image beautiful? A new dataset enables the study of visual aesthetics with reduced interpretative interference

Why do some images appear beautiful while others do not? Research in empirical aesthetics has long sought to answer this question by analysing large image datasets and comparing their visual properties with people’s ratings. However, most existing datasets combine two levels of information that are difficult to disentangle: low-level visual features and high-level semantic content. … Read more

6th IAPR TC10/TC11 Summer School on Document Understanding

6th IAPR TC10/TC11 Summer School. Next-Gen Document Understanding: RAG, VLMs, and Structured Knowledge Extraction. Registration: 850 € (includes accommodation, all meals, coffee breaks, transport, and materials). Overview: The rapid development of Document Intelligence (DI) has transformed traditional Document Analysis and Recognition (DAR) into a sophisticated, AI-driven field. This summer school provides cutting-edge tools for information … Read more

Pedestrian Intention Prediction for Autonomous Driving

Naveed Riad defended his PhD thesis on May 26, 2026. What is the thesis about? Artificial Intelligence has become a cornerstone of intelligent transportation systems (ITS), where anticipating human behavior has a direct impact on road safety. Pedestrian intention prediction (PIP), which determines whether a pedestrian will cross in front of an ego-vehicle, is … Read more

HPC@CVC Lessons on Scaling Up: From 1 GPU to 100s | CVC Workshop

🚀 HPC@CVC Workshop. Lessons on Scaling Up: From 1 GPU to 100s  📅 Date: May 22, 2026  🕘 Time: 09:30 – 14:00  📍 Location: CVC Conference Room  👨‍🏫 Speakers  🧠 Abstract: From one GPU under your desk to hundreds across Europe’s supercomputers, scaling modern AI is a journey through performance, memory, communication, and infrastructure. This session breaks … Read more