{"id":220,"date":"2021-09-02T14:22:02","date_gmt":"2021-09-02T12:22:02","guid":{"rendered":"https:\/\/www9.cvc.uab.es\/acmcv\/?page_id=220"},"modified":"2025-09-10T16:46:40","modified_gmt":"2025-09-10T14:46:40","slug":"keynote-talk","status":"publish","type":"page","link":"http:\/\/www.cvc.uab.es\/acmcv\/?page_id=220","title":{"rendered":"Keynote Talk"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-page\" data-elementor-id=\"220\" class=\"elementor elementor-220\">\n\t\t\t\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-c33e11c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c33e11c\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-a10fc2e\" data-id=\"a10fc2e\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5b789fd elementor-widget elementor-widget-heading\" data-id=\"5b789fd\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! 
elementor - v3.16.0 - 20-09-2023 *\/\n.elementor-heading-title{padding:0;margin:0;line-height:1}.elementor-widget-heading .elementor-heading-title[class*=elementor-size-]>a{color:inherit;font-size:inherit;line-height:inherit}.elementor-widget-heading .elementor-heading-title.elementor-size-small{font-size:15px}.elementor-widget-heading .elementor-heading-title.elementor-size-medium{font-size:19px}.elementor-widget-heading .elementor-heading-title.elementor-size-large{font-size:29px}.elementor-widget-heading .elementor-heading-title.elementor-size-xl{font-size:39px}.elementor-widget-heading .elementor-heading-title.elementor-size-xxl{font-size:59px}<\/style><h2 class=\"elementor-heading-title elementor-size-default\">Enterprise Visual Understanding with Vision-Language Models: From Documents to Intelligent Agents<\/h2>\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-733f379 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"733f379\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4cc484b\" data-id=\"4cc484b\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ada9cea elementor-widget elementor-widget-heading\" data-id=\"ada9cea\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<h2 class=\"elementor-heading-title elementor-size-large\">David V\u00e1zquez<\/h2>\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section 
class=\"elementor-section elementor-top-section elementor-element elementor-element-c20a024 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c20a024\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-473d699\" data-id=\"473d699\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3e96437 elementor-widget elementor-widget-heading\" data-id=\"3e96437\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<h2 class=\"elementor-heading-title elementor-size-small\">Director of Research Programs at ServiceNow Research<\/h2>\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-6a9b4c4 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"6a9b4c4\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-f5d8d09\" data-id=\"f5d8d09\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5984744 elementor-widget elementor-widget-text-editor\" data-id=\"5984744\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! 
elementor - v3.16.0 - 20-09-2023 *\/\n.elementor-widget-text-editor.elementor-drop-cap-view-stacked .elementor-drop-cap{background-color:#69727d;color:#fff}.elementor-widget-text-editor.elementor-drop-cap-view-framed .elementor-drop-cap{color:#69727d;border:3px solid;background-color:transparent}.elementor-widget-text-editor:not(.elementor-drop-cap-view-default) .elementor-drop-cap{margin-top:8px}.elementor-widget-text-editor:not(.elementor-drop-cap-view-default) .elementor-drop-cap-letter{width:1em;height:1em}.elementor-widget-text-editor .elementor-drop-cap{float:left;text-align:center;line-height:1;font-size:50px}.elementor-widget-text-editor .elementor-drop-cap-letter{display:inline-block}<\/style>\t\t\t\t<p>David V\u00e1zquez is Director of AI Research at ServiceNow Research, where he leads the Fundamental AI Research group. His current work focuses on multimodal learning, vision\u2013language models, reasoning for enterprise applications, web agents, and data-efficient learning. He has published extensively in top venues such as NeurIPS, ICLR, ICML, CVPR, ICCV, and ACL, contributing to advances in multimodal document understanding (BigDocs), chart reasoning (BigCharts), visual content\u2013to\u2013code generation (StarFlow, StarVector), and alignment techniques for VLMs (AlignVLM).<\/p><p>David received degrees in Software Engineering from Universidade da Coru\u00f1a and in Computer Science from the Universitat Aut\u00f2noma de Barcelona (UAB), including a PhD in Computer Vision and AI. He completed postdoctoral fellowships at the Computer Vision Center (CVC) and at MILA under Aaron Courville, funded by a Marie Curie Fellowship. Earlier in his career, he worked on autonomous driving technologies, creating the SYNTHIA dataset, an autonomous driving simulator, and contributing to real vehicle prototypes with a focus on perception (object detection, semantic segmentation, 3D reconstruction, SLAM). 
He is also an Adjunct Professor at UAB, where he continues to teach and supervise graduate research.<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-0b2621c elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0b2621c\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4459953\" data-id=\"4459953\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-92908f3 elementor-widget elementor-widget-image\" data-id=\"92908f3\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t<style>\/*! 
elementor - v3.16.0 - 20-09-2023 *\/\n.elementor-widget-image{text-align:center}.elementor-widget-image a{display:inline-block}.elementor-widget-image a img[src$=\".svg\"]{width:48px}.elementor-widget-image img{vertical-align:middle;display:inline-block}<\/style>\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"388\" src=\"http:\/\/www.cvc.uab.es\/acmcv\/wp-content\/uploads\/2025\/09\/Banner-David-Vazquez-1-1024x620.png\" class=\"attachment-large size-large wp-image-1345\" alt=\"\" srcset=\"http:\/\/www.cvc.uab.es\/acmcv\/wp-content\/uploads\/2025\/09\/Banner-David-Vazquez-1-1024x620.png 1024w, http:\/\/www.cvc.uab.es\/acmcv\/wp-content\/uploads\/2025\/09\/Banner-David-Vazquez-1-300x182.png 300w, http:\/\/www.cvc.uab.es\/acmcv\/wp-content\/uploads\/2025\/09\/Banner-David-Vazquez-1-768x465.png 768w, http:\/\/www.cvc.uab.es\/acmcv\/wp-content\/uploads\/2025\/09\/Banner-David-Vazquez-1-1536x930.png 1536w, http:\/\/www.cvc.uab.es\/acmcv\/wp-content\/uploads\/2025\/09\/Banner-David-Vazquez-1-2048x1240.png 2048w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-958d42e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"958d42e\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fa16550\" data-id=\"fa16550\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f41ab9e elementor-widget elementor-widget-text-editor\" data-id=\"f41ab9e\" data-element_type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p>Vision-Language Models (VLMs) have demonstrated remarkable progress in natural image understanding and creative generation, yet their performance often falls short on enterprise-critical tasks such as document analysis, chart reasoning, workflow automation, and user interface navigation. This talk presents recent advances in adapting multimodal foundation models to enterprise applications, with a focus on text-rich visual understanding, document intelligence, and visual content\u2013to\u2013code generation. It introduces datasets and benchmarks such as BigDocs, BigCharts, StarFlow, and StarVector, designed to push VLMs toward real-world enterprise use cases. It then discusses AlignVLM, a robust architecture that bridges visual and textual representations to achieve competitive results on challenging document benchmarks. Finally, it highlights how these models enable the next generation of AI agents (systems capable of reasoning, planning, and acting) by grounding natural language instructions in complex graphical user interfaces. Together, these directions illustrate a path toward enterprise-ready multimodal AI that is accurate, reliable, and adaptable.<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Enterprise Visual Understanding with Vision-Language Models: From Documents to Intelligent Agents David V\u00e1zquez Director of Research Programs at ServiceNow Research David V\u00e1zquez is Director of AI Research at ServiceNow Research, where he leads the Fundamental AI Research group. 
His current work focuses on multimodal learning, vision\u2013language models, reasoning for enterprise applications, web agents, and data-efficient [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-220","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=\/wp\/v2\/pages\/220","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=220"}],"version-history":[{"count":49,"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=\/wp\/v2\/pages\/220\/revisions"}],"predecessor-version":[{"id":1365,"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=\/wp\/v2\/pages\/220\/revisions\/1365"}],"wp:attachment":[{"href":"http:\/\/www.cvc.uab.es\/acmcv\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=220"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}