A General Framework for Text Line Detection and Recognition

Abstract: I will start by introducing DTLR, our general approach for recognizing text lines, whether printed (OCR) or handwritten (HTR), using Latin, Chinese, or ciphered characters. Most HTR methods have focused on autoregressive decoding, which predicts characters one at a time. In contrast, DTLR processes the entire line at once. Our method shows strong results …

Marrying Multi-view Geometry with Deep Priors for Image-based 3D Reconstruction

Abstract: We live in a world where all interactions with the environment necessitate a 3D understanding of our surroundings. While humans excel at reasoning about 3D structures from both multi-view and single-view images, replicating this capability in computers remains challenging due to the need to combine mathematically proven geometric knowledge with end-to-end learned priors. In …

Computer Vision Group at the UvA

Abstract: This presentation provides a short overview of the research conducted by the computer vision research group in Amsterdam (UvA – University of Amsterdam) featuring various research projects and several commercially applied use cases.

Vision-Language Contrastive Models: Generalizing Semantic Segmentation and Studying Model Dynamics

Abstract: This presentation is divided into two interconnected parts, exploring the applications and inner workings of Vision-Language contrastive models, with a focus on CLIP-like architectures. In the first part, we examine the application of these models to semantic segmentation tasks. We begin with a brief overview of our previous work in semantic segmentation and multi-branched …

Study and modeling of user behavior and attention in immersive environments.

Abstract: Virtual reality (VR) is a rapidly expanding medium that presents both challenges and opportunities. As VR techniques and applications continue to advance, it becomes increasingly important to create immersive experiences that can fully exploit its potential. Understanding and predicting human visual behavior and user attention is an essential factor in achieving this goal. This …

Benchmarking and Optimizing Gradient-Based Adversarial Attacks for ML Security.

Abstract: Adversarial attacks exploit vulnerabilities in machine learning models by introducing subtle perturbations to input data, leading to incorrect predictions. Rigorous testing of machine learning models against these attacks is often impractical for modern deep learning systems. For this reason, empirical methods that optimize adversarial perturbations via gradient descent are often used to provide robustness evaluations. …

“Hey GPT, please diagnose this histology slide”

Abstract: Pathology is the medical specialty at the core of disease understanding, diagnosis and patient management, but it suffers from subjective quantification, a shortage of pathologists, and growing workloads due to rising cancer incidence. Digital pathology (digitizing histology tissue sections into images) enables the use of AI in pathology image analysis, a field known as Computational …

Multimedia Retrieval Strategies in Videos and the Metaverses Frontier

Doctor Giuseppe Serra Seminar

Abstract: Every day, we find ourselves immersed in the age of data. Every hour, vast amounts of media content flood social media and user-generated platforms. For instance, over 500 hours of video are uploaded to YouTube every minute, as of February 2020. Meanwhile, the metaverse is gaining popularity, boasting approximately 400 million monthly users with …

Two complementary perspectives to continual learning: ask not only what to optimize, but also how

Abstract: Continually learning from a stream of non-stationary data is difficult for deep neural networks. When these networks are trained on something new, they tend to quickly forget what was learned before. In recent years, considerable progress has been made towards overcoming such “catastrophic forgetting”, predominantly thanks to approaches that add replay or regularization terms …

Image quality assessment in practical – exploring IQA in dynamic scenes

Doctor Shaolin Su Seminar

Abstract: Image quality assessment (IQA) methods have long been desired to enhance the performance of computer vision tasks such as image restoration, image compression and image synthesis. However, the variation in image content and the complexity of image distortions hinder IQA methods from being applied in practice. In this presentation, we explore to develop …