
Conferences


NeuroBiT group at ECVP 2019

The CVC NeuroBiT group presented a talk on colour induction and three posters, on computational modelling of V1, visual saliency stimulus generation, and symmetry detection, at the 42nd European Conference on Visual Perception (ECVP 2019), which took place in Leuven, Belgium, from August 25th to 29th, 2019.

The work presented at this year’s ECVP was the following:

Modelling symmetry perception with banks of quadrature convolutional Gabor kernels, by authors Alejandro Párraga, Xavier Otazu and Arash Akbarinia.

A multilayer computational model of the parvocellular pathway in V1, by authors Xim Cerdà-Company, Xavier Otazu and Olivier Penacchio.

Generating Synthetic Images for Visual Attention Modeling, by authors David Berga, Xosé R. Fdez-Vidal, Xavier Otazu, Xosé M. Pardo and Victor Leborán.

Is color assimilation only due to a luminance-chromatic interaction? by authors Xavier Otazu and Xim Cerdà-Company.


NeuroBiT group at the Iberian Conference on Perception 2019


The CVC NeuroComputation and Biological Vision Team (NeuroBiT) gave three talks and presented one poster at the 8th Iberian Conference on Perception, held from the 20th to the 22nd of June in San Lorenzo de El Escorial, Madrid, Spain.

This conference focuses on Perception, emphasizing aspects such as Motion Perception, Spatial Vision, Stereopsis, Color Perception, Perception and Action, Attention and Cognition, Auditory Perception, Multisensory Integration and Reading/Speech Perception.

On Thursday 20th of June, Dr. Alejandro Párraga presented the poster “Modelling symmetry perception with banks of quadrature convolutional Gabor kernels”. Our PhD student David Berga gave two talks: “Computational modeling of visual attention: What do we know from physiology and psychophysics?” and “Measuring bottom-up visual attention in eye tracking experimentation with synthetic images”. Last but not least, Dr. Xavier Otazu gave the talk “No chromatic-chromatic interaction in colour assimilation” and also organized the conference’s symposium “Computational Perception”.

David Berga at the Iberian Conference on Perception 2019
Dr. Xavier Otazu giving a talk at the Iberian Conference on Perception 2019
Dr. Alejandro Párraga presenting his poster at the Iberian Conference on Perception 2019

Computer Vision Catalan Alliance at CVPR2019


A total of 11 papers from Catalan universities and research centers have been accepted at this year’s Conference on Computer Vision and Pattern Recognition (CVPR), one of the most important conferences in the field of Computer Vision.

Researchers from different Catalan universities and research centres, including the Polytechnic University of Catalonia (UPC), the Barcelona Supercomputing Center (BSC-CNS), the Pompeu Fabra University (UPF), the Open University of Catalonia (UOC), the University of Barcelona (UB), the Computer Vision Center (CVC) and the Autonomous University of Barcelona (UAB), presented their most cutting-edge work on Computer Vision in oral and poster presentations.

This strong showing highlights Catalonia’s commitment to Computer Vision and the quality of its research.

The papers from Catalan institutions presented at CVPR 2019 are the following:

A Dataset and Benchmark for Large-scale Multi-modal Face Anti-Spoofing, by authors Shifeng Zhang (NLPR, CASIA, UCAS), Xiaobo Wang (JD AI Research), Ajian Liu (MUST), Chenxu Zhao (JD AI Research), Jun Wan (NLPR, CASIA, UCAS), Sergio Escalera (UB/CVC), Hailin Shi (JD AI Research), Zezheng Wang (JD Finance), Stan Z. Li (NLPR, CASIA, UCAS/MUST)

Convolutional Neural Networks Deceived by Visual Illusions, by authors Alexander Gomez-Villa (UPF), Adrián Martín (UPF), Javier Vazquez-Corral (UEA), Marcelo Bertalmío (UPF)

Deep Single Image Camera Calibration with Radial Distortion, by authors Manuel López-Antequera (Mapillary), Roger Marí (CMLA), Pau Gargallo (Mapillary), Yubin Kuang (Mapillary), Javier Gonzalez-Jimenez (Universidad de Málaga), Gloria Haro (UPF)

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval, by authors Sounak Dey (CVC), Pau Riba (CVC), Anjan Dutta (CVC), Josep Llados (CVC/UAB), Yi-Zhe Song (University of Surrey)

Good News, Everyone! Context driven entity-aware captioning for news images, by authors Ali Furkan Biten (CVC), Lluis Gomez (CVC), Marçal Rusiñol (CVC), Dimosthenis Karatzas (CVC/UAB).

Inverse Cooking: Recipe Generation from Food Images, by authors Amaia Salvador (UPC), Michal Drozdzal (Facebook AI Research), Xavier Giro-i-Nieto (UPC), Adriana Romero (Facebook AI Research)

Learning Metrics from Teachers: Compact Networks for Image Embedding, by authors Lu Yu (CVC), Vacit Oguz Yazici (CVC/Wide-Eyes Technologies), Xialei Liu (CVC), Joost van de Weijer (CVC), Yongmei Cheng (NPU), Arnau Ramisa (Wide-Eyes Technologies)

LSTA: Long Short-Term Attention for Egocentric Action Recognition, by authors Swathikiran Sudhakaran (Fondazione Bruno Kessler/University of Trento), Sergio Escalera (UB/CVC), Oswald Lanz (Fondazione Bruno Kessler).

RVOS: End-to-End Recurrent Net for Video Object Segmentation, by authors Carles Ventura (UOC), Miriam Bellver (BSC), Andreu Girbau (UPC), Amaia Salvador (UPC), Ferran Marques (UPC), Xavier Giro-i-Nieto (UPC).

Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval by authors Anjan Dutta (CVC), Zeynep Akata (University of Amsterdam)

What Does It Mean to Learn in Deep Networks? And, How Does One Detect Adversarial Attacks?, by authors Ciprian A. Corneanu (UB), Meysam Madadi (CVC/UB), Sergio Escalera (CVC/UB), Aleix M. Martinez (OSU).

This year, the conference took place from the 16th to the 20th of June in Long Beach, California.


CVC Researchers at CVPR 2019

This year, CVC presented a total of 5 papers at the annual Computer Vision and Pattern Recognition conference (CVPR). The conference was held from the 16th to the 20th of June in Long Beach, California.

The work presented at this year’s CVPR was the following:

Orals:

Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados and Yi-Zhe Song.

Tuesday 18th of June at 13:53h (session 13:30–15:20) – Oral 1.2A

In this paper, we investigate the problem of zero-shot sketch-based image retrieval (ZS-SBIR), where human sketches are used as queries to conduct retrieval of photos from unseen categories. We importantly advance prior arts by proposing a novel ZS-SBIR scenario that represents a firm step forward in its practical application. The new setting uniquely recognizes two important yet often neglected challenges of practical ZS-SBIR: (i) the large domain gap between amateur sketch and photo, and (ii) the necessity for moving towards large-scale retrieval. We first contribute to the community a novel ZS-SBIR dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000 photos spanning 110 categories. Highly abstract amateur human sketches are purposefully sourced to maximize the domain gap, instead of ones included in existing datasets that can often be semi-photorealistic. We then formulate a ZS-SBIR framework to jointly model sketches and photos into a common embedding space. A novel strategy to mine the mutual information among domains is specifically engineered to alleviate the domain gap. External semantic knowledge is further embedded to aid semantic transfer. We show that, rather surprisingly, retrieval performance that significantly outperforms the state of the art on existing datasets can already be achieved using a reduced version of our model. We further demonstrate the superior performance of our full model by comparing with a number of alternatives on the newly proposed dataset. The new dataset, plus all training and testing code of our model, will be publicly released to facilitate future research.
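To make the common-embedding idea above concrete, here is a minimal, hypothetical sketch of a shared sketch-photo embedding trained with a triplet loss. It is not the authors’ code: the encoder architecture, dimensions and names are illustrative assumptions, and the real model additionally mines mutual information across domains and injects external semantic knowledge.

```python
# Minimal illustrative sketch of a shared sketch-photo embedding space,
# trained with a triplet loss. This is NOT the Doodle2Search model; it is
# a generic baseline with hypothetical encoder sizes and names.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny CNN mapping a 64x64 image (sketch or photo) to a D-dim embedding."""
    def __init__(self, in_channels=1, dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        h = self.conv(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)  # unit-norm embeddings

sketch_enc = Encoder(in_channels=1)  # encodes query sketches
photo_enc = Encoder(in_channels=3)   # encodes gallery photos

# One training step on random tensors: pull a sketch towards a photo of
# the same category (positive) and away from another category (negative).
sketches = torch.randn(8, 1, 64, 64)
pos_photos = torch.randn(8, 3, 64, 64)
neg_photos = torch.randn(8, 3, 64, 64)

anchor = sketch_enc(sketches)
positive = photo_enc(pos_photos)
negative = photo_enc(neg_photos)
loss = F.triplet_margin_loss(anchor, positive, negative, margin=0.2)
loss.backward()
print(f"triplet loss: {loss.item():.4f}")
```

At retrieval time, gallery photos would be embedded once with the photo encoder and ranked by distance to the embedded query sketch.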

Posters:

Learning Metrics from Teachers: Compact Networks for Image Embedding

Lu Yu, Vacit Oguz Yazici, Xialei Liu, Joost van de Weijer, Yongmei Cheng and Arnau Ramisa

Tuesday 18th of June, 15:20–18:00 – Poster 1.2 – #43


Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval

Sounak Dey, Pau Riba, Anjan Dutta, Josep Llados and Yi-Zhe Song.

Tuesday 18th of June, 15:20–18:00 – Poster 1.2 – #22


Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval

Anjan Dutta and Zeynep Akata

Wednesday 19th of June, 10:00–12:45 – Poster 2.1 – #54

Zero-shot sketch-based image retrieval (SBIR) is an emerging task in computer vision that allows retrieving natural images relevant to sketch queries which might not have been seen in the training phase. Existing works either require aligned sketch-image pairs or an inefficient memory fusion layer for mapping the visual information to a semantic space. In this work, we propose a semantically aligned paired cycle-consistent generative (SEM-PCYC) model for zero-shot SBIR, where each branch maps the visual information to a common semantic space via adversarial training. Each of these branches maintains a cycle consistency that only requires supervision at category level, avoiding the need for highly priced aligned sketch-image pairs. A classification criterion on the generators’ outputs ensures that the visual-to-semantic-space mapping is discriminating. Furthermore, we propose to combine textual and hierarchical side information via a feature selection auto-encoder that selects discriminating side information within the same end-to-end model. Our results demonstrate a significant boost in zero-shot SBIR performance over the state of the art on the challenging Sketchy and TU-Berlin datasets.
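The cycle-consistency and adversarial ingredients described in the abstract can be illustrated with a small, self-contained sketch. The layer sizes, loss weights and names below are assumptions chosen only to expose the structure of the losses; this is not the SEM-PCYC implementation.

```python
# Minimal illustrative sketch of the cycle-consistency idea behind
# zero-shot SBIR models such as SEM-PCYC: a generator maps visual
# features to a semantic space, an inverse generator maps back, and a
# discriminator pushes generated vectors towards real class embeddings.
# All dimensions and names are hypothetical, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VIS, SEM = 512, 300  # e.g. CNN feature size and word-vector size

G = nn.Sequential(nn.Linear(VIS, 256), nn.ReLU(), nn.Linear(256, SEM))     # visual -> semantic
Finv = nn.Sequential(nn.Linear(SEM, 256), nn.ReLU(), nn.Linear(256, VIS))  # semantic -> visual
D = nn.Sequential(nn.Linear(SEM, 128), nn.ReLU(), nn.Linear(128, 1))       # real vs. generated semantics

vis_feat = torch.randn(16, VIS)   # visual features (sketch or photo branch)

sem_hat = G(vis_feat)
# Cycle consistency: visual -> semantic -> visual should reconstruct the
# input, so only category-level supervision is needed (no aligned pairs).
cycle_loss = F.l1_loss(Finv(sem_hat), vis_feat)
# Adversarial loss on the generator side: the discriminator should score
# generated semantic vectors as real class embeddings.
adv_loss = F.binary_cross_entropy_with_logits(D(sem_hat), torch.ones(16, 1))
loss = cycle_loss + 0.1 * adv_loss
loss.backward()
print(f"cycle: {cycle_loss.item():.3f}  adv: {adv_loss.item():.3f}")
```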

What does it mean to learn in deep networks? And, how does one detect adversarial attacks?

Ciprian A. Corneanu, Meysam Madadi, Sergio Escalera and Aleix M. Martinez.

Wednesday 19th of June, 10:00–12:45 – Poster 2.1 – #20

The flexibility and high accuracy of Deep Neural Networks (DNNs) have transformed computer vision. But the fact that we do not know when a specific DNN will work and when it will fail has resulted in a lack of trust. A clear example is self-driving cars; people are uncomfortable sitting in a car driven by algorithms that may fail under some unknown, unpredictable conditions. Interpretability and explainability approaches attempt to address this by uncovering what a DNN models, i.e., what each node (cell) in the network represents and what images are most likely to activate it. This can be used to generate, for example, adversarial attacks. But these approaches do not generally allow us to determine where a DNN will succeed or fail and why, i.e., does this learned representation generalize to unseen samples? Here, we derive a novel approach to define what it means to learn in deep networks, and how to use this knowledge to detect adversarial attacks. We show how this defines the ability of a network to generalize to unseen testing samples and, most importantly, why this is the case.

Good News, Everyone! Context driven entity-aware captioning for news images

Ali Furkan Biten, Lluis Gomez, Marçal Rusiñol and Dimosthenis Karatzas

Thursday 20th of June, 15:20–18:00 – Poster 3.2 – #188

Current image captioning systems perform at a merely descriptive level, essentially enumerating the objects in the scene and their relations. Humans, on the contrary, interpret images by integrating several sources of prior knowledge of the world. In this work, we aim to take a step closer to producing captions that offer a plausible interpretation of the scene, by integrating such contextual information into the captioning pipeline. For this we focus on the captioning of images used to illustrate news articles. We propose a novel captioning method that is able to leverage contextual information provided by the text of news articles associated with an image. Our model is able to selectively draw information from the article guided by visual cues, and to dynamically extend the output dictionary to out-of-vocabulary named entities that appear in the context source. Furthermore, we introduce ‘GoodNews’, the largest news image captioning dataset in the literature, and demonstrate state-of-the-art results.
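The dynamic-vocabulary mechanism, emitting placeholder tokens that are later filled with named entities drawn from the article, can be illustrated with a toy example. The function, tags and data below are hypothetical and far simpler than the attention-based model in the paper.

```python
# Toy illustration of the "dynamic vocabulary" idea used by entity-aware
# news captioners: the decoder emits placeholder tokens (e.g. <PERSON>)
# that are filled with named entities found in the associated article.
# A deliberately simplified sketch, not the GoodNews model.
from typing import Dict, List

def fill_entities(template: List[str], article_entities: Dict[str, List[str]]) -> str:
    """Replace placeholder tokens with article entities, in order of appearance."""
    used = {tag: 0 for tag in article_entities}
    out = []
    for tok in template:
        if tok in article_entities and used[tok] < len(article_entities[tok]):
            out.append(article_entities[tok][used[tok]])
            used[tok] += 1
        else:
            out.append(tok)
    return " ".join(out)

# A captioning model would produce the template, and entities would come
# from an NER system run on the news article; both are hard-coded here.
template = ["<PERSON>", "speaks", "at", "a", "rally", "in", "<GPE>"]
entities = {"<PERSON>": ["Jane Doe"], "<GPE>": ["Barcelona"]}
print(fill_entities(template, entities))
# -> "Jane Doe speaks at a rally in Barcelona"
```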


A new symposium claims the importance of Barcelona in Artificial Intelligence


Global experts in Deep Learning will meet in Barcelona on December 20 and 21.

On December 20 and 21, Barcelona will host the Deep Learning Barcelona Symposium (DLBCN), a new symposium on Deep Learning, one of the driving forces of the current technological revolution in Artificial Intelligence and Computer Vision. The symposium brings together top-level researchers who are either currently developing their research in Barcelona or have pursued part of their academic career in the city.

Among them are Oriol Vinyals, one of the world’s leading experts in Deep Learning, currently at Google DeepMind, and Antonio Torralba, director of the new program that MIT has designed to understand intelligence.

This unique meeting brings together multidisciplinary researchers in the field of Deep Learning and it shows the potential of Barcelona to become the hub of AI in Southern Europe.

The symposium is organized by leading universities and local research centers in the field of Deep Learning.

Further information: http://deeplearning.barcelona.


3 CVC Papers accepted at this year’s NeurIPS


CVC researchers have three accepted papers at this year’s Neural Information Processing Systems (NeurIPS) conference, which will take place in Montreal from the 2nd to the 8th of December 2018. The papers are the following:

‘Memory Replay GANs: learning to generate images from new categories without forgetting’, by authors Chenshen Wu, Luis Herranz, Xialei Liu, Yaxing Wang, Dr. Joost van de Weijer and Dr. Bogdan Raducanu (all from CVC);

‘Image-to-image translation for cross-domain disentanglement’, by authors Dr. Abel González, Dr. Joost van de Weijer and Dr. Yoshua Bengio;

‘TADAM: Task dependent adaptive metric for improved few-shot learning’, by authors Boris N. Oreshkin, Pau Rodriguez (CVC member) and Dr. Alexandre Lacoste.

 


CVC researchers at this year’s ECCV2018

Several CVC researchers attended this year’s European Conference on Computer Vision (ECCV), which took place in Munich, Germany, from the 8th to the 14th of September. CVC presented 6 papers at the main conference and several more at the conference’s workshops:

Dr. Lluís Gómez, Dr. Marçal Rusiñol and Andrés Mafla presented their paper ‘Single Shot Scene Text Retrieval‘; Yaxing Wang presented his ‘Transferring GANs: generating images from limited data‘; Pau Rodriguez presented ‘Attend and Rectify: a Gated Attention Mechanism for Fine-Grained Recovery‘; and Dr. Sergio Escalera presented two papers, ‘Folded Recurrent Neural Networks for Future Video Prediction‘ and ‘Deep Structure Inference Network for Facial Action Unit Recognition‘, along with Dr. Meysam Madadi. Dr. Antonio López and Felipe Codevilla presented their paper ‘On Offline Evaluation of Vision-based Driving Models‘.

On the other hand, Raúl Gómez gave a poster and an oral presentation of his work ‘Learning to Learn from Web Data‘ at the ECCV 1st Multimodal Learning and Applications Workshop; Dr. Antonio López presented a demo of the CARLA simulator; and Dena Bazazian was one of the PhD students selected to organise the Women in Computer Vision ECCV 2018 Workshop, besides presenting her poster ‘Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images‘ at the EPIC Workshop.

Have a look at our ECCV2018 Twitter Moment.

Related articles: 

6 CVC Papers Accepted At This Year’s ECCV

Dena Bazazian, Organiser Of The Women In Computer Vision ECCV 2018 Workshop

 


CVC researchers at this year’s ICPR2018

Several CVC researchers attended this year’s International Conference on Pattern Recognition (ICPR), held in Beijing in August. CVC presented 5 papers in total: two oral presentations and three posters.

Xialei Liu gave an oral presentation of his paper ‘Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting‘, and Pau Riba gave an oral presentation of his paper ‘Learning Graph Distances with Message Passing Neural Networks‘, winning the Best Student Scientific Paper Award of Track 5 on Document Analysis and Recognition.

On the other hand, Lu Yu presented her poster ‘Weakly Supervised Domain-Specific Color Naming Based on Attention‘, Gemma Rotger presented hers, ‘2D-to-3D Facial Expression Transfer‘, and Sounak Dey presented ‘Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch‘.

 

 


6 CVC papers accepted at this year’s ECCV


6 CVC papers have been accepted at this year’s European Conference on Computer Vision (ECCV), which will take place in Munich from the 8th to the 14th of September. Most of the papers aren’t accessible yet; we will be publishing them as they become public.

For now, we only have two available papers and several tentative titles:

Folded Recurrent Neural Networks for Future Video Prediction

Deep Structure Inference Network for Facial Action Unit Recognition

Single Shot Scene Text Retrieval

Transferring GANs: generating images from limited data

Attend and Rectify: a Gated Attention Mechanism for Fine-Grained Recovery

On Offline Evaluation of Vision-based Driving Models
