Self-supervised Learning from Images, and Augmentations

CVC Seminar


In this talk, Yuki M. Asano will discuss pushing the limits of what can be learnt without using any human annotations. After an overview of what self-supervised learning is, he will dive into how clustering can be combined with representation learning using optimal transport ([1] @ ICLR'20), a paradigm still relevant in current SoTA models like SwAV/DINO/MSN. Next, he will show how self-supervised clustering can be used for unsupervised segmentation in images ([2] @ CVPR'22). Finally, he will analyze one of the key ingredients of self-supervised learning: the augmentations. Here, he will show that it is possible to extrapolate to semantic classes such as those of ImageNet or Kinetics using just a single datum as visual input when combined with strong augmentations and a pretrained teacher ([3] @ ICLR'23).


Short bio:

Yuki M. Asano is an assistant professor of computer vision and machine learning at the QUVA lab at the University of Amsterdam. He completed his PhD at the Visual Geometry Group at the University of Oxford, supervised by Andrea Vedaldi. His interests include computer vision, self-supervised learning, multi-modal learning, and privacy. He is the main organizer of the ECCV workshop series "Self-supervised Learning - What is next?". During his PhD, he won the Qualcomm fellowship and interned at Meta AI.