Upcoming PhD defence
Albin Soutif–Cormerais will defend his PhD thesis on May 23, 2024, at 11:00.
What is the thesis about?
The use of deep learning has become increasingly popular in recent years in many application fields, such as computer vision and natural language processing. Most tasks in these fields are now tackled more effectively by deep learning than by more classical techniques, provided that enough data is available. However, deep learning algorithms still lack a crucial property: they are not able to efficiently accumulate new knowledge into an existing model. Instead, when learning from new data without revisiting past data, they experience catastrophic forgetting. This problem is the main focus of the sub-field of continual learning. The absence of this property has several practical consequences, among them the computational expense of learning algorithms that revisit all previously seen data, which comes at a non-negligible energy cost, and privacy issues arising from the requirement to store old data for later training.
In this thesis, we investigate the impact of learning in a continual manner on the performance of neural networks, more specifically for classification tasks in computer vision. We investigate the causes of catastrophic forgetting within several commonly studied setups of continual learning. We study the continual learning setting where data associated with distinct sets of classes arrive incrementally. Under this setting, we investigate how the difficulty of learning cross-task features accounts for the loss in performance. Part of the thesis is dedicated to the more complex setting of online continual learning and to the problem of the stability gap. We investigate the impact of temporal ensembling on the stability gap and show that it can be drastically reduced by applying an ensembling method at evaluation time, without influencing the training process. In addition, we carry out a survey of online continual learning methods and conclude that they might be more affected by an under-fitting problem than by the non-i.i.d. training procedure. Finally, we focus on larger models that have had a strong first learning experience, and study the impact of continual learning on smaller subsequent experiences when using low-rank parameter updates.
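To make the evaluation-time ensembling idea more concrete, the sketch below maintains an exponential moving average (EMA) of the model weights during training and uses only the averaged copy for evaluation, leaving the training updates untouched. This is a minimal, hypothetical PyTorch-style illustration: the decay value and the helper names (`update_ema`, `train_online`) are illustrative assumptions, not the exact method from the thesis.

```python
import copy
import torch

@torch.no_grad()
def update_ema(ema_model, model, decay=0.99):
    """Move the EMA weights a small step towards the current online weights."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

def train_online(model, stream, optimizer, criterion):
    """Hypothetical online training loop: the online model is trained as usual,
    while the EMA copy is only averaged (never back-propagated through) and is
    the one used at evaluation time."""
    ema_model = copy.deepcopy(model)      # evaluation-time ensemble of past weights
    for x, y in stream:                   # non-i.i.d. stream of mini-batches
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        update_ema(ema_model, model)      # update the ensemble after each step
    return ema_model                      # evaluate this copy, not `model`
```

Because the averaged copy changes more slowly than the online weights, its accuracy on previously learned classes fluctuates less between updates, which is the intuition behind using it to reduce the stability gap.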
Keywords: deep learning, continual learning, online learning.