Towards Robust Multiple-Target Tracking in Unconstrained Human-Populated Environments

CVC has a new PhD on its record!

Dani Rowe successfully defended his dissertation in Computer Science on February 8, 2008, and is now a Doctor of Philosophy from the Universitat Autònoma de Barcelona.

What is the thesis about?

Natural vision systems achieve remarkable performance in detecting and tracking multiple moving objects simultaneously. Accurate and robust multiple-target tracking is also a key task in many promising computer-vision applications, and recent technological advances mean that practical uses of the proposed algorithms can now be tackled in real time. Nevertheless, the task remains a huge challenge because of the numerous specific problems it involves: proposals must deal with multiple highly non-rigid targets which move unpredictably through unconstrained, dynamic, open-world scenarios. In this thesis, a principled hierarchical architecture for multiple-target tracking is presented. Prior to this, another tracking approach, based on particle filtering, is developed and evaluated. The resulting system is modular and hierarchically organised: it consists of a detection level which feeds a two-level tracking subsystem.

Co-operating modules, distributed throughout this architecture, work following both bottom-up and top-down approaches. Contributions include both the architecture itself and the development, improvement and integration of the different modules. The proposed architecture introduces the synergies needed for the system to tackle unconstrained multiple-target tracking. With respect to the different modules, the main focus is placed on high-level tracking algorithms. Since a careful analysis of motion events is critical for successful tracking, a module for principled event management is proposed and embedded in the system. It considers multiple-target interaction events, together with a proper scheme for tracker instantiation and removal according to scene events. The system can thus switch between its two operation modes, motion-based tracking and appearance-based tracking. This leads to another remarkable characteristic of the system: its ability to track numerous targets continuously and independently while they group and split.

Multiple appearance models are built and constantly updated. Special attention is paid to maximising the discrimination between the target and potential distractors by means of appropriate feature selection and a careful combination of all available sources of information. The system works as a stand-alone application in unfriendly, complex and dynamic scenarios. No a priori knowledge about either the scene or the targets, obtained from a previous off-line training period, is needed. No camera calibration is required, since tracking is achieved without the need for 3D information. Successful tracking has been demonstrated in multiple sequences from both indoor and outdoor scenarios. Accurate and robust localisation is obtained even during long-term target clustering and occlusions. Results are comprehensively analysed.
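
To make the event-management idea in the abstract more concrete, here is a minimal Python sketch of how scene events might drive tracker instantiation, removal and mode switching. It is an illustration only, not the thesis's implementation: the names `Mode`, `Event`, `Tracker` and `EventManager` are hypothetical, and the rule that grouped targets are tracked by appearance while isolated targets are tracked by motion is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Mode(Enum):
    MOTION = auto()      # assumed: used while the target is isolated
    APPEARANCE = auto()  # assumed: used while the target is in a group


class Event(Enum):
    ENTER = auto()  # a new target appears in the scene
    GROUP = auto()  # the target merges with other targets
    SPLIT = auto()  # the target separates from its group
    EXIT = auto()   # the target leaves the scene


@dataclass
class Tracker:
    target_id: int
    mode: Mode = Mode.MOTION


class EventManager:
    """Hypothetical manager: creates, removes and re-modes trackers."""

    def __init__(self):
        self.trackers = {}

    def handle(self, event, target_id):
        if event is Event.ENTER:
            # Instantiate a tracker for the newly detected target.
            self.trackers[target_id] = Tracker(target_id)
        elif event is Event.EXIT:
            # Remove the tracker; ignore unknown targets.
            self.trackers.pop(target_id, None)
        elif target_id in self.trackers:
            tracker = self.trackers[target_id]
            if event is Event.GROUP:
                tracker.mode = Mode.APPEARANCE
            elif event is Event.SPLIT:
                tracker.mode = Mode.MOTION


manager = EventManager()
for event, target_id in [
    (Event.ENTER, 1),
    (Event.ENTER, 2),
    (Event.GROUP, 1),
    (Event.GROUP, 2),
    (Event.SPLIT, 1),
    (Event.EXIT, 2),
]:
    manager.handle(event, target_id)
```

In this sketch each target keeps its own tracker throughout grouping and splitting, which mirrors the abstract's point that targets are tracked independently even while they interact.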