Multimodal Stereo from Thermal Infrared and Visible Spectrum
Place: Large Lecture Room
Affiliation: Computer Vision Centre and Dep. of Computer Science, UAB
Recent advances in thermal infrared imaging (LWIR) has allowed its use in applications beyond of the military domain. Nowadays, this new family of sensors is included in different technical and scientific applications. They offer features that facilitate tasks, such as detection of pedestrians, hot spots, differences in temperature, among others, which can significantly improve the performance of a system where the persons are expected to play the principal role. For instance, video surveillance applications, monitoring, and pedestrian detection.
During this dissertation the next question is stated: Could a couple of sensors measuring different bands of the electromagnetic spectrum, as the visible and thermal infrared, be used to extract depth information? Although it is a complex question, we shows that a system of these characteristics is possible as well as their advantages, drawbacks, and potential opportunities.
The matching and fusion of data coming from different sensors, as the emissions registered at visible and infrared bands, represents a special challenge, because it has been showed that theses signals are weak correlated. Therefore, many traditional techniques of image processing and computer vision are not helpful, requiring adjustments for their correct performance in every modality.
In this research an experimental study that compares different cost functions and matching approaches is performed, in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, specially in the outdoor scenes are identified. Our framework summarizes the architecture of a generic stereo algorithm, at different levels: computational, functional, and structural, which can be extended toward high-level fusion (semantic) and high-order (prior).The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in form of priors and assumptions. Finally, this dissertation shows a promissory way toward the integration of multiple sensors for recovering three-dimensional information.