Pedestrian Intention Prediction for Autonomous Driving
Naveed Riad will defend his PhD thesis on May 26, 2026.
What is the thesis about?
Artificial Intelligence has become a cornerstone of intelligent transportation systems (ITS), where anticipating human behavior has a direct impact on road safety. Pedestrian intention prediction (PIP), which determines whether a pedestrian will cross in front of an ego-vehicle, is a critical task that enables autonomous systems to perform proactive and safe maneuvers.
However, progress in PIP has been hindered by three persistent challenges: the scarcity of diverse labeled datasets, the dependence on costly manual labeling, and the complexity of multi-modal approaches that limit real-time deployment.
To address dataset scarcity, this thesis introduces PedSynth, a CARLA-based synthetic dataset providing diverse crossing (C) and non-crossing (NC) scenarios that complement real-world datasets such as JAAD and PIE. Building on this, we propose PedGNN, a lightweight graph-based recurrent model that leverages pedestrian skeleton sequences for crossing prediction and is optimized for onboard deployment, with fast inference and a minimal memory footprint.
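To make the idea of a graph over skeleton keypoints concrete, the following is a minimal sketch of one neighbour-aggregation step on a single frame. It is not the PedGNN implementation: the 5-joint skeleton, the edge list, the coordinates, and the helper names `normalized_adjacency` and `graph_conv` are all assumptions chosen for illustration, and the learned weights and recurrent part of a real model are omitted.

```python
import math

# Hypothetical 5-joint skeleton (head, neck, hip, left foot, right foot).
# Real pose estimators output more keypoints; this layout is an assumption.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_nodes, edges):
    """Return D^(-1/2) (A + I) D^(-1/2) as a list of rows."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop so each joint keeps its own features
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0  # bones are undirected
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def graph_conv(adj, features):
    """Aggregate neighbour features: out = adj @ features (no learned weights)."""
    dims = len(features[0])
    return [[sum(adj[i][k] * features[k][d] for k in range(len(features)))
             for d in range(dims)]
            for i in range(len(adj))]


# One frame of made-up (x, y) keypoints in normalized image coordinates.
frame = [[0.50, 0.10], [0.50, 0.25], [0.50, 0.55], [0.45, 0.95], [0.55, 0.95]]
A_HAT = normalized_adjacency(NUM_JOINTS, EDGES)
smoothed = graph_conv(A_HAT, frame)
```

In a graph-based recurrent model, a step like this would typically be applied to every frame, and the resulting per-frame features passed to a recurrent layer that outputs the C/NC prediction.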
To overcome manual labeling bottlenecks, we propose S2R-UDA-CP, a synth-to-real unsupervised domain adaptation procedure that combines a synthetic dataset (PedSynth) with an untrained PedGNN to automatically generate C/NC labels for real-world videos. Evaluations with state-of-the-art models like ST-CrossingPose and PedGraph+ show that training on these automatic labels yields performance comparable to human-labeled data, while revealing insights into labeling practices.
Finally, to reduce model complexity, we develop PedGT, a graph-based transformer that combines GCNs for spatial reasoning with transformer encoders for temporal modeling, operating solely on skeleton keypoints and bounding box centers. PedGT achieves state-of-the-art F1-score and recall on PIE and JAAD, and within S2R-UDA-CP attains the highest accuracy with superior training stability. Together, these contributions pave the way for accurate, robust, and real-time PIP systems, strengthening the safety of autonomous driving and ADAS.
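For readers unfamiliar with the metrics mentioned above, here is a minimal sketch of how recall and F1-score can be computed for binary C/NC predictions. The labels are made up for illustration, treating crossing (C) as the positive class is an assumption, and `recall_and_f1` is a hypothetical helper, not code from the thesis.

```python
def recall_and_f1(y_true, y_pred, positive=1):
    """Return (recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return recall, f1


# Made-up labels: 1 = crossing (C), 0 = non-crossing (NC).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
recall, f1 = recall_and_f1(y_true, y_pred)
```

Recall matters in this setting because a missed crossing pedestrian (a false negative) is usually the costlier error for a vehicle, while F1 balances recall against false alarms.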
Keywords
Intelligent transportation systems (ITS), vision-based autonomous driving, pedestrian intention prediction, graph neural networks (GNN), transformer models, synthetic dataset generation