Pedestrian Intention Prediction for Autonomous Driving

Naveed Riad will defend his PhD thesis on May 26, 2026.

What is the thesis about?

Artificial Intelligence has become a cornerstone of intelligent transportation systems (ITS), where anticipating human behavior has a direct impact on road safety. Pedestrian intention prediction (PIP), which determines whether a pedestrian will cross in front of an ego-vehicle, is a critical task that enables autonomous systems to perform proactive and safe maneuvers.
However, progress in PIP has been hindered by three persistent challenges: the scarcity of diverse labeled datasets, the dependence on costly manual labeling, and the complexity of multi-modal approaches that limit real-time deployment.

To address dataset scarcity, this thesis introduces PedSynth, a CARLA-based synthetic dataset providing diverse crossing (C) and non-crossing (NC) scenarios that complement real-world datasets such as JAAD and PIE. Building on this dataset, we propose PedGNN, a lightweight graph-based recurrent model that uses pedestrian skeleton sequences for crossing prediction and is optimized for onboard deployment, with fast inference and a minimal memory footprint.
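
To make the idea of graph-based processing of skeletons concrete, here is a minimal sketch of a single graph-convolution step over one frame of joint coordinates. It is not PedGNN itself: the 5-joint skeleton, the bone list, the function names, and the identity weight matrix are all assumptions chosen for illustration, and the layer is the widely used symmetrically normalized form ReLU(D^-1/2 (A + I) D^-1/2 X W).

```python
import math

# Hypothetical 5-joint skeleton; real pose estimators output more keypoints.
JOINTS = ["head", "torso", "left_hand", "right_hand", "hips"]
BONES = [(0, 1), (1, 2), (1, 3), (1, 4)]  # undirected edges between joint indices


def build_neighbors(num_nodes, edges):
    """Return one neighbor set per node, with a self-loop on every node."""
    neighbors = [{i} for i in range(num_nodes)]
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def graph_conv(features, neighbors, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    features: list of per-joint feature vectors (here [x, y] coordinates).
    weight:   in_dim x out_dim matrix given as a list of rows.
    """
    degree = [len(n) for n in neighbors]  # degrees include the self-loop
    in_dim, out_dim = len(weight), len(weight[0])
    output = []
    for i, nbrs in enumerate(neighbors):
        # Symmetrically normalized sum over the joint and its neighbors.
        agg = [0.0] * in_dim
        for j in nbrs:
            norm = 1.0 / math.sqrt(degree[i] * degree[j])
            for k in range(in_dim):
                agg[k] += norm * features[j][k]
        # Linear projection followed by ReLU.
        row = [
            max(0.0, sum(agg[k] * weight[k][o] for k in range(in_dim)))
            for o in range(out_dim)
        ]
        output.append(row)
    return output


neighbors = build_neighbors(len(JOINTS), BONES)
# Made-up normalized image coordinates for one video frame.
frame = [[0.5, 0.1], [0.5, 0.4], [0.3, 0.5], [0.7, 0.5], [0.5, 0.7]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights
embedded = graph_conv(frame, neighbors, identity)
```

In a recurrent model of the kind described, per-frame embeddings like `embedded` would be fed through a sequence model over time; that part is omitted here.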

To overcome manual labeling bottlenecks, we propose S2R-UDA-CP, a synth-to-real unsupervised domain adaptation procedure that combines a synthetic dataset (PedSynth) with an untrained PedGNN to automatically generate C/NC labels for real-world videos. Evaluations with state-of-the-art models like ST-CrossingPose and PedGraph+ show that training on these automatic labels yields performance comparable to human-labeled data, while revealing insights into labeling practices.
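
The summary above does not detail how S2R-UDA-CP assigns labels. As a generic illustration of automatic labeling, the sketch below shows confidence-thresholded pseudo-labeling, a common ingredient of unsupervised domain adaptation pipelines. The function name, the thresholds, and the clip scores are hypothetical and are not taken from the thesis.

```python
def pseudo_label(scores, low=0.2, high=0.8):
    """Map predicted crossing probabilities to C / NC pseudo-labels.

    scores: dict of clip id -> predicted probability of crossing.
    Clips whose score falls strictly between the two thresholds are
    left unlabeled instead of being given an uncertain label.
    """
    labels = {}
    for clip_id, prob in scores.items():
        if prob >= high:
            labels[clip_id] = "C"
        elif prob <= low:
            labels[clip_id] = "NC"
    return labels


# Made-up model outputs for three unlabeled real-world clips.
real_scores = {"clip_a": 0.93, "clip_b": 0.08, "clip_c": 0.55}
labels = pseudo_label(real_scores)
```

Here `clip_c` stays unlabeled because its score is too close to the decision boundary; the labeled clips could then serve as training data for a downstream model.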

Finally, to reduce model complexity, we develop PedGT, a graph-based transformer that combines GCNs for spatial reasoning with transformer encoders for temporal modeling, operating solely on skeleton keypoints and bounding box centers. PedGT achieves state-of-the-art F1-score and recall on PIE and JAAD, and within S2R-UDA-CP attains the highest accuracy with superior training stability. Together, these contributions pave the way for accurate, robust, and real-time PIP systems, strengthening the safety of autonomous driving and advanced driver-assistance systems (ADAS).
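
For readers less familiar with the reported metrics, the sketch below computes recall and F1-score for binary C/NC predictions using their standard definitions. Treating crossing (C) as the positive class is an assumption made here, and the labels in the example are invented, not results from the thesis.

```python
def recall_and_f1(y_true, y_pred, positive="C"):
    """Return (recall, F1) for one positive class, using standard definitions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when a class is never seen or predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return recall, f1


# Invented ground-truth and predicted labels for five pedestrians.
y_true = ["C", "C", "NC", "NC", "C"]
y_pred = ["C", "NC", "NC", "C", "C"]
recall, f1 = recall_and_f1(y_true, y_pred)
```

Recall measures how many truly crossing pedestrians were detected, which matters for safety because a missed crossing is costlier than a false alarm; F1 balances recall against precision.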

Keywords

Intelligent transportation systems (ITS), vision-based autonomous driving, pedestrian intention prediction, graph neural networks (GNN), transformer models, synthetic dataset generation