Small group pedestrian crossing behaviour prediction using temporal angular 2D skeletal pose

Hanugra Aulia Sidharta, Berlian Al Kindhi, Eko Mulyanto Yuniarno, Mauridhi Hery Purnomo*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

A pedestrian is classified as a Vulnerable Road User (VRU) because pedestrians lack protective equipment, which makes an accident involving them more likely to be fatal. An accident can happen whenever a pedestrian is on the road, especially while crossing. To ensure pedestrian safety, it is necessary to understand and predict pedestrian behaviour when crossing the road. We propose pedestrian intention prediction using a 2D pose estimation approach with temporal angle as a feature. Based on visual observation of the Joint Attention in Autonomous Driving (JAAD) dataset, we found that pedestrians tend to walk together in small groups while waiting to cross, and that the group disbands on the opposite side of the road. We therefore propose to perform prediction on small groups of pedestrians. Based on pedestrian statistical data, we define a small group as consisting of 4 pedestrians. A further problem is that 2D pose estimation processes each pedestrian individually, so the pedestrian index becomes ambiguous across consecutive frames. We propose a Multi Input Single Output (MISO) model, which processes multiple pedestrians together and uses a summation layer at the end of the model to resolve the ambiguous pedestrian index problem without tracking each pedestrian. Our proposed model achieves a model accuracy of 0.9306 and a prediction performance of 0.8317.
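The two ideas in the abstract, an angle feature taken from 2D keypoints over time and a summation that makes the result independent of pedestrian order, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the joint names (`hip`, `knee`, `ankle`), and the choice to sum raw angle sequences rather than learned branch outputs are all assumptions made for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the 2D points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        # Degenerate case (coincident keypoints): return 0 by convention.
        return 0.0
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_t = max(-1.0, min(1.0, cos_t))
    return math.degrees(math.acos(cos_t))


def temporal_angles(frames, triplet):
    """Angle at the middle joint of `triplet` for each frame.

    `frames` is a list of dicts mapping a (hypothetical) joint name
    to an (x, y) keypoint; `triplet` names three joints.
    """
    a, b, c = triplet
    return [joint_angle(f[a], f[b], f[c]) for f in frames]


def group_sum(per_pedestrian_features):
    """Element-wise sum over pedestrians.

    Summation is commutative, so the result does not depend on the
    order (index) in which pedestrians are listed.
    """
    return [sum(values) for values in zip(*per_pedestrian_features)]


# Two pedestrians observed over two frames (made-up coordinates).
ped_a = [
    {"hip": (0, 0), "knee": (0, 1), "ankle": (0, 2)},  # straight leg
    {"hip": (0, 0), "knee": (0, 1), "ankle": (1, 1)},  # bent leg
]
ped_b = [
    {"hip": (5, 0), "knee": (5, 1), "ankle": (6, 1)},
    {"hip": (5, 0), "knee": (5, 1), "ankle": (5, 2)},
]
leg = ("hip", "knee", "ankle")
feat_a = temporal_angles(ped_a, leg)
feat_b = temporal_angles(ped_b, leg)

# Swapping the pedestrian order leaves the summed feature unchanged.
summed_ab = group_sum([feat_a, feat_b])
summed_ba = group_sum([feat_b, feat_a])
```

Because the sum is the same for any ordering of the inputs, a model that ends in a summation over per-pedestrian branches does not need a consistent pedestrian index from frame to frame, which is the property the abstract relies on to avoid tracking.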

Original language: English
Article number: 100341
Journal: Array
Volume: 22
Publication status: Published - Jul 2024

Keywords

  • 2D pose estimation
  • Pedestrian crossing behaviour
  • Small group of pedestrian
  • Temporal angular
