TY - GEN
T1 - Emergency Sound Classification and Visual Alert System for Enhanced Situational Awareness
AU - Kamelia, Riza
AU - Kusuma, Hendra
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - This research introduces an audio classification system designed to enhance situational awareness for individuals with hearing impairments. The system recognizes emergency sounds and presents corresponding visual alerts. Utilizing Google's pre-trained YAMNet model, it accurately identifies crucial sounds such as ambulance, firetruck, and police sirens, as well as railroad crossing and other danger alarms, distinguishing them from typical background noise. The system, deployed on an Intel Core i5 processor with an NVIDIA GeForce RTX 2050 (compute capability 8.6), extracts audio features from sound files and classifies them using a trained model. Each identified sound category triggers a specific visual indicator on a graphical interface. To enhance robustness in noisy environments, a noise reduction preprocessing step is applied, improving classification accuracy. Testing demonstrates a 93.68% accuracy rate in emergency sound detection with an average latency of 2.9 ms for sound classification. The simulation latency ranges between 120 ms and 200 ms. These results highlight the system's potential for real-world applications in public spaces and personal safety devices. This work represents a significant advancement in accessible alert systems, extending situational awareness tools to the hearing-impaired and contributing to broader public safety.
KW - Artificial Intelligence
KW - Audio Signal Processing
KW - Dangerous Alarm Classification
KW - Machine Learning
KW - YAMNet
UR - https://www.scopus.com/pages/publications/85218349379
U2 - 10.1109/ICTeD62334.2024.10844649
DO - 10.1109/ICTeD62334.2024.10844649
M3 - Conference contribution
AN - SCOPUS:85218349379
T3 - 2024 International Conference on TVET Excellence and Development, ICTeD 2024
SP - 213
EP - 218
BT - 2024 International Conference on TVET Excellence and Development, ICTeD 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Conference on TVET Excellence and Development, ICTeD 2024
Y2 - 16 December 2024 through 17 December 2024
ER -