Abstract
Electricity theft remains a significant challenge for power utilities, especially in developing countries, causing considerable non-technical losses and operational inefficiencies. This study presents a machine learning-based framework for electricity theft detection that combines customer consumption pattern attributes, such as tariff category, contracted power, and service type, with geographic context, including transformer-level fraud rates and district-level poverty indices. The model is trained and tested on real-world data from PLN, drawn from on-site inspections conducted between 2019 and 2023 under the Electricity Usage Enforcement Program and comprising over 6.7 million rows. Fifteen classifiers are compared using stratified 10-fold cross-validation on a 70% training split, with the remaining 30% held out for final testing; synthetic oversampling is avoided to preserve the genuine data distribution. The top model, a Gradient Boosting Classifier, achieves an F1-score of 0.92 and an AUC of 0.85 on the holdout set, detecting 93% of all inspected cases, both theft and non-theft, at 92% precision. Feature-importance and confusion-matrix analyses confirm that the framework is effective at minimizing false positives while surfacing the most informative risk indicators for targeted inspections. By relying solely on real inspection data and scalable preprocessing pipelines, the approach gives utilities an intelligent, data-driven tool for proactive fraud prevention, optimized resource allocation, and a significant reduction in non-technical losses.
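The evaluation protocol described in the abstract (a stratified 70/30 holdout, then stratified 10-fold cross-validation on the training portion, with no synthetic oversampling) can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not name the library or the code used, the helper names `stratified_holdout`, `stratified_folds`, and `positive_rate` are hypothetical, and the labels are synthetic with an arbitrary 10% theft rate rather than PLN data.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.3, seed=0):
    """Split indices into train/test so each class keeps its original share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        n_test = round(len(class_indices) * test_fraction)
        test.extend(class_indices[:n_test])
        train.extend(class_indices[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(labels, indices, k=10, seed=0):
    """Deal each class's indices round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for class_indices in by_class.values():
        rng.shuffle(class_indices)
        for position, idx in enumerate(class_indices):
            folds[position % k].append(idx)
    return folds


def positive_rate(labels, indices):
    """Share of indices whose label is 1 (theft found)."""
    return sum(labels[i] for i in indices) / len(indices)


# Synthetic labels for illustration only: 1 = theft found, 0 = no theft.
labels = [1] * 1000 + [0] * 9000
train_idx, test_idx = stratified_holdout(labels)   # 70/30 split
folds = stratified_folds(labels, train_idx)        # 10 folds on the training split

print(len(train_idx), len(test_idx))
print(positive_rate(labels, train_idx), positive_rate(labels, test_idx))
print([round(positive_rate(labels, fold), 3) for fold in folds])
```

Because no records are duplicated or synthesized, the class ratio in every fold and in the holdout set matches the ratio in the full dataset, which is the property the abstract attributes to avoiding oversampling. Each candidate classifier would then be trained on nine folds and scored on the tenth, with the holdout set reserved for the final model.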
| Original language | English |
|---|---|
| Title of host publication | 2025 International Conference on Data Science and Its Applications, ICoDSA 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 534-539 |
| Number of pages | 6 |
| ISBN (Electronic) | 9798331598549 |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 8th International Conference on Data Science and Its Applications, ICoDSA 2025 - Hybrid, Jakarta, Indonesia. Duration: 3 Jul 2025 → 5 Jul 2025 |
Publication series
| Name | 2025 International Conference on Data Science and Its Applications, ICoDSA 2025 |
|---|---|
Conference
| Conference | 8th International Conference on Data Science and Its Applications, ICoDSA 2025 |
|---|---|
| Country/Territory | Indonesia |
| City | Hybrid, Jakarta |
| Period | 3/07/25 → 5/07/25 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 10 Reduced Inequalities
Keywords
- behavioral features
- electricity theft detection
- geospatial analysis
- gradient boosting
- machine learning
Fingerprint
Dive into the research topics of 'Machine Learning-based Electricity Theft Detection Considering Customer Consumption Pattern and Geographical Condition'. Together they form a unique fingerprint.