
Fraud Detection in Indonesian Administrative Health Records using Cluster-Based Oversampling Methods

  • Tegar Ganang Satrio Priambodo
  • Hilmi Zharfan Rachmadi
  • Fajra Hanifa Nuridi Radam
  • Laurensia Simanihuruk
  • Diana Purwitasari
  • Institut Teknologi Sepuluh Nopember

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This study aims to improve healthcare fraud detection in BPJS Kesehatan's claims verification, where mismatches between billed amounts and INACBGs rates often cause underpayment and financial strain on providers. A key challenge is class imbalance in fraud datasets, which limits the effectiveness of conventional detection methods. Prior work relied on oversampling methods such as SMOTE and random oversampling (ROS), which often generate noisy samples; this study instead introduces a cluster-based oversampling framework that preserves the distribution of the claims data. It combines six cluster-guided techniques (AgglomerativeROS, AgglomerativeSMOTE, DBSCANROS, DBSCANSMOTE, KMeansROS, KMeansSMOTE) with ensemble learning models (Decision Tree, Random Forest, Balanced Random Forest, Gradient Boosting, CatBoost). The CatBoost model with KMeansROS achieved strong results (AUC-PRC: 0.93924, precision: 0.85714, recall: 0.92308), improving recall by 19.3%, which benefits both fraud detection and the sustainability of healthcare financing.
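
The abstract names KMeansROS but does not describe its steps. As a rough illustration only, one plausible reading is: cluster the minority (fraud) class with k-means, then duplicate existing minority rows cluster by cluster so that each cluster keeps its share of the class. The sketch below uses only the Python standard library; the function names `kmeans` and `cluster_ros`, the proportional per-cluster quota, and the toy data are all assumptions made for this example, not the authors' implementation. In practice a library clusterer such as scikit-learn's `KMeans` would likely replace the small k-means loop.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means (illustrative); returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_ros(X, y, minority=1, k=2, seed=0):
    """Hypothetical KMeansROS-style resampler for a binary problem.

    Duplicates minority rows until both classes have the same size, giving
    each minority cluster a quota proportional to its size (an assumption).
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    if n_needed <= 0 or not minority_rows:
        return list(X), list(y)

    k = min(k, len(minority_rows))
    clusters = defaultdict(list)
    for row, c in zip(minority_rows, kmeans(minority_rows, k, seed=seed)):
        clusters[c].append(row)

    # Largest clusters first, so leftover samples go to them.
    groups = sorted(clusters.values(), key=len, reverse=True)
    quotas = [n_needed * len(g) // len(minority_rows) for g in groups]
    for i in range(n_needed - sum(quotas)):
        quotas[i] += 1

    # Random oversampling inside each cluster: copy existing rows only.
    new_rows = [rng.choice(g) for g, q in zip(groups, quotas) for _ in range(q)]
    return list(X) + new_rows, list(y) + [minority] * len(new_rows)


if __name__ == "__main__":
    # Toy data (made up): six majority rows, four minority rows in two groups.
    X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.1), (0.2, 0.3),
         (5.0, 5.0), (5.1, 5.2), (9.0, 9.0), (9.1, 8.9)]
    y = [0] * 6 + [1] * 4
    X_res, y_res = cluster_ros(X, y, minority=1, k=2)
    print(len(X_res), y_res.count(0), y_res.count(1))
```

Because rows are copied rather than interpolated, no synthetic points are created between clusters, which is one way a cluster-guided scheme could avoid the noisy samples the abstract attributes to plain SMOTE and ROS. The resampled training set would then be passed to a classifier such as CatBoost.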

Original language: English
Title of host publication: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331522780
Publication status: Published - 2025
Event: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025 - Hybrid, Surakarta, Indonesia
Duration: 3 Jun 2025 to 4 Jun 2025

Publication series

Name: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025

Conference

Conference: 2025 International Conference on Smart Computing, IoT and Machine Learning, SIML 2025
Country/Territory: Indonesia
City: Hybrid, Surakarta
Period: 3/06/25 to 4/06/25

Keywords

  • cluster-based oversampling
  • ensemble learning
  • fraud detection
  • health insurance
