TY - GEN
T1 - Deletion Attacks on Database Watermarking for Tracing Partial Data Breach
AU - Achmad, Riki Mi'roj
AU - Pratomo, Baskoro Adi
AU - Purwitasari, Diana
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - In the digital era, data breaches and unauthorized data sharing pose significant security risks to organizations. A critical challenge lies in identifying the source of data leaks once they occur. Database watermarking, which embeds hidden ownership information within datasets, has emerged as a promising solution for data leak tracing and ownership verification. While existing research has introduced various watermarking algorithms, there remains a significant gap in understanding how the size of the database influences these algorithms' effectiveness and resilience. This study specifically investigates the relationship between dataset size and watermark robustness under deletion attacks. Through comprehensive experiments across datasets ranging from 2,270 to 581,012 records, we evaluated watermark accuracy under systematic deletion attacks of up to 99% of the data. Our findings reveal that while the watermarking algorithm maintains perfect accuracy for datasets over 500,000 records even under extreme deletion attacks, its performance varies significantly with dataset size. Datasets around 145,000 records maintain high accuracy until 96% deletion, while smaller datasets of approximately 9,000 records show sharp accuracy decline after 60% deletion. These results provide crucial insights for practical watermarking implementation, establishing clear minimum dataset size requirements and highlighting the need for enhanced techniques for smaller datasets.
AB - In the digital era, data breaches and unauthorized data sharing pose significant security risks to organizations. A critical challenge lies in identifying the source of data leaks once they occur. Database watermarking, which embeds hidden ownership information within datasets, has emerged as a promising solution for data leak tracing and ownership verification. While existing research has introduced various watermarking algorithms, there remains a significant gap in understanding how the size of the database influences these algorithms' effectiveness and resilience. This study specifically investigates the relationship between dataset size and watermark robustness under deletion attacks. Through comprehensive experiments across datasets ranging from 2,270 to 581,012 records, we evaluated watermark accuracy under systematic deletion attacks of up to 99% of the data. Our findings reveal that while the watermarking algorithm maintains perfect accuracy for datasets over 500,000 records even under extreme deletion attacks, its performance varies significantly with dataset size. Datasets around 145,000 records maintain high accuracy until 96% deletion, while smaller datasets of approximately 9,000 records show sharp accuracy decline after 60% deletion. These results provide crucial insights for practical watermarking implementation, establishing clear minimum dataset size requirements and highlighting the need for enhanced techniques for smaller datasets.
KW - data breach
KW - database watermarking
KW - national security
UR - https://www.scopus.com/pages/publications/105002277111
U2 - 10.1109/ICADEIS65852.2025.10933282
DO - 10.1109/ICADEIS65852.2025.10933282
M3 - Conference contribution
AN - SCOPUS:105002277111
T3 - ICADEIS 2025 - 2025 International Conference on Advancement in Data Science, E-learning and Information System: Integrating Data Science and Information System, Proceeding
BT - ICADEIS 2025 - 2025 International Conference on Advancement in Data Science, E-learning and Information System
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 International Conference on Advancement in Data Science, E-learning and Information System, ICADEIS 2025
Y2 - 3 February 2025 through 4 February 2025
ER -