TY - GEN
T1 - Comparison of sampling methods for handling imbalance data in deep learning-based predictions of chest X-ray abnormality tags
AU - Tsaniya, Hilya
AU - Fatichah, Chastine
AU - Suciati, Nanik
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/5/12
Y1 - 2023/5/12
N2 - Radiologists detect lung diseases based on abnormalities in chest X-ray images. Advances in computer vision for medical imaging make it easier for professionals to analyze clinical observations. However, existing public datasets suffer from imbalanced abnormality data, and existing solutions limit the abnormalities to a few common diseases. This article explores methods for handling imbalance in chest X-ray abnormality labels by grouping minority and majority labels based on quartile values. Using the Indiana University data, with 122 unique labels extracted from patient reports using a medical indexer, we compare several sampling methods for reducing data imbalance. Each sampling method is also combined with several neural network classifier models to predict abnormality tags from X-ray images. In the experiments, the Remedial sampling method gave the best reduction in imbalance, with MIR 31 and SCUMBLE 0.10. Remedial also gave the best results on average when combined with the classifiers; the best combination, with VGG16, achieved an accuracy of 48% for abnormality label prediction, with increases of 51% in F1 score, 53% in precision, and 63% in recall compared with no Remedial sampling.
AB - Radiologists detect lung diseases based on abnormalities in chest X-ray images. Advances in computer vision for medical imaging make it easier for professionals to analyze clinical observations. However, existing public datasets suffer from imbalanced abnormality data, and existing solutions limit the abnormalities to a few common diseases. This article explores methods for handling imbalance in chest X-ray abnormality labels by grouping minority and majority labels based on quartile values. Using the Indiana University data, with 122 unique labels extracted from patient reports using a medical indexer, we compare several sampling methods for reducing data imbalance. Each sampling method is also combined with several neural network classifier models to predict abnormality tags from X-ray images. In the experiments, the Remedial sampling method gave the best reduction in imbalance, with MIR 31 and SCUMBLE 0.10. Remedial also gave the best results on average when combined with the classifiers; the best combination, with VGG16, achieved an accuracy of 48% for abnormality label prediction, with increases of 51% in F1 score, 53% in precision, and 63% in recall compared with no Remedial sampling.
KW - Multi-label classification
KW - imbalance handling
KW - medical image
KW - radiograph image
UR - http://www.scopus.com/inward/record.url?scp=85178056863&partnerID=8YFLogxK
U2 - 10.1145/3608298.3608300
DO - 10.1145/3608298.3608300
M3 - Conference contribution
AN - SCOPUS:85178056863
T3 - ACM International Conference Proceeding Series
SP - 6
EP - 10
BT - ICMHI 2023 - 2023 the 7th International Conference on Medical and Health Informatics
PB - Association for Computing Machinery
T2 - 7th International Conference on Medical and Health Informatics, ICMHI 2023
Y2 - 12 May 2023 through 14 May 2023
ER -