Combination of DenseNet and BiLSTM Model for Indonesian Image Captioning

Dini Adni Navastara*, Dwinanda Bagoes Ansori, Nanik Suciati, Zulfiqar Fauzul Akbar

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Humans can capture images of their surroundings with a camera, but the camera cannot turn those images into representative information. Feature extraction is carried out on an image to identify the objects it contains, and these objects can be turned into information through image captioning. This requires a model trained with machine learning to transform a collection of objects into informative words. Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (BiLSTM) are models that can retain information stored over a long period while discarding irrelevant information. The datasets used are Flickr30k and an original dataset taken at several sidewalk points in Surabaya. Training on these datasets produces an image captioning model, which is evaluated using the BLEU score to measure the degree of correspondence between the model's caption and the original caption. The results showed that the best model was trained in Indonesian, with DenseNet-201 as the feature extractor (encoder) and a decoder consisting of one LSTM layer and two BiLSTM layers with attention, tanh activation, and the Adam optimizer, achieving BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of 0.518, 0.320, 0.165, and 0.080, respectively.
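
To make the evaluation metric concrete, the following is a minimal sketch of how a BLEU-n score can be computed for a single caption. It is not the authors' evaluation code. It assumes cumulative BLEU with uniform n-gram weights, sentence-level scoring, whitespace tokenization, and no smoothing; the abstract does not say which variant was used. The function names and the Indonesian sample captions are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of the candidate against the references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalize candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties are broken toward the shorter reference.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n):
    """Cumulative BLEU-max_n with uniform weights (assumed variant, unsmoothed)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)


# Hypothetical model caption and reference caption (illustrative only).
candidate = "seorang pria berjalan di trotoar".split()
references = ["seorang pria sedang berjalan di trotoar".split()]

for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, references, n):.3f}")
```

Because each higher-order score multiplies in a stricter n-gram precision, scores under this variant cannot increase from BLEU-1 to BLEU-4, which matches the downward trend of the values reported in the abstract.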

Original language: English
Title of host publication: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 994-999
Number of pages: 6
ISBN (Electronic): 9798350309225
Publication status: Published - 2023
Event: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Lombok, Indonesia
Duration: 14 Nov 2023 → 15 Nov 2023

Publication series

Name: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings

Conference

Conference: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023
Country/Territory: Indonesia
City: Lombok
Period: 14/11/23 → 15/11/23

Keywords

  • BLEU
  • BiLSTM
  • DenseNet
  • Image Captioning
