Text summarization of Indonesian folklore with word frequency concept

Luh Gede Surya Kartika, Komang Rinartha, Daniel Siahaan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Storytelling is an enjoyable activity, especially when telling stories to children, and the stories told can shape a child's character. Stories found on the internet may be long or short. A long story is harder to tell to children, so a summary is more useful in that case. Summaries are easier to produce with a computer, which calls for an application that automatically provides a short overview of a folklore text. In this work, Indonesian folklore is summarized in several steps by a web application developed in the Python programming language. Summarization begins by taking the text entered by the user and removing all special characters. The result is then tokenized to obtain a set of words. Words that belong to the stop word list are removed, and the frequency of each remaining word is counted. Each sentence is then scored by accumulating the frequencies of the words it contains, and the sentences are sorted by this score. Lastly, the selected sentences are reordered according to their position in the original text and joined into a single paragraph. The result of this research is that text summarization can be carried out by a Python application delivered as a website. The user can select the number of sentences in the summary. The application was applied to several Indonesian folklore texts, and the resulting summaries were rated by readers. The application achieved 87% reader acceptance, a result close to that of several existing online websites that generate text summaries.
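
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the tiny stop word list, the regex-based sentence splitter, and the tie-breaking rule are all assumptions made for the example. It also assumes that a word repeated within a sentence contributes its frequency each time it appears, which the abstract does not specify.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list; the paper's actual list is not given.
STOP_WORDS = {"yang", "dan", "di", "ke", "dari", "itu", "ini", "pada", "dengan", "untuk"}


def split_sentences(text):
    """Naively split text into sentences after '.', '!' or '?' (an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text):
    """Lowercase, replace special characters with spaces, split on whitespace."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return cleaned.split()


def summarize(text, num_sentences=2):
    """Return the top-scoring sentences, restored to their original order."""
    sentences = split_sentences(text)

    # Count word frequency over the whole text, ignoring stop words.
    freq = Counter(w for w in tokenize(text) if w not in STOP_WORDS)

    # Score each sentence by accumulating the frequencies of its words.
    scored = []
    for idx, sentence in enumerate(sentences):
        score = sum(freq[w] for w in tokenize(sentence) if w not in STOP_WORDS)
        scored.append((score, idx))

    # Highest score first; ties go to the earlier sentence (an assumption).
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:num_sentences]

    # Put the chosen sentences back in original order and join them.
    chosen = sorted(idx for _, idx in top)
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    story = (
        "Malin Kundang pergi merantau. "
        "Ibu Malin Kundang menunggu Malin Kundang di pantai. "
        "Angin bertiup kencang. "
        "Malin Kundang menjadi batu."
    )
    print(summarize(story, num_sentences=2))
```

The `num_sentences` parameter mirrors the user-selectable summary length mentioned in the abstract. A real deployment would likely need a fuller Indonesian stop word list and a more robust sentence splitter than this sketch uses.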

Original language: English
Title of host publication: EECCIS 2020 - 2020 10th Electrical Power, Electronics, Communications, Controls, and Informatics Seminar
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 259-262
Number of pages: 4
ISBN (Electronic): 9781728171098
Publication status: Published - 26 Aug 2020
Externally published: Yes
Event: 10th Electrical Power, Electronics, Communications, Controls, and Informatics Seminar, EECCIS 2020 - Malang, Indonesia
Duration: 26 Aug 2020 to 28 Aug 2020

Publication series

Name: EECCIS 2020 - 2020 10th Electrical Power, Electronics, Communications, Controls, and Informatics Seminar

Conference

Conference: 10th Electrical Power, Electronics, Communications, Controls, and Informatics Seminar, EECCIS 2020
Country/Territory: Indonesia
City: Malang
Period: 26/08/20 to 28/08/20

Keywords

  • Indonesian folklore
  • Python
  • Summarization
  • Word frequency
