Normalization of Unstructured Indonesian Tweet Text for Presidential Candidates Sentiment Analysis

  • Taufikur Rahman
  • Fenty Eka Muzayyana Agustin
  • Nurul Faizah Rozy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

Indonesian tweets contain a large amount of unstructured text. This research proposes a pre-processing procedure for cleaning abnormal text from tweets. First, we apply the common pre-processing tasks (case folding, filtering, tokenizing). Second, we apply normalization: every word in a document that has excess letters, is abbreviated, coincides with another word, or is slang is converted into its standard form, and any word or letter that carries no meaning is deleted. Once the text is in normal form, stopword removal and stemming are carried out. The data are taken from 2018 and 2019. The highest accuracy (81%) was obtained on the 2018 data, with 425 training tweets and 100 test tweets, and the positive sentiment toward Prabowo's electability was 52%. This result means that Prabowo deserves to put himself forward as a presidential candidate.
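The pre-processing order described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SLANG` and `STOPWORDS` tables, the function names, and the rule of collapsing runs of three or more identical letters are all assumptions made for illustration, and stemming is left out because it needs a language-specific stemmer.

```python
import re

# Hypothetical lookup tables; the paper's actual lexicon is not reproduced here.
SLANG = {"yg": "yang", "gk": "tidak", "tdk": "tidak", "dgn": "dengan", "bgt": "banget"}
STOPWORDS = {"yang", "dan", "di", "dengan"}


def case_fold(text):
    """Case folding: lowercase the whole tweet."""
    return text.lower()


def filter_text(text):
    """Filtering: drop URLs, mentions, hashtags, then anything that is not a letter."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[@#]\w+", " ", text)
    return re.sub(r"[^a-z\s]", " ", text)


def tokenize(text):
    """Tokenizing: split on whitespace."""
    return text.split()


def squeeze_repeats(token):
    """Excess letters: collapse a run of 3+ identical letters to one (assumed rule)."""
    return re.sub(r"(.)\1{2,}", r"\1", token)


def normalize(tokens):
    """Map each token to its standard form and drop single leftover letters."""
    result = []
    for token in tokens:
        token = squeeze_repeats(token)
        token = SLANG.get(token, token)  # abbreviation or slang -> standard word
        if len(token) > 1:               # a lone letter carries no meaning
            result.append(token)
    return result


def remove_stopwords(tokens):
    """Stopword removal, applied only after normalization."""
    return [t for t in tokens if t not in STOPWORDS]


def preprocess(tweet):
    tokens = tokenize(filter_text(case_fold(tweet)))
    return remove_stopwords(normalize(tokens))


print(preprocess("Prabowo mantaaap bgt!!! yg lain gk jelas http://example.com @user #pilpres"))
# ['prabowo', 'mantap', 'banget', 'lain', 'tidak', 'jelas']
```

Running normalization before stopword removal matters in this sketch: the abbreviation "yg" only matches the stopword list after it has been expanded to "yang".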

Original language: English
Title of host publication: 2019 7th International Conference on Cyber and IT Service Management, CITSM 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728129099
DOIs
Publication status: Published - Nov 2019
Externally published: Yes
Event: 7th International Conference on Cyber and IT Service Management, CITSM 2019 - Jakarta, Indonesia
Duration: 6 Nov 2019 to 7 Nov 2019

Publication series

Name: 2019 7th International Conference on Cyber and IT Service Management, CITSM 2019

Conference

Conference: 7th International Conference on Cyber and IT Service Management, CITSM 2019
Country/Territory: Indonesia
City: Jakarta
Period: 6/11/19 to 7/11/19

Keywords

  • Confusion Matrix
  • Lexicon
  • Naïve Bayes Classifier
  • Sentiment Analysis
  • Twitter
  • opinion mining
