Topic modeling Twitter data using Latent Dirichlet Allocation and Latent Semantic Analysis

Siti Qomariyah*, Nur Iriawan, Kartika Fithriasari

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Citations (Scopus)

Abstract

The industrial world has entered the era of Industrial Revolution 4.0. In this era, there is an urgent need for data from the community to support service policies. For this reason, the Surabaya Government created Media Center Surabaya, a channel that collects the aspirations of Surabaya citizens, who can access it through Twitter. The topics discussed on Twitter are important information that can be used to improve the performance of Surabaya Government services. Twitter data are text data consisting of thousands of variables. Text mining, including topic modeling and sentiment analysis, is frequently used to analyze this kind of data. This study works on topic modeling, focusing on two algorithms: Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA). Algorithm performance is evaluated using topic coherence. Because Twitter data are unstructured, they need preprocessing before analysis; the preprocessing stages are cleansing, stemming, and stop-word removal. LSA has the advantages of being fast and easy to implement. However, LSA does not consider the relationship between documents in the corpus, while LDA does. This study shows that LDA gives a better result than LSA.
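
The preprocessing stages and the coherence-based evaluation named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the stop-word list, the toy suffix-stripping stemmer, the sample tweets, and the choice of the UMass measure (the abstract does not say which coherence measure was used). The tweets in the study may also be in a different language, in which case a language-specific stemmer and stop-word list would be needed.

```python
import math
import re

# Hypothetical stop-word list; a real study would use a full list
# for the language of the tweets.
STOP_WORDS = {"the", "a", "is", "in", "to", "of", "and", "on", "at", "for"}


def cleanse(tweet):
    """Lowercase a tweet and strip URLs, mentions, and non-letter characters."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", " ", tweet)  # URLs
    tweet = re.sub(r"@\w+", " ", tweet)          # user mentions
    tweet = re.sub(r"[^a-z\s]", " ", tweet)      # digits, punctuation, '#'
    return tweet.split()


def toy_stem(token):
    """Placeholder stemmer (assumption): strips a few English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Cleansing, then stop-word removal, then stemming."""
    return [toy_stem(t) for t in cleanse(tweet) if t not in STOP_WORDS]


def umass_coherence(top_words, docs):
    """UMass coherence of one topic.

    Sums log((D(w_i, w_j) + 1) / D(w_j)) over word pairs with j < i,
    where D counts the documents containing all the given words.
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            denom = doc_freq(top_words[j])
            if denom == 0:
                continue  # skip words that never occur in the corpus
            joint = doc_freq(top_words[i], top_words[j])
            score += math.log((joint + 1) / denom)
    return score


# Invented sample tweets, for illustration only.
tweets = [
    "Street lights broken again on Main Road @cityhall http://t.co/x",
    "Flooding near the market, roads closed!",
    "Broken street light near the school #report",
]
docs = [preprocess(t) for t in tweets]

# Top words of two hypothetical topics, as a fitted model might return them.
coherent_score = umass_coherence(["street", "light", "broken"], docs)
mixed_score = umass_coherence(["street", "market", "school"], docs)
print(coherent_score, mixed_score)
```

In a comparison like the one the abstract describes, the top words of each topic would come from fitted LDA and LSA models rather than being typed in by hand, and the model whose topics score higher on average would be preferred. Words that co-occur in the same tweets score higher than words that rarely appear together.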

Original language: English
Title of host publication: 2nd International Conference on Science, Mathematics, Environment, and Education
Editors: Nurma Yunita Indriyanti, Murni Ramli, Farida Nurhasanah
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735419452
Publication status: Published - 18 Dec 2019
Event: 2nd International Conference on Science, Mathematics, Environment, and Education, ICoSMEE 2019 - Surakarta, Indonesia
Duration: 26 Jul 2019 to 28 Jul 2019

Publication series

Name: AIP Conference Proceedings
Volume: 2194
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 2nd International Conference on Science, Mathematics, Environment, and Education, ICoSMEE 2019
Country/Territory: Indonesia
City: Surakarta
Period: 26/07/19 to 28/07/19
