Text Clustering of Tweets Categories on PT. Transportasi Jakarta Official Account

Gabriella Varitie Sentosa Rachmat, Irhamah*, Kartika Fithriasari

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Because of the large number of private cars on the streets of Jakarta, traffic congestion occurs regularly, which prompted the Provincial Government to establish TransJakarta. TransJakarta users often ask questions, file complaints, or make suggestions to TransJakarta via Twitter. To help TransJakarta respond to tweets more easily and quickly, it is vital to understand the categories of tweets. To this end, tweet categories were determined using data collected from the Twitter API. Text preprocessing was performed first, followed by weighting each word using Term Frequency-Inverse Document Frequency (TF-IDF). In addition, a Genetic Algorithm (GA) was proposed for feature selection. K-means and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) were compared based on the silhouette coefficient to determine the categories of tweets, which were then visualized using word clouds. The clustering results show that the best method is DBSCAN with GA-based feature selection, because it produces a high silhouette coefficient value with less noise than without GA-based feature selection. Clustering yielded four categories of tweets: bus stop/route, bus facilities, bus cleanliness, and TransJakarta's consistency.
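The weighting and evaluation steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the tokenized tweets are invented, the TF-IDF variant (relative term frequency times ln(N/df)) and the cosine distance are assumptions, and the two label assignments are set by hand rather than produced by K-means or DBSCAN. GA-based feature selection is not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def silhouette(vectors, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(cosine_distance(vectors[i], vectors[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(cosine_distance(vectors[i], vectors[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, already-preprocessed tweets (illustration only).
tweets = [
    ["bus", "stop", "route"],
    ["bus", "route", "stop", "closed"],
    ["seat", "dirty", "bus"],
    ["dirty", "floor", "bus"],
]
vectors = tfidf(tweets)

# Two hand-assigned candidate clusterings to compare.
score_a = silhouette(vectors, [0, 0, 1, 1])  # route tweets vs. cleanliness tweets
score_b = silhouette(vectors, [0, 1, 0, 1])  # topics mixed across clusters
print(score_a > score_b)
```

In this toy corpus "bus" occurs in every tweet, so its weight is zero under the assumed variant and it plays no role in the distances. The labeling that groups tweets by topic receives the higher silhouette coefficient, which is the kind of comparison the abstract describes between clustering methods.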

Original language: English
Title of host publication: Proceedings of the International Conference on Advanced Technology and Multidiscipline, ICATAM 2021
Subtitle of host publication: "Advanced Technology and Multidisciplinary Prospective Towards Bright Future" Faculty of Advanced Technology and Multidiscipline
Editors: Prihartini Widiyanti, Prastika Krisma Jiwanti, Gunawan Setia Prihandana, Ratih Ardiati Ningrum, Rizki Putra Prastio, Herlambang Setiadi, Intan Nurul Rizki
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735444423
Publication status: Published - 19 May 2023
Event: 1st International Conference on Advanced Technology and Multidiscipline: Advanced Technology and Multidisciplinary Prospective Towards Bright Future, ICATAM 2021 - Virtual, Online
Duration: 13 Oct 2021 to 14 Oct 2021

Publication series

Name: AIP Conference Proceedings
Volume: 2536
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 1st International Conference on Advanced Technology and Multidiscipline: Advanced Technology and Multidisciplinary Prospective Towards Bright Future, ICATAM 2021
City: Virtual, Online
Period: 13/10/21 to 14/10/21
