Research Mapping Based on Title Extraction Using Dependency Parser and K-Means Clustering

Adhi Dharma Wibawa, Prio Adi Ramadhani, Alfonsus Haryo Sangaji

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Research is one of the core activities of a university. Thousands of research articles are produced every year to disseminate the knowledge resulting from research activities. All data on these articles are stored on the university server, but without meaningful processing they remain an ever-growing pile of data. In this study, the data are processed for research mapping. Research topics or research areas are reflected in the terms authors use in research article titles. Research terms are extracted automatically using a natural language processing technique, namely a dependency parser. To cluster the research terms, the words are first embedded using the Word2Vec model, so that word features are generated in the form of a Vector Space Model. The word features are then used as input to the K-Means clustering algorithm. The clustering results are evaluated based on the share of negative ("minus") silhouette values for several values of K. The optimal number of clusters in this study was seven, with 8.29% negative silhouette values. This optimal K is also confirmed through scatter plot visualization. Finally, the clusters are labeled based on topic modeling of each cluster using the Latent Dirichlet Allocation model.
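The evaluation criterion named in the abstract can be illustrated with a minimal sketch. It assumes that a "minus silhouette" means a per-sample silhouette coefficient below zero, that distances are Euclidean, and that a sample alone in its cluster scores 0; the abstract states none of these details. The function names and the toy 2-D points (stand-ins for Word2Vec vectors) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import dist


def silhouette_values(points, labels):
    """Per-sample silhouette s = (b - a) / max(a, b).

    a: mean distance to the other members of the sample's own cluster.
    b: smallest mean distance to the members of any other cluster.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            values.append(0.0)  # assumed convention for singleton clusters
            continue
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def negative_silhouette_pct(points, labels):
    """Percentage of samples whose silhouette coefficient is below zero."""
    values = silhouette_values(points, labels)
    return 100.0 * sum(v < 0 for v in values) / len(values)


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # assignment that matches the groups
bad_labels = [0, 0, 1, 1, 1, 0]   # two points swapped between groups

pct_good = negative_silhouette_pct(points, good_labels)
pct_bad = negative_silhouette_pct(points, bad_labels)
print(f"good: {pct_good:.2f}%  bad: {pct_bad:.2f}%")
```

Under these assumptions, a lower percentage indicates fewer samples that sit closer to a neighboring cluster than to their own, so comparing the percentage across several candidate values of K gives one way to choose among them.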

Original language: English
Title of host publication: Proceedings - 2022 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
Editors: Teguh Prakoso, Munawar Agus Riyadi, M. Arfan, Yosua Alvin Adi Soetrisno, Hadha Afrisal
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 287-292
Number of pages: 6
ISBN (Electronic): 9781665471480
Publication status: Published - 2022
Event: 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022 - Semarang, Indonesia
Duration: 25 Aug 2022 to 26 Aug 2022

Publication series

Name: Proceedings - 2022 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022

Conference

Conference: 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
Country/Territory: Indonesia
City: Semarang
Period: 25/08/22 to 26/08/22

Keywords

  • dependency parser
  • k-means
  • research mapping
  • research topics
