Opinion Spam Detection in Product Reviews Using Self-Training Semi-Supervised Learning Approach

Dini Adni Navastara, Ana Alimatus Zaqiyah, Chastine Fatichah

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

7 Citations (Scopus)

Abstract

A product review can influence a buyer's decision to purchase the product. Fake reviews not only sway these decisions but also confuse buyers who are looking for product information in honest, genuine reviews. A system that filters spam is therefore needed to reduce its negative influence on product sales and on review writing. The spam detected here is of two types: brand-only spam and non-reviews. These types receive their initial labels through manual labeling, which requires considerable time and effort. Therefore, in this paper, we propose a self-training semi-supervised learning approach. This method labels spam using predictions from a model trained on the labeled data. The best results were obtained in a scenario without stemming, combining review-centric features with bigrams, using SMOTE borderline1 oversampling and a polynomial SVM kernel, which achieved an accuracy of 86.33%.
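The abstract does not spell out the training procedure, so the following is only a minimal sketch of a generic self-training loop over word-bigram features, not the authors' implementation. To keep it dependency-free, a cosine nearest-centroid classifier stands in for the polynomial-kernel SVM, and review-centric features and SMOTE oversampling are omitted. The names `CentroidClassifier`, `self_train`, the `threshold` value, and the toy reviews are all assumptions made for illustration.

```python
from collections import Counter
import math


def bigrams(text):
    """Word-bigram counts, lowercased, with no stemming."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class CentroidClassifier:
    """Hypothetical stand-in for the SVM: one summed bigram vector per class."""

    def fit(self, X, y):
        self.centroids = {}
        for feats, label in zip(X, y):
            self.centroids.setdefault(label, Counter()).update(feats)
        return self

    def predict_with_confidence(self, feats):
        # Confidence is the margin between the best and second-best class score.
        scores = {lab: cosine(feats, c) for lab, c in self.centroids.items()}
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best_label, best = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
        return best_label, best - runner_up


def self_train(labeled, unlabeled, threshold=0.2, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled reviews and retrain."""
    X = [bigrams(text) for text, _ in labeled]
    y = [label for _, label in labeled]
    pool = list(unlabeled)
    model = CentroidClassifier().fit(X, y)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for text in pool:
            feats = bigrams(text)
            label, confidence = model.predict_with_confidence(feats)
            if confidence >= threshold:
                # Accept the prediction as a pseudo-label for the next round.
                X.append(feats)
                y.append(label)
                added += 1
            else:
                remaining.append(text)
        pool = remaining
        if added == 0:
            break  # nothing confident enough was found; stop early
        model = CentroidClassifier().fit(X, y)
    return model, pool


# Toy data, invented for illustration only.
labeled = [
    ("buy brand x now best brand x deals", "spam"),
    ("the battery life is great and lasts two days", "non-spam"),
]
unlabeled = [
    "best brand x deals buy brand x now today",
    "the battery life is great for travel",
    "completely unrelated words here",
]
model, leftover = self_train(labeled, unlabeled)
```

Reviews whose confidence never reaches the threshold stay in `leftover` and are never pseudo-labeled. This is the usual safeguard in self-training against reinforcing early mistakes.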

Original language: English
Title of host publication: 2019 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2019 - Proceeding
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 169-173
Number of pages: 5
ISBN (Electronic): 9781728130903
Publication status: Published - Oct 2019
Event: 2019 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2019 - Batu-Malang, Indonesia
Duration: 9 Oct 2019 - 10 Oct 2019

Publication series

Name: 2019 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2019 - Proceeding

Conference

Conference: 2019 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2019
Country/Territory: Indonesia
City: Batu-Malang
Period: 9/10/19 - 10/10/19

Keywords

  • Oversampling SMOTE
  • Review Centric Features
  • Self Training
  • Semi-Supervised Learning
  • Support Vector Machine
  • bigram
