Evaluation of the performance of a machine learning algorithms in Swahili-English emails filtering system relative to Gmail classifier

Rashid Abdulla Omar, Aris Tjahyanto

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

3 Citations (Scopus)

Abstract

In recent years, the means of communication have shifted from writing letters and sending them by ordinary post to electronic mail. Unsolicited mail is among the security threats of this new technology, with millions of spam messages distributed daily. Social means of fighting spammers exist but are not effective, so automatic spam filtering algorithms have been introduced to reduce the effects caused by spammers. This paper compares three algorithms run with the machine learning (ML) tool Waikato Environment for Knowledge Analysis (WEKA) on an English-Swahili dataset that the author created; the results are then compared with Gmail results that are calculated manually. The algorithms are Naïve Bayes, Sequential Minimal Optimization (SMO) and J48. The findings show that SMO gives the best result, with an accuracy of 93.23%, followed by Naïve Bayes at 88.47% and J48 at 87.22%. All three algorithms also outperform the Gmail filter, which has 86.26% accuracy.
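The abstract does not say how the accuracy figures were computed. As a minimal sketch, assuming accuracy here means the share of correctly classified emails (spam and legitimate) out of all emails, the manual Gmail calculation and the comparison of the reported figures could look like the following. The function name `accuracy` and its count parameters are hypothetical; only the four percentages come from the abstract.

```python
def accuracy(tp, tn, fp, fn):
    """Return classification accuracy as a percentage.

    tp/tn: spam and legitimate emails classified correctly.
    fp/fn: legitimate emails flagged as spam, and spam that was missed.
    (Assumed definition; the abstract does not spell it out.)
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no emails were classified")
    return 100.0 * (tp + tn) / total


# Accuracies reported in the abstract, in percent.
reported = {
    "SMO": 93.23,
    "Naive Bayes": 88.47,
    "J48": 87.22,
    "Gmail": 86.26,
}

# Rank classifiers from most to least accurate.
ranking = sorted(reported, key=reported.get, reverse=True)

# Percentage-point margin of each WEKA classifier over the Gmail filter.
margin_over_gmail = {
    name: round(acc - reported["Gmail"], 2)
    for name, acc in reported.items()
    if name != "Gmail"
}
```

With purely hypothetical counts, `accuracy(tp=40, tn=50, fp=5, fn=5)` gives 90.0. On the reported figures, SMO leads the Gmail filter by 6.97 percentage points, Naïve Bayes by 2.21 and J48 by 0.96.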

Original language: English
Title of host publication: 2018 International Conference on Information and Communications Technology, ICOIACT 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 266-269
Number of pages: 4
ISBN (Electronic): 9781538609545
Publication status: Published - 26 Apr 2018
Event: 1st International Conference on Information and Communications Technology, ICOIACT 2018 - Yogyakarta, Indonesia
Duration: 6 Mar 2018 - 7 Mar 2018

Publication series

Name: 2018 International Conference on Information and Communications Technology, ICOIACT 2018
Volume: 2018-January

Conference

Conference: 1st International Conference on Information and Communications Technology, ICOIACT 2018
Country/Territory: Indonesia
City: Yogyakarta
Period: 6/03/18 - 7/03/18

Keywords

  • Gmail
  • J48
  • Naïve Bayes
  • SMO
  • Swahili
  • WEKA
  • spam
