Sriwijaya Journal of Informatics and Applications
Vol 1, No 1 (2020)

Effect of N-Gram on Document Classification on the Naïve Bayes Classifier Algorithm

Fitria Khoirunnisa (Department of Informatics Engineering, Faculty of Computer Science, Sriwijaya University)
Novi Yusliani, M.T. (Department of Informatics Engineering, Faculty of Computer Science, Sriwijaya University)
Desty Rodiah, M.T. (Department of Informatics Engineering, Faculty of Computer Science, Sriwijaya University)



Article Info

Publish Date
11 Aug 2020

Abstract

News has become a major need for everyone; through news we obtain the information we need. News can be distributed through print mass media, electronic mass media, and online media. The means of distributing news have grown very rapidly, so the amount of information to be managed has become larger and the number of words to be classified is also not small. Therefore, a system is needed for classifying unstructured documents. In this study, the words in a document are processed with N-Gram as the feature generation step. Document classification is carried out using the Naïve Bayes Classifier algorithm. This study examines the effect of N-Gram on document classification with the Naïve Bayes Classifier algorithm. The document classification accuracy is 32.68% with N-Gram applied and 84.97% without it. The decrease in classification results occurs because of the number of features produced by N-Gram splitting that are unique to, or dominant in, another category. The accuracy results show that applying N-Gram to document classification with the Naïve Bayes Classifier algorithm reduces classification performance.
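
To make the two stages named in the abstract concrete, here is a minimal Python sketch of N-Gram feature generation followed by a Naïve Bayes classifier. It is an illustration only, not the authors' implementation. It assumes word-level N-Grams, whitespace tokenization, a multinomial model with add-one (Laplace) smoothing, and an invented four-document toy corpus. The names `word_ngrams`, `NaiveBayes`, `train_docs`, and `train_labels` are hypothetical, and the abstract does not state the N-Gram level, the value of N, or the smoothing scheme used in the study.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n=1):
    """Return the word-level n-grams of a text (assumption: whitespace tokens)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def __init__(self, n=1):
        self.n = n  # n-gram size; n=1 means plain words (no N-Gram splitting)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()                           # all n-grams seen in training
        for doc, label in zip(docs, labels):
            feats = word_ngrams(doc, self.n)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, doc):
        feats = word_ngrams(doc, self.n)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in feats:
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy corpus, invented for illustration; not the data used in the study.
train_docs = [
    "the team won the match",
    "the striker scored a goal",
    "the market fell sharply",
    "the bank raised interest rates",
]
train_labels = ["sport", "sport", "economy", "economy"]

unigram_model = NaiveBayes(n=1).fit(train_docs, train_labels)
bigram_model = NaiveBayes(n=2).fit(train_docs, train_labels)
print(unigram_model.predict("the team scored a goal"))
print(bigram_model.predict("the team scored a goal"))
```

Larger values of `n` produce many more distinct features, each seen less often, so the per-class counts become sparse. This sketch does not reproduce the accuracy figures reported above.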

Copyrights © 2020






Journal Info

Description

Sriwijaya Journal of Informatics and Applications (SJIA) is a scientific periodical publishing research articles from the Informatics Department, Universitas Sriwijaya. This journal is an open access journal for scientists and engineers in the informatics and applications area that provides online publication (two ...