International Journal of Data Science, Engineering, and Analytics (IJDASEA)
Vol. 2 No. 1 (2022)

Metric Comparison For Text Classification

Amri Muhaimin (Universitas Pembangunan Nasional "Veteran" Jawa Timur)
Tresna Maulana Fahrudin (UPN "Veteran" Jawa Timur)
Trimono (UPN "Veteran" Jawa Timur)
Prismahardi Aji Riyantoko (UPN "Veteran" Jawa Timur)
Kartika Maulida Hindrayani (UPN "Veteran" Jawa Timur)



Article Info

Publish Date
28 May 2022

Abstract

Text classification has become popular in recent years. The first step in classifying text is to convert it into numerical values. Candidate values include Term Frequency (TF), Inverse Document Frequency (IDF), Term Frequency-Inverse Document Frequency (TF-IDF), and the raw frequency of each word. This study aims to determine which of these metrics works best for text classification. The classifiers used are Naïve Bayes, Logistic Regression, and Random Forest, and they are evaluated by accuracy and Area Under the Curve (AUC). The results show that some metrics produce similar evaluation scores. Random Forest performs best among the three methods, and the best metric for text classification is TF-IDF.
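As a rough illustration of the four text representations named in the abstract, the sketch below computes them for a toy corpus in plain Python. The abstract does not give the formulas the authors used, so the definitions here are assumptions: TF is the raw count divided by document length, and IDF is the unsmoothed natural logarithm of N / df. The corpus and the function names (`raw_count`, `tf`, `idf`, `tf_idf`) are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

# Hypothetical toy corpus for illustration; not data from the study.
docs = [
    "good movie good plot",
    "bad movie",
    "good acting",
]

# Whitespace tokenization (an assumption; the paper's preprocessing is not stated here).
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: number of documents in which each term appears at least once.
df = Counter(term for tokens in tokenized for term in set(tokens))


def raw_count(tokens):
    """Raw frequency of each word in one document."""
    return dict(Counter(tokens))


def tf(tokens):
    """Term frequency: raw count normalized by document length (assumed definition)."""
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}


def idf(term):
    """Inverse document frequency: unsmoothed log(N / df) (assumed definition)."""
    return math.log(n_docs / df[term])


def tf_idf(tokens):
    """TF-IDF weight of each term in one document."""
    return {term: weight * idf(term) for term, weight in tf(tokens).items()}


# One sparse feature dictionary per document.
features = [tf_idf(tokens) for tokens in tokenized]
```

Each dictionary in `features` could then be turned into a fixed-length vector and passed to a classifier such as Naïve Bayes, Logistic Regression, or Random Forest. Libraries often use smoothed IDF variants, so their numeric values may differ from this sketch.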

Copyrights © 2022






Journal Info

Abbrev

ijdasea

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management

Description

Focus and Scope: The International Journal of Data Science, Engineering, and Analytics (IJDASEA) publishes original papers in the field of computer science, covering the following scope: 1. Theoretical Foundations: Probabilistic and Statistical Models and Theories; Optimization Methods; Data ...