Bulletin of Electrical Engineering and Informatics
Vol 10, No 1: February 2021

Visualizing stemming techniques on online news articles text analytics

Nurul Atiqah Razmi (Universiti Teknologi MARA)
Muhammad Zharif Zamri (Universiti Teknologi MARA)
Sharifah Syafiera Syed Ghazalli (Universiti Teknologi MARA)
Noraini Seman (Universiti Teknologi MARA)



Article Info

Publish Date
01 Feb 2021

Abstract

Stemming is the process of reducing words to their root forms by means of a stemming algorithm. It is one of the main steps in text analytics, where text data must be stemmed before further analysis. Text analytics is now widely used to analyze the content of text data from various sources, such as mass media and social media. In this study, two stemming techniques, Porter and Lancaster, are evaluated. The differences in the outputs produced by the two techniques are discussed in terms of stemming error and the resulting visualization. The findings show that Porter stemming performs better than Lancaster stemming, by 43%, based on the stemming error produced. Stemmed text data can still be visualized, but tool users need some background understanding of the text data to interpret the visualization outputs correctly.
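
As a rough illustration of comparing two stemmers by stemming error, the sketch below counts how often a stemmer's output differs from a hand-labelled root. Everything in it is an assumption made for illustration: the abstract does not define how stemming error was measured, the word list and expected roots are hypothetical rather than data from the study, and `light_stemmer` and `aggressive_stemmer` are toy suffix strippers standing in for a conservative and an aggressive algorithm, not implementations of Porter or Lancaster.

```python
from typing import Callable, Dict

# Hypothetical gold standard (word -> expected root); not data from the study.
GOLD_ROOTS: Dict[str, str] = {
    "reported": "report",
    "markets": "market",
    "minister": "minister",
    "police": "police",
    "national": "nation",
}


def strip_first_suffix(word: str, suffixes: tuple, min_stem: int) -> str:
    """Remove the first matching suffix if at least min_stem characters remain."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def light_stemmer(word: str) -> str:
    """Toy conservative stemmer (stand-in only, not the Porter algorithm)."""
    return strip_first_suffix(word, ("ing", "ed", "s"), min_stem=3)


def aggressive_stemmer(word: str) -> str:
    """Toy aggressive stemmer (stand-in only, not the Lancaster algorithm)."""
    return strip_first_suffix(
        word, ("ation", "ing", "ed", "er", "al", "s", "e"), min_stem=2
    )


def stemming_error_rate(stem: Callable[[str], str], gold: Dict[str, str]) -> float:
    """Fraction of words whose stem differs from the expected root."""
    errors = sum(1 for word, root in gold.items() if stem(word) != root)
    return errors / len(gold)


light_error = stemming_error_rate(light_stemmer, GOLD_ROOTS)
aggressive_error = stemming_error_rate(aggressive_stemmer, GOLD_ROOTS)
```

Because `stemming_error_rate` accepts any callable that maps a word to its stem, the toy functions could be swapped for a real stemmer's method, for example the `stem` method of NLTK's `PorterStemmer` or `LancasterStemmer`, provided the gold roots are labelled to match the stem forms expected from those algorithms.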

Copyright © 2021






Journal Info

Abbrev

EEI

Subject

Electrical & Electronics Engineering

Description

Bulletin of Electrical Engineering and Informatics (Buletin Teknik Elektro dan Informatika) ISSN: 2089-3191, e-ISSN: 2302-9285 is open to submission from scholars and experts in the wide areas of electrical, electronics, instrumentation, control, telecommunication and computer engineering from the ...