Garuda - Garba Rujukan Digital

P-Index (2020 - 2025): 4.209

This author has published in these journals:

International Journal of Electrical and Computer Engineering
Jurnal Pendidikan Teknologi dan Kejuruan
Jurnal Teknologi dan Sistem Komputer
Sinkron : Jurnal dan Penelitian Teknik Informatika
JUTIK : Jurnal Teknologi Informasi dan Komputer
Jusikom : Jurnal Sistem Komputer Musirawas
Jurnal Informatika Universitas Pamulang
MATRIK : Jurnal Manajemen, Teknik Informatika, dan Rekayasa Komputer
SISFO
Jurnal RESISTOR (Rekayasa Sistem Komputer)
WIDYABHAKTI Jurnal Ilmiah Populer
J-SAKTI (Jurnal Sains Komputer dan Informatika)
Building of Informatics, Technology and Science
Journal of Electrical, Electronics and Informatics
Indonesian Journal of Data and Science
Tematik : Jurnal Teknologi Informasi Komunikasi
JPM: JURNAL PENGABDIAN MASYARAKAT
International Journal of Engineering, Science and Information Technology
PROFICIO: Jurnal Pengabdian Masyarakat
JURNAL ELEKTRO DAN INFORMATIKA SWADHARMA (JEIS)
Jurnal WIDYA LAKSMI (Jurnal Pengabdian Kepada Masyarakat)
Jurnal Pengabdian Masyarakat Tapis Berseri
Journal of Social Work and Empowerment
Darma Abdi Karya: Jurnal Pengabdian Kepada Masyarakat
JGGAG (Journal of Games, Game Art, and Gamification)

Christina Purnama Yanti

STMIK STIKOM Indonesia

Author-ID : 864385


Published : 31 Documents

Articles


Evaluation Analysis of the Necessity of Stemming and Lemmatization in Text Classification
Saraswati, Ni Wayan Sumartini; Yanti, Christina Purnama; Muku, I Dewa Made Krishna; Dewi, Dewa Ayu Putu Rasmika
MATRIK : Jurnal Manajemen, Teknik Informatika dan Rekayasa Komputer Vol. 24 No. 2 (2025)
Publisher : Universitas Bumigora

DOI: 10.30812/matrik.v24i2.4833

Stemming and lemmatization are text preprocessing methods that reduce words to their root form and to their canonical (dictionary) form, respectively. Some previous studies report that stemming and lemmatization worsen the performance of text classification models, while others report a positive impact. This study analyzes the impact of stemming and lemmatization on text classification with the support vector machine (SVM) method, on both English and Indonesian text datasets, and examines when these methods should be used. The research process consisted of several stages: text preprocessing using stemming and lemmatization, feature extraction with Term Frequency-Inverse Document Frequency (TF-IDF), classification using SVM, and model evaluation across 4 experiment scenarios. The experimental results show that stemming generally degrades the performance of the text classification model, especially on large and unbalanced datasets. Stemming had the shortest computation time, completing in 4 hours, 51 minutes, and 41.3 seconds on the largest dataset. Lemmatization improves classification performance on small datasets, reaching 91.075% accuracy, but has the longest computation time, especially on large datasets, where it took 5 hours, 10 minutes, and 25.2 seconds. The results also show that stemming on the balanced Indonesian dataset yields better text classification performance, reaching 82.080% accuracy.
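
The stages named in the abstract (preprocessing, TF-IDF feature extraction, SVM classification) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the suffix-stripping `toy_stem` function, the four-sentence corpus, the smoothed IDF formula, and the linear SVM trained by subgradient descent on the hinge loss. The abstract does not say which stemmer, lemmatizer, TF-IDF variant, or SVM kernel the authors used, so this is not a reproduction of their method.

```python
import math
from collections import Counter

# Hypothetical toy stemmer: strips a few English suffixes. It stands in for a
# real stemmer only to show how stemming merges word forms.
SUFFIXES = ("ing", "ed", "s")

def toy_stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text, stem=False):
    tokens = text.lower().split()
    return [toy_stem(t) for t in tokens] if stem else tokens

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with tf * smoothed-idf weights."""
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] / len(doc) * idf[t] for t in vocab])
    return vocab, vectors

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # regularization shrinkage
            if margin < 1:  # sample violates the margin: move toward it
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Illustrative corpus (not from the study): +1 = sports, -1 = politics.
texts = [
    "players scored goals",
    "player scoring goal",
    "voters elected leaders",
    "voter electing leader",
]
labels = [1, 1, -1, -1]

raw_vocab, _ = tfidf_vectors([tokenize(t) for t in texts])
stem_vocab, X = tfidf_vectors([tokenize(t, stem=True) for t in texts])
w, b = train_linear_svm(X, labels)
predictions = [predict(w, b, x) for x in X]

print("vocabulary size without stemming:", len(raw_vocab))
print("vocabulary size with stemming:", len(stem_vocab))
print("training predictions:", predictions)
```

On this toy corpus the only point is mechanical: stemming merges inflected forms, so the TF-IDF feature space shrinks before the classifier sees it. Whether that helps or hurts accuracy on real data is the question the study itself evaluates.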

Co-Authors Andika, I Gede Andika, I Gede Aristamy, I Gusti Ayu Agung Mas Aristana, Made Dona Wahyu Cardewa, Made Daniel Eka Saputra Dewi, Dewa Ayu Putu Rasmika Dwi Novitasari Dwipayana, Ida Bagus Rama Gede Indrawan Hendrawati, Theresia I Dewa Made Krishna Muku I Gede Andika I Gede Andika I Gede Andika I Gede Iwan Sudipa I Gusti Agung Indrawan I Made Cristian Arta Kusuma I Made Marthana Yusa I Made Marthana Yusa I Nyoman Tri Anindia Putra I Wayan Dharma Suryawan Ida Bagus Nyoman Pascima Juniartini, Ni Komang Tri Ketut Laksmi Maswari Made Cardewa Made Suci Ariantini N.L.Wiwik Sri Rahayu Ginantra Ni K. N Noviani Pande Ni Kadek Nita Noviani Pande Ni Ketut Utami Nilawati Ni Komang Ita Cahyani NI LUH PUTU AGETANIA . NI LUH PUTU MERY MARLINDA Ni Luh Wiwik Sri Rahayu Ginantra Ni Made Lisma Martarini Ni Putu Anik Juniantini Nirwana, Ni Kade Ayu Patrycia Dewi, Ni Putu Eka Pramita, Dewa Ayu Kadek Pramitha, Gede Dana Putra S., I Ketut Yama Cahyana Putu Wirayudi Aditama Riana, Roni Sandhiyasa, I Made Subrata Santi Ika Murpratiwi Saputra, Daniel Eka Saraswati, Ni Wayan Sumartini Sastaparamitha, Ni Nyoman Ayu J Tjokorda Istri Agung Pandu Yuni Maharani Toraja, Dewa Gede Waas, Devi Valentino Winatha, Komang Redy Wulandari, Dewa Ayu Putri Yuri Prima Fittryani
