Articles

Found 2 Documents

CLASSIFICATION OF SMS SPAM WITH N-GRAM AND PEARSON CORRELATION BASED USING MACHINE LEARNING TECHNIQUES Romadloni, Nova Tri; Septiyanti, Nisa Dwi; Pratomo, Cucut Hariz; Kurniawan, Wakhid; Bintang, Rauhulloh Ayatulloh Khomeini Noor
SENTRI: Jurnal Riset Ilmiah, Vol. 3 No. 2 (February 2024)
Publisher : LPPM Institut Pendidikan Nusantara Global

DOI: 10.55681/sentri.v3i2.2252

Abstract

The Short Message Service (SMS) has garnered widespread popularity due to its simplicity, reliability, and ubiquitous accessibility. This study aims to enhance the efficacy of SMS classification by refining the classification process itself, specifically by reducing feature dimensionality and eliminating inconsequential attributes. The textual data undergoes preprocessing, with the N-Gram technique used for feature representation, followed by feature selection using Pearson Correlation. The study employs five classification algorithms. Notably, the findings show that the best outcomes emerge from combining the N-Gram methodology with Pearson Correlation feature selection. Among the algorithms, the Support Vector Machine stands out, achieving 91.41% accuracy without feature selection, improving to 91.96% with N-Gram representation, and reaching 70.80% after the inclusion of weighted correlation. However, the model's generalizability is limited, primarily because a relatively modest dataset was used. Although Pearson correlation and N-gram-based feature selection curb data dimensionality and enhance processing efficiency, certain pertinent features may have been overlooked, or the chosen attributes may not be optimally suited to specific classifications.
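The abstract describes a pipeline of N-gram feature representation, Pearson-correlation feature selection, and a comparison of classifiers including SVM. Below is a minimal sketch of such a pipeline using scikit-learn; the file name, column names, n-gram range, and number of retained features are illustrative assumptions, not details taken from the paper.

```python
# Sketch: N-gram features + Pearson-correlation feature selection + SVM.
# Dataset path and column names ("text", "label") are assumptions.
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

df = pd.read_csv("sms_spam.csv")                    # assumed: columns "text", "label"
y = (df["label"] == "spam").astype(int).to_numpy()

# N-gram representation (unigrams + bigrams here; the paper's exact range may differ)
vectorizer = CountVectorizer(ngram_range=(1, 2), lowercase=True)
X = vectorizer.fit_transform(df["text"]).toarray()

# Pearson correlation of each n-gram feature with the label; keep the top-k features
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
corr = np.nan_to_num(corr)                          # constant columns yield NaN correlation
top_k = 500                                         # assumed; not reported in the abstract
selected = np.argsort(corr)[::-1][:top_k]
X_sel = X[:, selected]

X_train, X_test, y_train, y_test = train_test_split(
    X_sel, y, test_size=0.2, random_state=42, stratify=y)

clf = SVC(kernel="linear")                          # one of the five classifiers compared
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```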
Application of the K-Nearest Neighbor (KNN) Algorithm for Stunting Diagnosis in Infants Aged 1-12 Months Kholik, Moh Abdul; Pratomo, Cucut Hariz; Gustina, Sapriani
Jurnal Informatika Universitas Pamulang, Vol 9 No 2 (2024)
Publisher : Teknik Informatika Universitas Pamulang

DOI: 10.32493/informatika.v9i2.40983

Abstract

Stunting in toddlers must be addressed immediately because it has a negative impact on their growth and development. Stunting is a disorder in which toddlers experience chronic malnutrition, so their physical growth and height do not match their age. According to the Indonesian Nutritional Status Survey (SSGI), stunting is more common among toddlers aged 0 to 1 year than among toddlers overall. Stunting can have short-term and long-term impacts. This research examines data from the Temanggung District Health Service on 3,999 toddlers aged 0 to 12 months between 2019 and 2022. Many studies, especially those applying the KNN method, have looked exclusively at stunting in children aged one to five years, even though stunting can be recognized from an early age. This research therefore applies the KNN method specifically to infants aged 1 to 12 months, differentiating it from previous work. The aim of this research is to use the K-Nearest Neighbor (KNN) algorithm to detect stunting nutritional status in toddlers. K-Nearest Neighbor (KNN) is a classification algorithm that uses the K closest data points (its neighbors) as a reference to determine the class of incoming data; it classifies data based on its similarity or closeness to other data. The dataset used includes the parameters age, gender, and height. The research approach follows the CRISP-DM (Cross Industry Standard Process for Data Mining) method, beginning with business understanding, followed by EDA and modeling, evaluation, testing, and report preparation. The results show that the KNN algorithm can accurately categorize children as stunted or not based on age (U) and height (TB), with the highest accuracy and the lowest error rate at k = 5. At this optimal value of k, the algorithm achieves an accuracy of 99.87%, recall of 99.84%, and precision of 99.73%.
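The abstract describes a KNN classifier with k = 5 over age, gender, and height. Below is a minimal sketch of that classification step using scikit-learn; only k = 5 and the feature set are taken from the abstract, while the file name, column names, label encoding, and train/test split are illustrative assumptions.

```python
# Sketch: KNN (k = 5) stunting classification on age, gender, and height.
# Dataset path, column names, and label encoding are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, recall_score, precision_score

df = pd.read_csv("stunting_toddlers.csv")    # assumed: "age_months", "gender", "height_cm", "stunted"
df["gender"] = df["gender"].map({"male": 0, "female": 1})   # assumed encoding

X = df[["age_months", "gender", "height_cm"]].to_numpy()
y = df["stunted"].to_numpy()                 # assumed: 1 = stunted, 0 = not stunted

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Scale features so the distance metric is not dominated by height in cm
scaler = StandardScaler().fit(X_train)
knn = KNeighborsClassifier(n_neighbors=5)    # k = 5 reported as optimal in the abstract
knn.fit(scaler.transform(X_train), y_train)

y_pred = knn.predict(scaler.transform(X_test))
print("accuracy :", accuracy_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
```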