Jurnal Komunikasi
Vol. 3 No. 9 (2025): Jurnal Komunikasi

DATA MINING ANALYSIS USING THE KNN ALGORITHM TO DETERMINE THE ACTIVE AND INACTIVE STATUS OF 5TH SEMESTER INFORMATION SYSTEMS STUDENTS (CASE STUDY: SEPULUH NOPEMBER UNIVERSITY PAPUA)

Joi Rosalina Raweyai (Unknown)
Dina Antonia Hombore (Unknown)
Maria Monalisa Bebari (Unknown)
Jolio Up (Unknown)
Jenifer Sirami (Unknown)
Heru Sutejo (Unknown)



Article Info

Publish Date
16 Dec 2025

Abstract

Rapid advances in information technology require higher education institutions, including Universitas Sepuluh Nopember Papua (USNP), to use data analytics in support of strategic decision-making, particularly in managing student activity status. A major challenge is identifying, early and accurately, students at risk of becoming inactive. This matters especially in the fifth semester, a critical stage of study at which inactivity rates of approximately 15–22% have been identified within the Information Systems program. This study addresses the limited body of research on student status classification at universities in Eastern Indonesia. Its primary objective is to examine the activity patterns of fifth-semester students by developing a classification model based on the K-Nearest Neighbor (KNN) algorithm. The study adopts a quantitative design combined with computational experimentation and uses total population sampling, covering all 80 students of the 2023 Information Systems cohort. It relies on secondary data from the datasets “Mahasiswa SI 2023.xlsx” and “ipk mhs aktif sistem informasi.xlsx”. The dependent variable is Student Status (Active/Inactive); the independent variables include Cumulative Grade Point Average (IPK), the number of credits successfully completed, and other relevant administrative attributes. Preprocessing consists of dataset integration, data cleaning, imputation of missing IPK values with the mean (2.891, computed from 58 observations), label encoding (Active = 1, Inactive = 0), and normalization of numerical features. The KNN model uses the Euclidean distance metric, and three values of K (3, 5, and 7) are evaluated to determine which performs best. Model effectiveness is assessed using accuracy, precision, recall, and F1-score. The results are expected to show that KNN can classify student status accurately, with IPK as a key influencing variable. The study contributes a KNN-based classification model designed to predict the activity status of fifth-semester students, giving USNP a practical, data-driven tool to support early intervention and improve the quality of academic services.
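
The pipeline described in the abstract (mean imputation, label encoding, normalization, Euclidean KNN with K = 3, 5, 7, and the four evaluation metrics) can be illustrated with a minimal sketch. It makes several assumptions that are not in the abstract: the records are invented toy values rather than the study's data, only IPK and completed credits are used as features, normalization is min-max, and evaluation is leave-one-out. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy records (ipk, credits_completed, status); NOT the study's data.
RECORDS = [
    (3.40, 88, "Active"), (3.10, 84, "Active"), (None, 80, "Active"),
    (2.95, 82, "Active"), (3.60, 90, "Active"), (2.80, 76, "Active"),
    (1.90, 40, "Inactive"), (None, 36, "Inactive"), (2.10, 48, "Inactive"),
    (1.50, 30, "Inactive"), (2.30, 52, "Inactive"), (3.00, 86, "Active"),
]

def impute_mean(values):
    """Replace None with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def min_max(values):
    """Scale values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def preprocess(records):
    """Impute and normalize features; encode Active = 1, Inactive = 0."""
    ipk = min_max(impute_mean([r[0] for r in records]))
    credits = min_max([float(r[1]) for r in records])
    X = list(zip(ipk, credits))
    y = [1 if r[2] == "Active" else 0 for r in records]
    return X, y

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = sorted((math.dist(x, xi), yi) for xi, yi in zip(X_train, y_train))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def leave_one_out(X, y, k):
    """Predict each sample from all the others (assumed evaluation scheme).

    Simplification: features were normalized on the full set beforehand.
    """
    preds = []
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        preds.append(knn_predict(X_tr, y_tr, X[i], k))
    return preds

def metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

if __name__ == "__main__":
    X, y = preprocess(RECORDS)
    for k in (3, 5, 7):  # the K values named in the abstract
        print(k, metrics(y, leave_one_out(X, y, k)))
```

Odd values of K avoid tied votes in a two-class problem, which is one plausible reason for evaluating 3, 5, and 7. With real data, a held-out test set or cross-validation with normalization fitted on the training portion only would give a less optimistic estimate than this sketch.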

Copyrights © 2025






Journal Info

Abbrev

komunikasi

Publisher

Subject

Social Sciences

Description

Jurnal Komunikasi publishes research articles on a wide range of topics in communication science. The journal is an interdisciplinary space for research on communication and media, including but not limited to interpersonal communication, mass communication, advertising, communication strategy, and studies ...