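Komparasi Performa Algoritma Kompresi Data Lossless Menggunakan Rasio Kompresi Dan Penghematan Ruang (Performance Comparison of Lossless Data Compression Algorithms Using Compression Ratio and Space Saving)
Authors : Aswar Hanif; Endang Wahyudi; Harna Adianto; Lilik Martanto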
J-INTECH ( Journal of Information and Technology) Vol 11 No 1 (2023): J-Intech : Journal of Information and Technology
Publisher : LPPM STIKI MALANG

DOI: 10.32664/j-intech.v11i1.863

Abstract

Data growth is a sizeable challenge. The goal of data compression is to reduce the amount of data needed to represent useful information. Data compression can increase the efficiency of data storage, transmission, and protection. Lossless algorithms can reconstruct the original data exactly from the compressed data, so lossless compression is often used for data that must be stored or transmitted accurately. Lossless compression methods and algorithms include the Lempel–Ziv–Markov chain algorithm (LZMA), prediction by partial matching (PPM), BZip2 (Burrows-Wheeler block-sorting compression combined with Huffman coding), and Deflate. Although all compression systems rest on the same principles, their performance can still be expected to differ. A general guide is therefore needed to help determine the most appropriate data compression algorithm to use. This study aims to determine the data compression algorithm with the best performance, based on a comparison of Compression Ratio and Space Saving values. The research proceeds through selection of the compression algorithms, data preparation, and performance testing, followed by discussion and conclusions. The results show that the achievable compression ratio and space saving depend on the data used. Although the average performance values span a narrow range, LZMA2 generally shows the best results, with a compression ratio of 1.457 and a space saving of 15.00%. The authors hope these results can serve as an overview to help in choosing a lossless data compression algorithm.
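
The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions (compression ratio = uncompressed size / compressed size; space saving = 1 − compressed size / uncompressed size), which the abstract does not spell out. The sample input and the choice of Python standard-library codecs are illustrative assumptions, not the study's test setup; PPM is omitted because the standard library has no implementation of it.

```python
import bz2
import lzma
import zlib


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Uncompressed size divided by compressed size (higher is better)."""
    return original_size / compressed_size


def space_saving(original_size: int, compressed_size: int) -> float:
    """Fraction of the original size removed by compression."""
    return 1.0 - compressed_size / original_size


# Hypothetical sample input; the study's actual test data is not reproduced here.
sample = b"lossless compression keeps every byte recoverable. " * 200

# Standard-library codecs standing in for three of the algorithms named above.
# lzma.compress defaults to the .xz container, which uses the LZMA2 filter.
codecs = {
    "Deflate (zlib)": (zlib.compress, zlib.decompress),
    "BZip2": (bz2.compress, bz2.decompress),
    "LZMA2 (xz)": (lzma.compress, lzma.decompress),
}

results = {}
for name, (compress, decompress) in codecs.items():
    packed = compress(sample)
    # Lossless means the round trip restores the input exactly.
    assert decompress(packed) == sample
    ratio = compression_ratio(len(sample), len(packed))
    saving = space_saving(len(sample), len(packed))
    results[name] = (ratio, saving)
    print(f"{name}: ratio={ratio:.3f}, saving={saving:.2%}")
```

For a single file the two metrics are linked by saving = 1 − 1/ratio; averages taken over many files need not satisfy that identity.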

Pemilihan Model Churn pada Data Tidak Seimbang Berdasarkan ROC AUC dan Recall (Churn Model Selection on Imbalanced Data Based on ROC AUC and Recall)
Authors : Aswar Hanif; Harna Adianto; Lilik Martanto; Endang Wahyudi
Jurnal Nasional Komputasi dan Teknologi Informasi (JNKTI) Vol 8, No 5 (2025): Oktober 2025
Publisher : Program Studi Teknik Komputer, Fakultas Teknik. Universitas Serambi Mekkah

DOI: 10.32672/jnkti.v8i5.9821

Abstract

Customer churn refers to the phenomenon in which a customer ends their business relationship with a company. Being able to predict customer churn is an important factor in business planning. However, customer churn data is usually imbalanced, which poses a significant challenge for machine learning. The most commonly used approach to this problem is oversampling. A popular method is SMOTE, which can improve model performance but can also cause overfitting. Many studies have used oversampling to address imbalanced data, but few have focused on selecting a classification model based on suitable metrics without relying on oversampling. This study tests classification models for predicting customer churn on imbalanced data, both with and without SMOTE, and compares their cross-validation results and test performance. The models are then evaluated using the Balanced Accuracy metric. The novelty lies in showing that model selection based on a combination of ROC AUC and Recall can identify the best customer churn prediction model without the need for oversampling. These results may broaden understanding beyond the prevailing assumption that imbalanced data must always be addressed using oversampling.

Keywords: Model selection; Data imbalance; Without oversampling; ROC AUC; Recall
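
The metrics named in this abstract can be sketched in plain Python. The labels, scores, model names, 0.5 decision threshold, and the rule for combining ROC AUC and Recall (a simple mean) are all hypothetical choices made for illustration; the abstract does not state how the authors combine the two metrics.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual positives (churners) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual negatives that were predicted negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of recall and specificity, so both classes weigh equally."""
    return (recall(y_true, y_pred) + specificity(y_true, y_pred)) / 2


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (pairwise definition of the area under the curve).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced test set: 2 churners (1) among 10 customers.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical churn probabilities from two models trained without oversampling.
scores = {
    "model_a": [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.1, 0.6, 0.7, 0.45],
    "model_b": [0.2, 0.1, 0.3, 0.2, 0.1, 0.55, 0.3, 0.2, 0.6, 0.8],
}

THRESHOLD = 0.5  # assumed decision threshold

report = {}
for name, model_scores in scores.items():
    y_pred = [1 if s >= THRESHOLD else 0 for s in model_scores]
    report[name] = {
        "roc_auc": roc_auc(y_true, model_scores),
        "recall": recall(y_true, y_pred),
        "balanced_accuracy": balanced_accuracy(y_true, y_pred),
    }

# Assumed selection rule: highest mean of ROC AUC and Recall.
best = max(report, key=lambda m: (report[m]["roc_auc"] + report[m]["recall"]) / 2)
print(best, report[best])
```

Plain accuracy is not used here because on imbalanced data a model that never predicts churn can still score highly on it, whereas Recall and Balanced Accuracy both penalize missed churners.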