Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

A Hybrid Approach to Music Recommendations Based on Audio Similarity Using Autoencoder and LightGBM
Aristawidya, Winda Ardelia; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.10516

Abstract

Music recommendation systems help users navigate large music collections by suggesting songs aligned with their preferences. However, conventional methods often overlook the depth of audio content, limiting personalization and accuracy. This study proposes a hybrid approach that uses PCA and Autoencoder to extract audio embeddings. These embeddings are processed using K-Nearest Neighbors to find similar tracks, followed by a reranking step with LightGBM based on predicted relevance. The system achieved strong results: 98% accuracy, 0.96 precision, 0.96 recall, and 0.96 F1-score for the Similar class, with 0.99 precision and recall for Not Similar. Cross-validation confirmed model robustness, with an average accuracy of 97.99%, precision of 0.9577, recall of 0.9624, and F1-score of 0.9600, all with low standard deviations. These outcomes show that combining deep audio features with machine learning ranking enhances recommendation quality. Future improvements may involve incorporating metadata and genre-based visualizations for more diverse and interpretable results.

Hyperparameter Optimization and Feature Selection Analysis on the XGBoost Model for Hepatitis C Infection Prediction
Lefi, Nadia Martha; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.10876

Abstract

Hepatitis C is a liver disease that can progress to chronic conditions such as cirrhosis and liver cancer. Early detection is essential and can be supported through machine learning approaches. This study analyzes the effect of feature selection and hyperparameter tuning on the performance of the XGBoost model in classifying hepatitis C infection. The dataset, obtained from Kaggle, contains laboratory test attributes. The preprocessing stage involved handling missing values, encoding categorical variables, removing outlier classes, and normalizing data using StandardScaler. After stratified splitting, the training set was balanced using the SMOTE technique. Feature selection was carried out using the ANOVA F-score method, and hyperparameter tuning was performed using GridSearchCV. Three model scenarios were compared: baseline, with feature selection, and with combined feature selection and hyperparameter tuning. The evaluation results showed that the third model achieved the best performance with 96% accuracy, 79% precision, 81% recall, and a 78% F1-score, despite a slight decrease in the ROC AUC value. This approach has proven effective in improving model performance and is relevant for supporting more accurate hepatitis C diagnosis systems.

Multiclass Classification of Tomato Leaf Diseases Using GLCM, Color, and Shape Feature Extraction with Optimized XGBoost
Laiskodat, Fransisko Andrade; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11273

Abstract

Automatic classification of tomato leaf diseases is an essential component in advancing precision agriculture based on artificial intelligence. This study aims to develop a multiclass classification model for tomato leaf diseases by utilizing texture, color, and shape features, and employing an optimized XGBoost algorithm. The public PlantVillage dataset was used, with preprocessing stages including feature extraction, normalization, dimensionality reduction using PCA, and class balancing using SMOTE. The experimental results showed that the model successfully classified ten disease classes with a high accuracy of 97.63%, and both macro and weighted F1-scores of 0.98. These findings indicate that the combination of handcrafted features and XGBoost offers an effective, efficient, and applicable solution for plant disease diagnostic systems.

Efficient Feature Extraction Using MobileNetV2 and EfficientNetB0 for Multi-Class Brain Tumor Classification
Amelia, Hemas Anggita; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11354

Abstract

Brain tumor classification in MRI is complicated by the similarity of imaging features across multiple tumor classes. This study evaluates the use of lightweight convolutional neural network (CNN) architectures as feature extractors combined with machine learning classifiers for multi-class classification. MobileNetV2 and EfficientNetB0 were used to extract fixed-length feature representations, which were then classified using Support Vector Machine (SVM), Logistic Regression, Random Forest, and K-Nearest Neighbors. The evaluation used stratified five-fold cross-validation, and performance was measured with accuracy, F1-score, and Matthews Correlation Coefficient (MCC). Results show that EfficientNetB0 features paired with SVM achieved the highest test accuracy (98.5%), while Logistic Regression also yielded competitive performance (97.1%). Class-wise analysis indicated strong results for pituitary and non-tumor cases. This work shows that lightweight CNN-based feature extraction may serve as a practical direction for improving multi-class brain tumor MRI classification, with potential benefits for applications in resource-limited environments.

Analysis of Deep Learning Algorithms Using ConvNeXt and Vision Transformer for Brain Tumor Disease
Ekayanda, Gilang; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11438

Abstract

This study aims to conduct a comparative analysis and identify the most effective deep learning architecture between ConvNeXt and Vision Transformer (ViT) for the automated classification of brain tumors from MRI imagery. Rapid and accurate brain tumor diagnosis is crucial; however, the manual interpretation of MRI scans is time-consuming and reliant on specialist expertise, creating an urgent need for reliable automation in brain tumor diagnosis. This research utilizes a dataset of 4,600 images, balanced between 2,513 'Brain Tumor' and 2,087 'Healthy' instances. A robust 5-Fold Cross-Validation methodology was employed to evaluate model performance, wherein the data was divided into five folds, each consisting of 920 images, ensuring every image served as both training and testing data. The quantitative results demonstrated high efficacy from both models, although ConvNeXt achieved a slight, consistent advantage. ConvNeXt obtained an accuracy of 99.13%, precision of 99.13%, recall of 99.13%, and an F1-Score of 99.13%. In comparison, the ViT model scored an accuracy of 98.13%, precision of 98.14%, recall of 98.13%, and an F1-Score of 98.13%. This quantitative superiority was validated through qualitative analysis using saliency maps, which confirmed that the models' computational attention was accurately focused on the anatomical locations of the actual tumor lesions.
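
The fold arithmetic in this abstract (4,600 images, five folds of 920) can be illustrated with a minimal sketch. The helper names `k_fold_indices` and `train_test_splits` are hypothetical, and the contiguous, unshuffled split is an assumption made for illustration; the abstract does not say how images were assigned to folds.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Assumption: no shuffling or stratification; the source does not
    describe how folds were formed.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_samples, k):
    """Yield (train, test) index lists; each fold is the test set once."""
    folds = k_fold_indices(n_samples, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


folds = k_fold_indices(4600, 5)
print([len(f) for f in folds])  # [920, 920, 920, 920, 920]
```

In this sketch every index lands in exactly one test fold and in four training folds, which matches the abstract's remark that every image served as both training and testing data.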

Comparative Analysis of Random Forest, SVM, and Naive Bayes for Cardiovascular Disease Prediction
Rayadhani, Windy Aldora; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11451

Abstract

Cardiovascular disease is one of the leading causes of death worldwide; therefore, accurate early detection is essential to reduce fatal risks. This study aims to compare the performance of three machine learning algorithms — Random Forest, Support Vector Machine (SVM), and Naïve Bayes — in predicting cardiovascular disease risk using the Mendeley Cardiovascular Disease Dataset, which contains 1,000 patient records and 14 clinical attributes. The models were evaluated using accuracy, precision, recall, and F1-score metrics, and their performance differences were statistically tested using the paired t-test. The experimental results indicate that the Random Forest algorithm achieved the best performance with 99% accuracy, 100% recall, 98% precision, and an F1-score of 99%. The SVM model followed with 98% accuracy and 100% recall, while the Naïve Bayes algorithm obtained 94.5% accuracy and an F1-score of 95%. The p-value < 0.05 confirmed that the performance differences among the three models were statistically significant. From a clinical perspective, a model with high recall, such as Random Forest, is more desirable because it reduces the likelihood of false negatives, which are critical in heart disease diagnosis. The feature importance analysis also revealed that age, resting blood pressure, and cholesterol level were the most influential factors in predicting cardiovascular risk. These findings suggest that machine learning algorithms, particularly Random Forest, have strong potential to be implemented in Clinical Decision Support Systems (CDSS) for accurate and efficient early detection of cardiovascular disease.
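
As a quick consistency check on the Random Forest figures reported above, the F1-score can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch; the function names are hypothetical, and the confusion-matrix counts in the last lines are illustrative values chosen only to reproduce the reported ratios, not numbers taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported Random Forest values: 98% precision, 100% recall.
print(round(f1_score(0.98, 1.00), 2))  # 0.99

# Hypothetical counts (not from the paper) that give the same ratios.
p, r = precision_recall(tp=98, fp=2, fn=0)
print(p, r)  # 0.98 1.0
```

With zero false negatives, recall is 1.0, which is the property the abstract highlights as clinically desirable.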

Analysis of Naive Bayes Algorithm for Lung Cancer Risk Prediction Based on Lifestyle Factors
Vabilla, Sheila Anggun; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11463

Abstract

Lung cancer has one of the highest mortality rates of any cancer worldwide and is often difficult to detect in its early stages because symptoms are minimal. This study aims to build a lung cancer risk prediction model based on lifestyle factors using the Gaussian Naive Bayes algorithm. Class imbalance is addressed using the Synthetic Minority Over-sampling Technique (SMOTE), and feature selection is carried out using Mutual Information. The dataset consists of 1,000 patient records with 24 features related to lifestyle and environmental factors. Model validation is carried out using 5-fold Stratified Cross Validation, and the model is evaluated based on accuracy, precision, recall, and confusion matrices. The results show that applying SMOTE increases the model accuracy to 91.00%, with high precision and recall values in all risk classes (Low, Medium, High). The features "Passive Smoker" and "Coughing up Blood" are identified as the most influential factors in the prediction. The results of this study indicate that the combination of Gaussian Naive Bayes with SMOTE and Mutual Information is able to produce an accurate prediction model.

Sentiment Analysis of the TPKS Law on Twitter: A Comparative Study of Classification Algorithm Performance
Mawar, Heni Sapta; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11503

Abstract

The enactment of Law Number 12 of 2022 concerning the Crime of Sexual Violence (UU TPKS) has sparked significant public discourse on social media, especially on Twitter. This study aims to identify the most effective classification algorithm for analyzing public sentiment regarding the UU TPKS. A total of 2,351 Indonesian-language tweets were collected, preprocessed, and manually labeled into positive and negative sentiments. The Term Frequency–Inverse Document Frequency (TF-IDF) method was used for feature extraction, followed by classification using six algorithms: Naive Bayes (NB), K-Nearest Neighbors (KNN), Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and XGBoost. The evaluation results show that SVM and Random Forest achieved the highest accuracy of 85.35%, precision of 0.85, recall of 0.85, and F1-score of 0.83, outperforming other models in handling high-dimensional and imbalanced data. These results demonstrate that the combination of TF-IDF with SVM and Random Forest provides an effective and reliable approach for sentiment analysis of Indonesian-language social media data, particularly in evaluating public responses to socio-legal policies such as the UU TPKS.
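
To make the TF-IDF feature-extraction step concrete, here is a minimal sketch of one common variant (term frequency normalized by document length, times `log(N / df)`). The exact weighting and tokenization used in the study are not stated in the abstract, so the formula, the function name `tf_idf`, and the toy tokens are all assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Library implementations often add smoothing and normalization.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, made-up token lists (not from the study's tweet corpus).
docs = [["law", "good"], ["law", "bad"]]
for vector in tf_idf(docs):
    print(vector)
```

A term that appears in every document, such as `law` here, gets weight 0 under this variant, so the classifier sees only the terms that distinguish one document from another.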

Segmentation of Generation Z Spending Habits Using the K-Means Clustering Algorithm: An Empirical Study on Financial Behavior Patterns
Sylvester, Gunawan; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11506

Abstract

Generation Z, born between 1997 and 2012, exhibits unique consumption behaviors shaped by digital technology, modern lifestyles, and evolving financial decision-making patterns. This study segments their financial behavior using the K-Means clustering algorithm applied to the “Generation Z Money Spending” dataset from Kaggle. In addition to K-Means, alternative clustering algorithms—K-Medoids and Hierarchical Clustering—are evaluated to compare their effectiveness in identifying behavioral patterns. The dataset consists of 1,700 individuals with 15 numerical spending attributes, including rent, food, entertainment, education, savings, and investments. All data were normalized using Min-Max Scaling prior to clustering. The analysis identifies six distinct clusters, ranging from highly consumption-oriented groups (with higher spending on entertainment and online shopping) to financially conscious groups prioritizing savings and investments. A quantitative approach was used, incorporating exploratory data analysis, correlation testing, and the Elbow Method to determine the optimal number of clusters. The optimal cluster count of six is supported by a Davies-Bouldin Index (DBI) score of 2.412, indicating acceptable but improvable cluster separation. Each cluster displays unique characteristics: Cluster 0 (average age 20.6) focuses on savings and investments with moderate essential spending; Cluster 1 (average age 23.6) prioritizes education and higher rent expenses; Cluster 2 (average age 20.3) is digitally oriented, spending more on online shopping and entertainment; Cluster 3 (average age 25.2) demonstrates financial stability with balanced expenditures; Cluster 4 (average age 24.9) emphasizes savings and investments with moderate living costs; and Cluster 5 (average age 24.96) combines strong saving habits with balanced essential and leisure spending. 
Model performance was assessed using the Davies-Bouldin Index, Silhouette Score, and Calinski-Harabasz Index to ensure comprehensive evaluation of cluster quality. The findings highlight the diverse spending behaviors of Generation Z, offering valuable insights for businesses, policymakers, and financial service providers to develop targeted strategies aligned with each segment’s characteristics.
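
Two of the steps named in this abstract, Min-Max Scaling and the Davies-Bouldin Index, can be sketched in a few lines of plain Python. The function names and the toy points below are hypothetical and are not drawn from the study's dataset; the sketch assumes Euclidean distance, at least two clusters, and distinct centroids.

```python
import math


def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points)."""
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each point to its own cluster centroid.
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    worst_ratios = [
        max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst_ratios) / k


print(min_max_scale([[10, 1], [20, 3], [30, 5]]))
# [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]

toy_clusters = [[(0, 0), (0, 2)], [(4, 0), (4, 2)]]
print(davies_bouldin(toy_clusters))  # 0.5
```

Lower values indicate tighter, better-separated clusters, which is why the reported score of 2.412 is described as acceptable but improvable.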

Transfer Learning Analysis on Tuberculosis Classification Using MobileNetV2 Architecture Based on Chest X-Ray Images
Latupono, Ali Samsul; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11510

Abstract

Tuberculosis (TBC) remains a major global health issue, with millions of new cases reported annually. Early and accurate diagnosis is essential, but manual interpretation of chest X-ray (CXR) images is limited by subjectivity and resource constraints. This study applies the MobileNetV2 architecture using transfer learning to classify tuberculosis from CXR images. The publicly available Tuberculosis Chest X-ray dataset containing 4,200 images was divided into training (70%), validation (15%), and testing (15%) sets. A MobileNetV2 model pretrained on ImageNet was used as the base network, with additional classification layers, and was trained using the Adam optimizer with early stopping. The model achieved a validation accuracy above 99.84% after the second epoch and maintained stable performance. On the test set, the model reached 99.84% accuracy, with precision of 99.53% and recall of 99.90% for the tuberculosis class. The results demonstrate that transfer learning with MobileNetV2 provides a fast, efficient, and highly accurate method for tuberculosis detection. The model shows potential for integration into Computer-Aided Diagnosis (CAD) systems in low-resource clinical settings.
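
The 70/15/15 partition of the 4,200 images described above works out to 2,940 training, 630 validation, and 630 test images. A minimal sketch of such a split follows; the function name `split_indices`, the random shuffle, and the fixed seed are assumptions for illustration, since the abstract does not say how images were assigned to each subset.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle indices and cut them into train / validation / test lists.

    Assumption: a simple seeded random shuffle; the test set takes
    whatever remains after the train and validation cuts.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(4200)
print(len(train), len(val), len(test))  # 2940 630 630
```

Keeping the test indices disjoint from the other two subsets is what allows the reported test accuracy to be read as performance on unseen images.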