Evaluating Machine Learning Algorithms for Predictive Modeling of Large-scale Event Attendance
Nugroho, Deni Kurnianto; Fauzy, Marwan Noor; Hidayat, Kardilah Rohmat
International Journal of Computer and Information System (IJCIS) Vol 6, No 3 (2025): IJCIS : Vol 6 - Issue 3 - 2025
Publisher : Institut Teknologi Bisnis AAS Indonesia

DOI: 10.29040/ijcis.v6i3.249

Abstract

Predicting attendance at large-scale public events is a critical task that supports better resource planning, logistics, and safety management. This study investigates the performance of various machine learning models in forecasting event attendance using metadata features such as event type, venue, location, date, and duration. The dataset comprises over 19,526 event records obtained from a U.S. government open data repository, covering multiple years and diverse event categories. Model performance was evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). Among the models tested, ensemble methods, particularly Gradient Boosting Regressor and XGBoost, outperformed the others, achieving the lowest MAE (61.37 and 59.52, respectively) and the highest R² values (0.22 and 0.15, respectively). These results suggest that the ensemble methods generalize better and capture complex nonlinear patterns in the data. In contrast, linear models and simpler non-parametric methods such as Decision Trees and K-Nearest Neighbors (KNN) exhibited weaker predictive accuracy, with R² scores close to or below 0.14. While the R² values indicate that metadata alone provides a limited view of attendance dynamics, the relatively low MAE across models implies that reasonable point predictions are still achievable. These findings highlight the potential of ensemble-based methods for baseline forecasting tasks. Furthermore, the study underscores the importance of incorporating richer feature sets, such as pricing, weather, promotional activity, and social sentiment, for future model improvement. This research provides a foundational benchmark for data-driven attendance forecasting and offers practical implications for event organizers seeking scalable, automated prediction tools to support strategic planning.
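The three evaluation metrics named in the abstract can be sketched as follows. This is a minimal illustration using the standard definitions of MAE, RMSE, and R²; the function name and the attendance figures are hypothetical and are not taken from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, rmse, r2

# Hypothetical attendance figures, invented for illustration only.
actual = [120, 80, 300, 45, 150]
predicted = [100, 90, 220, 60, 140]
mae, rmse, r2 = regression_metrics(actual, predicted)

# A model that always predicts the mean of the targets scores R^2 = 0,
# which is the baseline that any R^2 value is measured against.
baseline = regression_metrics(actual, [sum(actual) / len(actual)] * len(actual))
```

Because MAE is expressed in attendees while R² is relative to the variance of the target, the two can tell different stories on the same predictions, which is consistent with the abstract reporting low MAE alongside modest R².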
Evaluating Machine Learning Algorithms for Detecting Online Text-based Fake News Content
Nugroho, Deni Kurnianto; Fauzy, Marwan Noor; Hidayat, Kardilah Rohmat
International Journal of Computer and Information System (IJCIS) Vol 6, No 3 (2025): IJCIS : Vol 6 - Issue 3 - 2025
Publisher : Institut Teknologi Bisnis AAS Indonesia

DOI: 10.29040/ijcis.v6i3.253

Abstract

The rapid spread of disinformation and fabricated news across online platforms poses a critical risk to informed public engagement and the foundations of democratic governance. This study examines how well different machine learning techniques can classify fake news, using textual features extracted through the Term Frequency–Inverse Document Frequency (TF-IDF) method. The analysis includes five commonly used algorithms: Logistic Regression, Support Vector Machine (SVM), Naive Bayes, Random Forest, and XGBoost. A publicly accessible dataset containing annotated real and fake news articles served as the basis for training and testing these models. The dataset underwent extensive preprocessing, including tokenization, stopword removal, and TF-IDF vectorization, resulting in a sparse high-dimensional matrix of 5,068 documents and 39,978 features. Performance evaluation was based on multiple metrics: train/test accuracy, misclassification rate, false positives/negatives, cross-validation mean score, and execution time. Results showed that SVM and Logistic Regression achieved the highest test accuracy (93.61% and 92.27%, respectively) and exhibited robust cross-validation scores, indicating strong generalization ability. In contrast, Naive Bayes ran faster but suffered from a high false positive rate and lower accuracy (84.77%). Random Forest and XGBoost demonstrated good predictive power but showed signs of overfitting and moderate misclassification rates. These findings suggest that SVM and Logistic Regression are well-suited for fake news detection in textual datasets using TF-IDF features. While traditional models remain effective, future work may explore deep learning approaches and context-aware language models to enhance detection accuracy across more complex and multilingual datasets. This study contributes to the ongoing efforts to combat misinformation through automated, scalable, and interpretable machine learning techniques.
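The preprocessing chain described in the abstract (tokenization, stopword removal, TF-IDF vectorization into a sparse matrix) might look roughly like the sketch below. It is an assumption-laden illustration, not the study's pipeline: the stopword list, the tokenizer, the smoothed IDF formula with L2 normalization, and the sample headlines are all chosen here for demonstration.

```python
import math
from collections import Counter

# Tiny illustrative stopword list; real pipelines use a much longer one.
STOPWORDS = {"the", "a", "is", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = (w.strip(".,!?;:\"'()").lower() for w in text.split())
    return [t for t in tokens if t and t not in STOPWORDS]

def tfidf_matrix(docs):
    """Return (vocabulary, rows); each row is a sparse dict {term: weight}."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vocab = sorted(doc_freq)
    # Smoothed IDF: ln((1 + N) / (1 + df)) + 1 (an assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        weights = {t: c * idf[t] for t, c in counts.items()}
        # L2-normalize each row so document length does not dominate.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vocab, rows

# Hypothetical headlines, invented for illustration only.
docs = [
    "The senator denied the report.",
    "Shocking report reveals secret plot!",
    "The senator praised the committee.",
]
vocab, rows = tfidf_matrix(docs)
```

Each row stores only the terms that occur in that document, which is why a matrix of thousands of documents by tens of thousands of features stays manageable: most entries are implicitly zero.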
Comparative Sentiment Analysis on News Coverage of AI Risks and Regulation using Rule-based and Transformer-based Models
Fauzy, Marwan Noor; Nugroho, Deni Kurnianto; Hidayat, Kardilah Rohmat
International Journal of Computer and Information System (IJCIS) Vol 6, No 3 (2025): IJCIS : Vol 6 - Issue 3 - 2025
Publisher : Institut Teknologi Bisnis AAS Indonesia

DOI: 10.29040/ijcis.v6i3.251

Abstract

The rapid development of artificial intelligence technology has raised concerns regarding ethical risks, governance, and the need for adequate regulation. This study aims to analyze the dynamics of public opinion through media coverage of AI risks and regulation. Data were obtained from five major international media outlets (Reuters, Bloomberg, The Guardian, CNBC, and The New York Times) between 2022 and 2025. The analysis was carried out in several stages: news article extraction, text cleaning, sentiment classification, and trend and distribution visualization. Two approaches were used for sentiment analysis: a rule-based lexical model (VADER) and a contextual transformer model (Multilingual BERT from nlptown). Classification results show that VADER tends to assign neutral labels, while BERT is more sensitive to positive or negative nuances. The two models' outputs correlate on general trends, but they diverge during specific periods, particularly during intense coverage of AI policy formulation or ethical incidents. Temporal visualizations show spikes in negative sentiment during the enactment of AI regulations in several countries. This study concludes that the multi-model approach is capable of capturing a broader spectrum of sentiment. Limitations include the limited range of media sources, potential data bias, and the models' limited ability to understand domain-specific contexts. Recommendations for further study include expanding data sources, using models specifically trained in the AI policy domain, and integrating entity analysis to uncover dominant actors in public discourse.
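Comparing a rule-based and a transformer-based model requires putting their outputs on a common label scale first. The sketch below shows one way this could be done, under stated assumptions: VADER compound scores are cut at the conventional ±0.05 thresholds, and a 1 to 5 star rating (the output format of the nlptown multilingual BERT model) is mapped so that 1 to 2 stars is negative, 3 is neutral, and 4 to 5 is positive. The abstract does not state which mapping the authors used, and the scores below are invented.

```python
from collections import Counter

def vader_label(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a three-way label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def star_label(stars):
    """Map a 1-5 star rating to a three-way label (assumed mapping)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"

def compare(vader_scores, bert_stars):
    """Return per-model label counts and the fraction of matching labels."""
    v = [vader_label(s) for s in vader_scores]
    b = [star_label(s) for s in bert_stars]
    agreement = sum(x == y for x, y in zip(v, b)) / len(v)
    return Counter(v), Counter(b), agreement

# Hypothetical per-article outputs, invented for illustration only.
vader_scores = [0.02, -0.01, 0.40, -0.30, 0.00, 0.03]
bert_stars = [2, 1, 4, 1, 3, 4]
v_counts, b_counts, agreement = compare(vader_scores, bert_stars)
```

In this toy data the lexical scores cluster near zero and fall into the neutral band, while the star ratings commit to a polarity, which mirrors the pattern the abstract describes without reproducing its actual figures.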
Comparative Evaluation of Hybrid Filtering and Model-Based SVD in a Film Recommender System Using the MovieLens Dataset
Putri Ariani, Angelina; Ayu Handayani, Dita; Muryanti Setyowati, Putri; Nugroho, Deni Kurnianto; Noor Fauzy, Marwan
Jurnal Pustaka Data (Pusat Akses Kajian Database, Analisa Teknologi, dan Arsitektur Komputer) Vol 6 No 1 (2026): Jurnal Pustaka Data (Pusat Akses Kajian Database, Analisa Teknologi, dan Arsitekt
Publisher : Pustaka Galeri Mandiri

DOI: 10.55382/jurnalpustakadata.v6i1.1723

Abstract

The growing number and variety of films makes it difficult for users to choose what to watch in line with their preferences. Recommender systems are therefore an important solution for helping users obtain relevant, personalized film recommendations. This study aims to implement and evaluate the performance of recommender systems using Item-Based Collaborative Filtering, Content-Based Filtering, Hybrid Filtering, and Model-Based Collaborative Filtering on the MovieLens 100k dataset. Specifically, the Model-Based approach applies the Singular Value Decomposition (SVD) algorithm, optimized with a learning rate of 0.005, regularization of 0.02, and 50 latent factors. The study's scientific contribution is a comparative analysis of memory-based and model-based approaches on a dataset with high sparsity. The research process covers data preprocessing, data splitting using the random holdout method, and data exploration to analyze the rating distribution. System performance was evaluated using the Root Mean Square Error (RMSE) and Precision@10 metrics. The results show that the Model-Based SVD method performed best, with the lowest RMSE of 0.877 and the highest Precision@10 of 67.45%. In contrast, the Hybrid Filtering method, which uses a manual weighting scheme, performed poorly, with a Precision@10 of 0.40%; this is attributed to the use of static weights that cannot effectively accommodate variation in bias between models. These results indicate that machine learning approaches based on latent models handle sparse datasets more effectively than content-based methods or conventional hybrid methods with static weights.
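A static-weight hybrid and the Precision@k metric can be sketched in a few lines. This is an illustrative assumption rather than the authors' implementation: the function names, the 0.5 weight, the item IDs, and the scores are hypothetical, and Precision@k is taken as the share of the top-k ranked items that are relevant.

```python
def hybrid_scores(cf_scores, cb_scores, weight_cf=0.5):
    """Blend two {item: score} dicts with one fixed (static) weight."""
    items = set(cf_scores) | set(cb_scores)
    return {
        i: weight_cf * cf_scores.get(i, 0.0)
        + (1.0 - weight_cf) * cb_scores.get(i, 0.0)
        for i in items
    }

def precision_at_k(scores, relevant, k=10):
    """Fraction of the top-k scored items that are in the relevant set."""
    # Sort by descending score; break ties by item ID for determinism.
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    top_k = ranked[:k]
    return sum(1 for i in top_k if i in relevant) / k

# Hypothetical scores: collaborative filtering on a 1-5 rating scale,
# content-based similarity on a 0-1 scale. Invented for illustration.
cf = {"A": 4.5, "B": 3.0, "C": 2.0, "D": 4.0}
cb = {"A": 0.1, "B": 0.9, "C": 0.8, "D": 0.2}
relevant = {"A", "D"}

hybrid = hybrid_scores(cf, cb, weight_cf=0.5)
p_hybrid = precision_at_k(hybrid, relevant, k=2)
p_content = precision_at_k(cb, relevant, k=2)
```

Note that the two component scores live on different scales here, so a single fixed weight lets one model dominate the blend. That kind of mismatch is one plausible reading of the abstract's remark about static weights and inter-model bias, though the abstract does not give the details.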