Journal: International Journal of Computer and Information System (IJCIS)

Evaluating Machine Learning Algorithms for Predictive Modeling of Large-scale Event Attendance
Nugroho, Deni Kurnianto; Fauzy, Marwan Noor; Hidayat, Kardilah Rohmat
International Journal of Computer and Information System (IJCIS), Vol 6, No 3 (2025)
Publisher : Institut Teknologi Bisnis AAS Indonesia

DOI: 10.29040/ijcis.v6i3.249

Abstract

Predicting attendance at large-scale public events is a critical task to support better resource planning, logistics, and safety management. This study investigates the performance of various machine learning models in forecasting event attendance using metadata features such as event type, venue, location, date, and duration. The dataset comprises over 19,526 event records obtained from a U.S. government open data repository, covering multiple years and diverse event categories. Model performance was evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). Among the models tested, ensemble methods, particularly Gradient Boosting Regressor and XGBoost, outperformed the others, achieving the lowest MAE (61.37 and 59.52, respectively) and the highest R² values (0.22 and 0.15). These results suggest superior generalization capability in capturing complex nonlinear patterns in the data. In contrast, linear models and simpler non-parametric methods such as Decision Trees and K-Nearest Neighbors (KNN) exhibited relatively weaker predictive accuracy, with R² scores close to or below 0.14. While the R² values indicate that metadata alone provides only a limited view of attendance dynamics, the relatively low MAE across models implies that reasonable point predictions are still achievable. These findings highlight the potential of ensemble-based methods for baseline forecasting tasks. Furthermore, the study underscores the importance of incorporating richer feature sets, such as pricing, weather, promotional activity, and social sentiment, for future model improvement. This research provides a foundational benchmark for data-driven attendance forecasting and offers practical implications for event organizers seeking scalable, automated prediction tools to support strategic planning.
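The evaluation pipeline described in this abstract (fit an ensemble regressor, score with MAE, RMSE, and R²) can be sketched in a few lines. The synthetic features and targets below are placeholders, not the paper's 19,526-record dataset, and the model settings are assumptions for illustration only.

```python
# Sketch of an ensemble-regression evaluation with MAE, RMSE, and R²,
# using scikit-learn on synthetic stand-in data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000
# Hypothetical numerically encoded metadata (event type, venue, duration, ...)
X = rng.normal(size=(n, 5))
y = 100 + 30 * X[:, 0] - 20 * X[:, 1] + rng.normal(scale=25, size=n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2:.3f}")
```

Because RMSE squares the residuals before averaging, it is always at least as large as MAE, which is why the paper reports both: MAE summarizes typical error, RMSE penalizes large misses.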
Evaluating Machine Learning Algorithms for Detecting Online Text-based Fake News Content
Nugroho, Deni Kurnianto; Fauzy, Marwan Noor; Hidayat, Kardilah Rohmat
International Journal of Computer and Information System (IJCIS), Vol 6, No 3 (2025)
Publisher : Institut Teknologi Bisnis AAS Indonesia

DOI: 10.29040/ijcis.v6i3.253

Abstract

The rapid spread of disinformation and fabricated news across online platforms poses a critical risk to informed public engagement and the foundations of democratic governance. This study examines how well different machine learning techniques can classify fake news, using textual features extracted through the Term Frequency–Inverse Document Frequency (TF-IDF) method. The analysis covers five commonly used algorithms: Logistic Regression, Support Vector Machine (SVM), Naive Bayes, Random Forest, and XGBoost. A publicly accessible dataset containing annotated real and fake news articles served as the basis for training and testing these models. The dataset underwent extensive preprocessing, including tokenization, stopword removal, and TF-IDF vectorization, resulting in a sparse high-dimensional matrix of 5,068 documents and 39,978 features. Performance evaluation was based on multiple metrics: train/test accuracy, misclassification rate, false positives/negatives, mean cross-validation score, and execution time. Results showed that SVM and Logistic Regression achieved the highest test accuracy (93.61% and 92.27%, respectively) and exhibited robust cross-validation scores, indicating strong generalization ability. In contrast, Naive Bayes produced faster results but suffered from a high false positive rate and lower accuracy (84.77%). Random Forest and XGBoost demonstrated good predictive power but showed signs of overfitting and moderate misclassification rates. These findings suggest that SVM and Logistic Regression are well suited for fake news detection in textual datasets using TF-IDF features. While traditional models remain effective, future work may explore deep learning approaches and context-aware language models to enhance detection accuracy across more complex and multilingual datasets. This study contributes to ongoing efforts to combat misinformation through automated, scalable, and interpretable machine learning techniques.
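A minimal sketch of the TF-IDF + classifier pipeline this abstract describes, using the two best-performing model families (SVM and Logistic Regression). The toy corpus and labels below are invented placeholders; the study itself used a public annotated dataset of 5,068 articles with 39,978 TF-IDF features.

```python
# TF-IDF vectorization feeding linear classifiers, as in the study's
# fake-news pipeline, demonstrated on a tiny synthetic corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

texts = [
    "officials confirm the new policy takes effect monday",
    "scientists publish peer reviewed study on vaccines",
    "shocking secret cure doctors dont want you to know",
    "celebrity endorses miracle weight loss trick overnight",
] * 10
labels = [0, 0, 1, 1] * 10  # 0 = real, 1 = fake

for name, clf in [("SVM", LinearSVC()), ("LogReg", LogisticRegression())]:
    pipe = Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english")),  # tokenize + weight
        ("clf", clf),
    ])
    pipe.fit(texts, labels)
    acc = pipe.score(texts, labels)
    print(f"{name}: train accuracy = {acc:.2f}")
```

On real data the vectorizer and classifier would be fit on a training split and scored on a held-out test split, with cross-validation for the generalization estimates the abstract reports.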
Comparative Sentiment Analysis on News Coverage of AI Risks and Regulation using Rule-based and Transformer-based Models
Fauzy, Marwan Noor; Nugroho, Deni Kurnianto; Hidayat, Kardilah Rohmat
International Journal of Computer and Information System (IJCIS), Vol 6, No 3 (2025)
Publisher : Institut Teknologi Bisnis AAS Indonesia

DOI: 10.29040/ijcis.v6i3.251

Abstract

The rapid development of artificial intelligence technology has raised concerns regarding ethical risks, governance, and the need for adequate regulation. This study aims to analyze the dynamics of public opinion through media coverage of AI risks and regulation. Data were obtained from five major international media outlets (Reuters, Bloomberg, The Guardian, CNBC, and The New York Times) between 2022 and 2025. The analysis was carried out in several stages: news article extraction, text cleaning, sentiment classification, and trend and distribution visualization. Two approaches were used for sentiment analysis: a rule-based lexical model (VADER) and a contextual transformer model (Multilingual BERT from nlptown). Classification results show that VADER tends to assign neutral labels, while BERT is more sensitive to positive or negative nuances. Correlations between the models indicate shared general trends, but the two diverge during periods of intense coverage of AI policy formulation or ethical incidents. Temporal visualizations show spikes in negative sentiment around the enactment of AI regulations in several countries. This study concludes that a multi-model approach can capture a broader spectrum of sentiment. Limitations include the restricted set of media outlets, potential data bias, and the models' limited ability to understand domain-specific contexts. Recommendations for further study include expanding data sources, using models specifically trained on the AI policy domain, and integration with entity analysis to uncover the dominant actors in public discourse.
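The rule-based (lexical) side of the pipeline can be made concrete with a drastically simplified scorer: count positive and negative lexicon hits in a headline and bucket the result into positive / neutral / negative. This is a toy stand-in, not VADER itself (which also handles intensifiers, negation, and punctuation), and the word lists and headlines below are invented for illustration.

```python
# Toy lexicon-based sentiment scorer illustrating the rule-based
# classification step; the study used VADER and Multilingual BERT.
POSITIVE = {"breakthrough", "benefit", "safe", "progress", "trust"}
NEGATIVE = {"risk", "harm", "ban", "bias", "threat", "misuse"}

def lexical_sentiment(text: str) -> str:
    """Label text by the balance of positive vs. negative lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

headlines = [
    "regulators warn of ai bias and harm in hiring tools",
    "new ai safety framework marks progress toward public trust",
    "parliament debates proposed ai rules",
]
for h in headlines:
    print(h, "->", lexical_sentiment(h))
```

The third headline lands in the neutral bucket because no lexicon word matches, which mirrors the abstract's observation that purely lexical models over-assign neutral labels compared with a contextual transformer.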