Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota adm. jakarta pusat,
Dki jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN : -     EISSN : 27236471     DOI : doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as for data descriptors publishing descriptions of a linked dataset.
Articles 518 Documents
Enhanced Detection of Consumer Behavioral Shifts in E-Commerce Platforms with Transformer-Based Algorithms Syah, Rahmad B.Y; Elveny, Maricha; Darmansyah, Soleh; Silviana, Lia
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.907

Abstract

This research analyzes changes in consumer behavior on e-commerce platforms using consumer interaction data such as view, add-to-cart, and purchase events. Identifying these changes is important because it provides deeper insight into consumer motivations and preferences. By better understanding how consumers interact with products, companies can design more targeted strategies to increase conversions, reduce cart abandonment, and improve the overall customer experience. A DistilBERT-based prediction model is applied to detect and predict changing patterns of consumer behavior in the purchasing process. DistilBERT was chosen because it is more efficient than earlier models, enabling faster data processing and lower resource usage, which is important for real-time applications on e-commerce platforms with big data. The data used includes consumer interactions during a certain period, and the model is evaluated using precision, recall, F1-score, and accuracy. The results showed that despite an increase in the number of View and Add to Cart actions, conversion to Purchase was still hampered, indicating a cart abandonment problem. The model achieved 90% accuracy, with a precision of 0.87, recall of 0.85, and F1-score of 0.86, indicating strong performance in predicting changes in consumer behavior. Based on these results, companies can optimize marketing strategies by targeting consumers who have added products to their basket but have not yet made a purchase, and by applying price adjustments, discounts, and limited-time offers. This research also emphasizes the importance of using real-time data to dynamically adjust marketing strategies and improve customer experience.
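The F1-score reported in the abstract above can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below is only an arithmetic check on the published figures; the function name `f1_score` is introduced here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures as reported in the abstract (not recomputed from the paper's data).
reported_precision = 0.87
reported_recall = 0.85

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # rounds to the reported 0.86
```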
Clustering-Based Adaptive UX in E-Learning Systems: Aligning Microservices with the 4C Framework Belluano, Poetri Lestari Lokapitasari; Patmanthara, Syaad; Ashar, Muhammad; Kurniawan, Fachrul; Kurubacak, Gulsun
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.884

Abstract

This study introduces a clustering-driven adaptive User Experience (UX) architecture for e-learning systems, aligning machine learning segmentation with the 21st-century 4C educational framework (critical thinking, communication, collaboration, creativity). The objective is to dynamically personalize digital learning interactions through a microservices architecture responsive to users' UX profiles. A quantitative survey was conducted involving 50 active users of Shopee and Tokopedia, whose interaction feedback was mapped using the User Experience Questionnaire (UEQ). Three unsupervised clustering techniques (KMeans, Agglomerative, and DBSCAN) were compared. KMeans outperformed the others with a silhouette score of 0.157, compared to 0.146 for Agglomerative and −0.017 for DBSCAN, identifying three meaningful clusters representing high, medium, and low UX proficiency. A one-way ANOVA test confirmed statistically significant differences (p < 0.01) among the clusters in dimensions such as error clarity, support responsiveness, and user confidence. These UX profiles were then mapped to individualized microservices: Cluster 0 received autonomous content with minimal support, Cluster 1 was offered guided prompts, and Cluster 2 was provided with simplified interfaces and proactive assistance. Each cluster was aligned with specific 4C competencies to ensure pedagogical relevance. The proposed architecture, built with gRPC-based microservices, enabled asynchronous, low-latency personalization based on user cluster membership. The novelty of this research lies in its dual alignment, technological (microservices + machine learning) and educational (4C competency mapping), to construct a scalable and responsive e-learning environment. The system design, although validated only through simulation, provides a practical foundation for future deployment in platforms like Moodle or OpenEdX.
By linking behavioral UX clustering to pedagogical intervention strategies, this study offers a model for adaptive, data-informed instructional systems that are both scalable and learner-centered.
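The abstract above ranks the three clustering methods by silhouette score. As a rough illustration of what that score measures, here is a minimal pure-Python sketch. The toy points and labelings are hypothetical and unrelated to the study's UEQ data, and the function is an illustrative implementation, not the authors' code.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good = silhouette_score(toy_points, good_labels)
bad = silhouette_score(toy_points, bad_labels)
print(good > bad)
```

Scores range from −1 to 1, and a labeling that matches the data's structure scores higher. Read this way, the reported values (0.157, 0.146, −0.017) indicate weak separation overall, with KMeans only marginally ahead of Agglomerative.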
Progressive Massive Fibrosis Detection Using Generative Adversarial Networks and Long Short-Term Memory Irianto, Suhendro Y.; Karnila, Sri; Hasibuan, M.S.; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Kurniawan, Hendra
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.707

Abstract

Contribution: Progressive Massive Fibrosis (PMF) is a severe form of pneumoconiosis, affecting individuals exposed to mineral dust, such as coal miners and workers in the artificial stone industry. This condition causes significant pulmonary impairment and increased mortality. Early and accurate detection is vital for effective management, yet traditional diagnostic methods face challenges in differentiating PMF from other pulmonary diseases due to variability in clinical presentations and limitations in imaging techniques. Idea: The study introduces a novel diagnostic framework that integrates Generative Adversarial Networks (GAN) and Long Short-Term Memory (LSTM) networks to enhance the detection and monitoring of PMF. The GAN generates high-fidelity synthetic imaging data to address the issue of limited datasets, while the LSTM network captures temporal patterns in patient data, enabling real-time monitoring of disease progression. Objective: The primary objective of this research is to develop an AI-driven model that improves the accuracy and efficiency of PMF detection and monitoring, facilitating early diagnosis and better treatment planning. Findings: The integrated GAN-LSTM model significantly outperformed traditional diagnostic methods. It achieved high accuracy, a Dice coefficient of 0.85, and an Area Under the Curve (AUC) of 0.92, showing precise differentiation of PMF from other pulmonary conditions, such as lung cancer and tuberculosis. Results: The GAN-LSTM framework achieved an accuracy of 91.3%, suggesting that the fusion of GAN and LSTM technologies can effectively address the challenges of limited datasets and heterogeneous disease progression. The model showed promise in enhancing the non-invasive detection and ongoing monitoring of PMF. Novelty: This research represents a significant advancement in PMF diagnostics by combining GAN and LSTM technologies in a single framework.
This approach improves diagnostic accuracy and facilitates continuous disease monitoring, offering a non-invasive and highly precise solution for PMF detection.
A Data-Driven Training Kit to Enhance the Note Recording Skills of Music Learners Chantanasut, Thaworada
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.1063

Abstract

Music note recording is a fundamental skill in music education, yet many undergraduate learners struggle due to the lack of structured, self-guided practice resources. This study aimed to develop and evaluate a data-driven training kit designed to enhance the note recording skills of music learners. A total of 20 first-year students from a Bachelor of Education program in Western Music and Vocal Education were selected through purposive sampling. Research instruments included the training kit, expert evaluation forms, pre- and post-tests, student satisfaction surveys, and behavior observation checklists. Quantitative data were analyzed using mean, standard deviation, and paired sample t-tests. Results showed a statistically significant improvement in students' performance after the intervention (t = 18.789, p < .001), with a large effect size (Cohen’s d = 2.53). Experts rated the kit highly (M = 4.86, SD = 0.29), and students reported very high satisfaction (M = 4.94, SD = 0.14). These findings support the kit’s effectiveness as an engaging and pedagogically sound tool for developing music note recording skills in higher education settings.
A Hybrid LSTM–Stacking–SMOTE Model for Weather-Aware Palm Oil Price Prediction Addressing Data Imbalance and Forecast Accuracy Kusmanto, Kusmanto; Subagio, S; Manja, Erni
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.922

Abstract

Accurate forecasting of palm oil prices is crucial for agribusiness decision-making due to high market volatility influenced by dynamic weather conditions. This study proposes a novel hybrid deep learning model combining Long Short-Term Memory (LSTM), Stacking Ensemble, and Synthetic Minority Over-sampling Technique (SMOTE) to improve predictive accuracy and handle class imbalance in price trend classification. The model was trained using a multivariate time-series dataset sourced from Kaggle, consisting of daily records of temperature, humidity, rainfall, and palm oil prices. A binary classification scheme was applied by labeling instances as either price increase (class 1) or price stable/decrease (class 0), based on a 0% price change threshold. Four experimental configurations were evaluated: standard LSTM, LSTM + SMOTE, LSTM + Stacking, and the proposed LSTM + SMOTE + Stacking. The proposed model outperformed all baselines, achieving the highest accuracy of 83.12%, an F1-score of 0.8466, MAE of 0.1688, RMSE of 0.4109, and a perfect recall of 1.0000, indicating excellent sensitivity to minority class trends. In contrast, the standard LSTM achieved only 77.32% accuracy and an F1-score of 0.7224, showing limited ability in handling imbalanced data. Visualization of loss curves and confusion matrices confirmed the model’s learning stability and classification effectiveness. This study contributes a novel integration of ensemble learning and oversampling in time-series commodity forecasting and demonstrates the effectiveness of this approach in capturing weather-driven price patterns, offering a robust framework for predictive analytics in agriculture.
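The binary labeling scheme and the error metrics in the abstract above can be illustrated with a short sketch. It assumes the 0% threshold is applied to the day-over-day price change, which the abstract does not state explicitly. The price series and predictions are hypothetical, and the helper names are introduced here for illustration only.

```python
from math import sqrt


def label_trends(prices):
    """Label each step 1 if the price rose versus the previous record, else 0.

    Assumption: a 0% threshold on the change between consecutive records.
    """
    return [1 if curr > prev else 0 for prev, curr in zip(prices, prices[1:])]


def binary_error_metrics(y_true, y_pred):
    """Return (accuracy, MAE, RMSE) for 0/1 labels and 0/1 predictions."""
    n = len(y_true)
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mae = sum(abs_errors) / n
    rmse = sqrt(sum(e * e for e in abs_errors) / n)
    accuracy = 1 - mae  # each error is 0 or 1, so MAE equals the error rate
    return accuracy, mae, rmse


# Hypothetical prices and predictions, for illustration only.
toy_prices = [100, 102, 102, 101, 105]
y_true = label_trends(toy_prices)  # [1, 0, 0, 1]
y_pred = [1, 1, 0, 1]
accuracy, mae, rmse = binary_error_metrics(y_true, y_pred)
print(accuracy, mae, rmse)

# Consistency check on the reported figures: with 0/1 labels,
# MAE = 1 - accuracy and RMSE = sqrt(MAE).
reported_mae = round(1 - 0.8312, 4)
reported_rmse = round(sqrt(0.1688), 4)
```

On hard 0/1 predictions, MAE and RMSE carry no information beyond accuracy. This is consistent with the reported values: 1 − 0.8312 = 0.1688, and the square root of 0.1688 is approximately 0.4109.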
Optimizing The XGBoost Model with Grid Search Hyperparameter Tuning for Maximum Temperature Forecasting Sugiarto, Sugiarto; Mas Diyasa, I Gede Susrama; Alhamda, Denisa Septalian; Aryananda, Rangga Laksana; Fatmah Sari, Allan Ruhui; Sukri, Hanifudin; Dewi, Deshinta Arrova
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.885

Abstract

This study presents a novel comparative approach to maximum temperature forecasting in Surabaya, Indonesia, by integrating Extreme Gradient Boosting (XGBoost) with Grid Search Hyperparameter Tuning and benchmarking it against Autoregressive Integrated Moving Average (ARIMA) and Neural Prophet models. The main idea is to evaluate the capability of XGBoost in capturing nonlinear patterns in environmental time series data, which traditional models often fail to address. Using 15,388 historical daily maximum temperature records from the BMKG Juanda weather station spanning 1981–2022, the objective is to identify the most accurate predictive model for short- and medium-term forecasts. The modeling process involved four stages: data acquisition, preprocessing, training, and evaluation, with performance assessed using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The findings show that, after hyperparameter tuning, XGBoost achieved the best performance with MAE = 0.32 and RMSE = 0.65, outperforming ARIMA (MAE = 0.85, RMSE = 1.20) and Neural Prophet (MAE = 0.70, RMSE = 0.98). Prediction results for 2025 indicate peak maximum temperatures in January, October, and November, aligning with recent climate patterns. The contribution of this research lies in demonstrating the superiority of a tuned XGBoost model for complex environmental datasets, offering a practical tool for urban climate planning, agricultural scheduling, and heatwave risk mitigation. The novelty of this work is the systematic integration of Grid Search-based optimization with XGBoost for meteorological forecasting in a tropical urban context, producing higher accuracy than both classical statistical and modern hybrid time series methods. These results highlight the model’s adaptability and potential for broader climate-related applications, with future research recommended to incorporate additional meteorological variables such as humidity and wind speed for even greater predictive capability.
Spatio-Temporal Variation in Bus Service Satisfaction and Policy Implications: A Case-Based Study in an Emerging Urban Area Nga, Nguyen Thi Ngoc; Minh, Hai Nguyen
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.879

Abstract

This study investigates the spatio-temporal variation in bus service satisfaction in a rapidly urbanizing city in Southeast Asia. The research addresses a critical gap in urban transit studies, where passenger satisfaction is often treated as a static construct, overlooking how satisfaction may fluctuate across different geographic areas and time periods. By applying a spatio-temporal analytical framework, this study aims to provide a more dynamic and localized understanding of transit service perceptions. The research builds upon existing approaches by integrating a structured survey based on the Customer Satisfaction Survey (CSS) method with Best–Worst Scaling (BWS) to prioritize service attributes. A stratified sampling technique was employed across multiple wards in the study area, with 612 valid responses collected during both peak and off-peak hours. The survey captured data on passenger experiences and preferences, disaggregated by time of day, location, and demographic characteristics. Multinomial Logit (MNL) modeling was used to estimate the relative importance of key service dimensions, such as punctuality, comfort, frequency, and accessibility. The analysis revealed significant spatial and temporal heterogeneity in satisfaction levels. For instance, passengers in peripheral wards rated reliability and onboard conditions more negatively compared to those in central areas. Similarly, satisfaction levels were lower during evening hours, particularly concerning bus overcrowding and wait times. The findings suggest that transit policy must adopt a more flexible and localized strategy, rather than uniform service standards, to address distinct user expectations. Targeted improvements in underperforming routes and time slots could enhance overall user experience and promote public transport usage. 
This study contributes new insights to the evaluation of urban bus services in emerging cities and underscores the value of incorporating spatio-temporal dynamics into transit planning and customer satisfaction research.
An Explainable Credit Card Fraud Detection Model using Machine Learning and Deep Learning Approaches Alkhozae, Mona; Almasre, Miada; Almakky, Abeer; Alhebshi, Reemah M.; Alamri, Amani; Hakami, Widad; Alshahrani, Lamia
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.962

Abstract

This study proposes an adaptive, interpretable real-time fraud detection and prevention system designed for high-risk financial environments, capable of processing over 1.6 million imbalanced credit card transactions with low latency. The objective is to build a unified framework that integrates predictive accuracy, explainability, and adaptability. The methodology follows four phases: exploratory data analysis to reveal structural and behavioral fraud patterns, feature engineering with domain-informed attributes and ADASYN oversampling to mitigate the 1:174 imbalance, training of multiple models (XGBoost, LightGBM, Random Forest, Gradient Boosting, and MLP), and an ensemble architecture evaluated with SHAP-based explainability. The system introduces three key contributions: stability-aware SHAP caching that reduces explanation latency to 41.2 ms, reinforcement learning–based threshold tuning that dynamically adapts to evolving fraud patterns, and out-of-distribution detection to enhance resilience against data drift. Results demonstrate strong performance, with XGBoost achieving 99.86% accuracy, 96.36% precision, 80.59% recall, an F1-score of 0.878, and a ROC-AUC of 0.9988, outperforming the other models. The full system attained 93.2% accuracy, a 90.2% F1-score, and a 96.1% AUC, blocking 91% of fraudulent transactions while maintaining a false positive rate of 7.8%. The novelty lies in combining explainability and adaptivity in a production-ready architecture, where reinforcement learning enables continuous threshold self-regulation and SHAP stability analysis validates interpretability across models. These findings show that high fraud detection accuracy and transparency are not mutually exclusive, offering a scalable blueprint for financial institutions and other critical domains requiring real-time, explainable, and adaptive decision-making.
Comparative Analysis of Novel Deep Reinforcement Learning Methods for Food Distribution Optimization Hutahaean, Jeperson; Siagian, Yessica; Saputra, Endra
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.956

Abstract

Uneven food distribution across various regions in Indonesia often results in supply-demand imbalances, leading to price surges, stock shortages, and overall market instability. This challenge is compounded by the limitations of conventional distribution systems, which are ill-equipped to respond to rapidly changing market dynamics. In response, this study introduces a novel, AI-driven approach by implementing Deep Reinforcement Learning (DRL) to optimize food distribution policies using real-world data. Specifically, we perform a comparative evaluation of four emerging DRL models—Double Deep Q-Network (Double DQN), Dueling DQN, Proximal Policy Optimization (PPO), and Advantage Actor-Critic (A2C)—to determine their effectiveness in learning adaptive distribution strategies from national food logistics data provided by Indonesia’s Central Bureau of Statistics (BPS). Each model was trained within a custom simulation environment based on the Markov Decision Process (MDP) framework and evaluated using five core performance metrics: cumulative reward, average reward, success rate, sample efficiency, and best reward. The results reveal that A2C consistently outperformed the other models, delivering the highest average reward and most stable training performance, while PPO demonstrated strong efficiency and success rate. These findings underscore the potential of policy-gradient methods—particularly A2C—as robust and intelligent solutions for dynamic food logistics management. This research offers one of the first comparative benchmarks of DRL methods in the food distribution domain and highlights their applicability for future integration into national AI-powered logistics systems.
An Artificial Neural Network-Based Geo-Spatial Model for Real-Time Flood Risk Prediction Using Multi-Source High-Resolution Data Aziz, RZ Abdul; Nurpambudi, Ramadhan; Herwanto, Riko; Hasibuan, Muhammad Said
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.913

Abstract

Flood prediction presents a pressing challenge in disaster management, especially in regions vulnerable to extreme weather events. In response, this study offers a novel approach to flood risk prediction by developing a deep learning-based Geo-Spatial Artificial Neural Network (ANN). The model integrates high-resolution satellite imagery, meteorological data, and topographic indicators, such as rainfall, elevation, and land use, to capture complex spatial and environmental relationships that influence flood risk. Data preprocessing used Principal Component Analysis (PCA) and normalization to ensure consistency across datasets. The ANN was built with multiple hidden layers and trained using the backpropagation algorithm on historical flood data. The model achieved a notable 92% prediction accuracy, significantly outperforming traditional flood prediction methods, which typically yield 75–85% accuracy. Its Mean Squared Error (1.41) and R-squared (0.94) confirmed the model’s superior ability to predict high-risk flood zones. The model also effectively captured non-linear patterns that conventional statistical or deterministic methods often failed to detect. The results showed that the model generalizes well and adapts effectively, making it suitable for real-time and data-driven flood forecasting. By integrating artificial intelligence with geo-spatial analytics, this study offers a scalable, accurate, and efficient tool for early warning systems and risk management. It recommends that future research focus on incorporating additional data sources and refining model training techniques to further enhance scalability and performance.