Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN : -     EISSN : 27236471     DOI : doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as for data descriptors publishing descriptions of a linked dataset.
Articles 518 Documents
Fine-Grained Sentiment Analysis Approach on Customer Reviews Based on Aspect-Level Emotion Detection
Paramita, Adi Suryaputra; Jusak, Jusak
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.964

Abstract

In the era of digital platforms, customer reviews constitute a vital resource for understanding user sentiment and perception toward products and services. Traditional sentiment analysis methods predominantly operate at the document or sentence level, often missing fine-grained emotional cues tied to specific product or service aspects. To address this limitation, this study proposes a novel Fine-Grained Sentiment Analysis (FGSA) framework that performs aspect-level sentiment classification using a joint learning approach. The proposed model employs a hybrid deep learning architecture that integrates transformer-based contextual encoders with Bidirectional Long Short-Term Memory (Bi-LSTM) layers. This design allows the model to capture both rich contextual semantics and sequential dependencies, a combination that has not been widely adopted in existing FGSA research. Additionally, we introduce a new annotated dataset of 5,000 customer reviews spanning multiple domains (electronics, food and beverages, and general services), enabling robust training and evaluation. Experimental results show that the model outperforms standard baselines, achieving an F1-score of 82.0% for aspect extraction and an accuracy of 79.8% for sentiment classification. Further analysis reveals consistent patterns, such as positive sentiments linked to design and quality, and negative sentiments associated with customer service and delivery. These insights highlight the practical value of aspect-level sentiment modelling. The key contribution of this work is the integration of a transformer-Bi-LSTM joint architecture for aspect-based sentiment analysis, supported by a domain-diverse benchmark dataset. This framework enhances the interpretability and granularity of sentiment insights and sets a foundation for future research in multilingual and multimodal contexts.
Lora Communication System for Early Detection and Monitoring of Water Toxicity in Floating Net Cages
Rahmafadilla, Rahmafadilla; Irawati, Indrarini Dyah; Rizal, Mochammad Fahru; Maidin, Siti Sarah
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.787

Abstract

Floating net cages (Keramba Jaring Apung, KJA) are at risk of water pollution, which can affect fish farming. Therefore, an early monitoring system is needed that can measure water quality parameters such as temperature, pH, and dissolved oxygen (DO) in real time. This system uses the LoRa RFM95W module to wirelessly transmit environmental data from sensors installed on the cages, which continuously monitor these parameters. The data obtained is then processed to monitor changes in water toxicity in real time, allowing early detection of potential threats to the ecosystem. Tests were conducted at distances of 50 m, 180 m, 300 m, 340 m, and 440 m. The results showed that the system worked well up to a distance of 300 m, with RSSI values between -85 dBm and -120 dBm and SNR above 2 dB. However, at distances of 340 m and 440 m, the signal weakened and the delay increased. At 340 m, only one experiment was successful, with RSSI of -134 dBm and SNR of -6 dB, while at 440 m, only a few experiments were successful, with RSSI between -122 dBm and -132 dBm and SNR between 1 dB and -6 dB. The prototype system successfully transmitted real-time water quality data to a web-based monitoring center. Data from the sensors were sent via the LoRa network to a central server for further monitoring.
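The link-quality figures reported in this abstract can be turned into a simple check. The following is a minimal sketch, not the authors' implementation: it assumes a reading counts as "reliable" when RSSI is no weaker than -120 dBm and SNR is above 2 dB (the range reported up to 300 m), and "degraded" otherwise. The function name and the first sample reading are hypothetical.

```python
def classify_link(rssi_dbm: float, snr_db: float) -> str:
    """Label a LoRa reading using the ranges reported in the abstract.

    Assumption (not from the paper): 'reliable' means RSSI >= -120 dBm
    and SNR > 2 dB; anything else is treated as 'degraded'.
    """
    rssi_ok = rssi_dbm >= -120.0
    snr_ok = snr_db > 2.0
    return "reliable" if (rssi_ok and snr_ok) else "degraded"


readings = [
    ("300 m (hypothetical values)", -110.0, 4.0),
    ("340 m (values reported in the abstract)", -134.0, -6.0),
]
for label, rssi, snr in readings:
    print(f"{label}: {classify_link(rssi, snr)}")
```

A monitoring dashboard could use such a label to flag when a cage node has drifted out of dependable range, although the thresholds would need tuning against field measurements.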
Optimizing Function-Level Source Code Classification Using Meta-Trained CodeBERT in Low-Resource Settings
Septiadi, Abednego Dwi; Prasetyo, Muhamad Awiet Wiedanto; Daffa, Geusan Edurais Aria
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.902

Abstract

This study investigates the effectiveness of a meta-trained transformer-based model, CodeBERT, for classifying source code functions in environments with limited labeled data. The primary objective is to improve the accuracy and generalizability of function-level code classification using few-shot learning, a strategy where the model learns from only a few labeled examples per category. We introduce a meta-learning framework designed to enable CodeBERT to adapt to new function types with minimal supervision, addressing a common limitation in traditional code classification methods that require extensive labeled datasets and manual feature engineering. The methodology involves episodic few-shot classification, where each episode simulates a low-resource task using five labeled and five unlabeled samples per function class. A balanced subset of Python functions was sampled from the CodeXGLUE benchmark, consisting of ten function categories with equal representation. The source code was preprocessed by removing comments and docstrings, then tokenized into a fixed length of 128 tokens to fit the model input format. The meta-trained CodeBERT was evaluated across 10 episodes, each representing a different task composition. Results show that the model achieves an average classification accuracy of 73.0%, with high accuracy on function categories characterized by unique syntax patterns, and lower performance on categories with overlapping logic or naming structures. Despite this variability, the model maintained accuracy above 60% in all episodes. These findings suggest that meta-learning significantly enhances the adaptability of CodeBERT to unseen tasks under data-constrained conditions. This research demonstrates that meta-trained transformer models can serve as practical tools for real-time code analysis, particularly in integrated development environments and continuous integration pipelines.
Future work may include extending the framework to other programming languages and incorporating semantic code representations to further reduce classification ambiguity.
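The episodic setup described in this abstract (five labeled and five unlabeled samples per class, ten classes) can be sketched as a sampling routine. This is an illustrative sketch only: the pool contents, the function name `sample_episode`, and the seed are hypothetical, and the real study draws Python functions from CodeXGLUE rather than placeholder strings.

```python
import random


def sample_episode(pool, n_support=5, n_query=5, seed=0):
    """Build one few-shot episode from a dict of class name -> examples.

    The support set plays the role of the labeled samples; the query set
    plays the role of the unlabeled samples the model must classify.
    """
    rng = random.Random(seed)
    support, query = [], []
    for label, examples in pool.items():
        # Draw without replacement so support and query never overlap.
        picked = rng.sample(examples, n_support + n_query)
        support += [(x, label) for x in picked[:n_support]]
        query += [(x, label) for x in picked[n_support:]]
    return support, query


# Hypothetical pool: 10 function categories, 20 placeholder snippets each.
pool = {
    f"class_{c}": [f"def f_{c}_{i}(): pass" for i in range(20)]
    for c in range(10)
}
support, query = sample_episode(pool)
print(len(support), len(query))  # 50 50
```

Varying the seed yields different task compositions, which is one way the ten evaluation episodes could differ from one another.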
YOLOv12 Model Optimization for Monitoring Occupational Health and Safety in Hospital Archive Rooms
Jepisah, Doni; Octaria, Haryani; Muhamadiah, Muhamadiah; Irawan, Yuda
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.936

Abstract

The application of artificial intelligence technology in occupational safety monitoring systems within healthcare facilities has become an urgent necessity, particularly to support compliance with Occupational Safety and Health (OSH) standards in hospitals. This study aims to develop an automated detection model based on YOLOv12 to identify visual OSH elements in hospital archive rooms, such as portable fire extinguishers (APAR), evacuation signs, windows, and Personal Protective Equipment (PPE) including masks, gloves, and shoes. The initial dataset consisted of 2,866 documented images, which were expanded through augmentation to 6,886 images to increase data diversity and prevent overfitting. The YOLOv12 model was trained over 100 epochs using SGD as the optimization technique. The dataset was divided proportionally into three subsets: training, validation, and testing. Model evaluation employed metrics such as precision, recall, mAP@0.5, and mAP@0.5–0.95, supported by visualizations including the confusion matrix, F1-confidence curve, and precision-recall curve. One of the key advantages of YOLOv12 lies in its architectural efficiency and enhanced generalization capability, enabled by the integration of R-ELAN, Area Attention Mechanism, and FlashAttention. These components allow for broader receptive field processing with reduced computational complexity. Furthermore, the removal of positional encoding and adjustment of the MLP ratio make the model lighter and faster without compromising accuracy. Compared to previous versions (YOLOv8–YOLOv11), YOLOv12 demonstrates more stable and accurate performance in detecting complex OSH objects in indoor environments. The system was also implemented in a real-time user interface using Streamlit, automatically displaying personnel PPE completeness and room safety compliance status. In conclusion, the optimized YOLOv12 model has proven effective for real-time visual detection in OSH contexts.
Future studies are recommended to incorporate data balancing approaches, spatial segmentation, and IoT sensor integration to expand the system’s coverage and resilience across diverse workplace conditions.
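The mAP@0.5 metric mentioned in this abstract rests on Intersection over Union (IoU): a predicted box is counted as a match when its IoU with a ground-truth box is at least 0.5. Below is a minimal sketch of that building block, assuming axis-aligned boxes in `(x_min, y_min, x_max, y_max)` form. The box coordinates are hypothetical and this is not the evaluation code used in the study.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, truth_box) >= threshold


# Hypothetical boxes for a detected mask and its annotation.
print(iou((0, 0, 10, 10), (0, 0, 10, 8)))       # 0.8
print(is_match((0, 0, 10, 10), (0, 0, 10, 8)))  # True
```

mAP@0.5–0.95 repeats the same matching at thresholds from 0.5 to 0.95 and averages the resulting precision values, so it rewards tighter boxes.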
Moodle-based Blended Learning: Factors Influencing the Behavioral Intention of Undergraduate Students
Khazalah, Fayez; Alrababah, Saif Addeen; Mansour, Ayman; Alafif, Tarik
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.888

Abstract

The demand for blended learning in higher education has increased since COVID-19. Blended learning combines the advantages of both face-to-face and online learning. Many higher education institutions (HEIs) in developing countries have started to depend on Moodle to offer blended courses to their students, as it is freely available and open source. The current study aims to explore the factors that influence the intentions to use Moodle-based Blended Learning (MBBL) by higher education students in a public university in Jordan, a developing country. For this purpose, we used a modified version of the UTAUT2 model. Data were gathered through a survey that targeted undergraduate students. The study used 319 valid response samples and analyzed the data using SmartPLS 4 software that implements PLS-SEM analysis. The data analysis results show that the factors that influence the students’ behavioral intention to use MBBL are performance expectancy (β = .18), effort expectancy (β = .21), social influence (β = .16), and habit (β = .25). However, the results indicate that facilitating conditions and hedonic motivation do not have a significant influence. In addition, the results reveal that result demonstrability has a significant effect on both performance expectancy (β = .58) and effort expectancy (β = .52). Also, effort expectancy is found to influence performance expectancy (β = .17). Among the influential factors, habit is identified as the strongest predictor of intentions, followed by effort expectancy, whereas social influence is the weakest predictor. The proposed model was able to explain 50% of the variance in students’ intentions to use MBBL. The current study provides HEIs with valuable insights needed to improve the MBBL process and enhance the performance of students. It also suggests future research directions that build on this study to reach more generalized and stable results.
Digital Platform Utilization and ICT Literacy on Global Market Access among MSMEs: The Mediating Role of Digital Business Readiness and the Moderating Effect of Government Support
Nurfaizal, Yusmedi; Kurniawan, Arief Adhy; Hermawan, Hellik; Surya Saputra, Dhanar Intan; Amalina, Siti Nahla; Hafshah, Luqyana Nida
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.893

Abstract

This study investigates how digital capabilities and institutional support shape global market access among Micro, Small, and Medium Enterprises (MSMEs) in Indonesia, with a specific focus on Banyumas Regency. The research integrates technology adoption, organizational readiness, and policy support into a unified structural model to explain MSME internationalization in semi-urban contexts. The objective is to assess the direct effects of digital platform utilization and ICT literacy on global market access, the mediating role of digital business readiness, and the moderating effect of government support. A survey of 125 digitally engaged MSMEs was analyzed using Partial Least Squares Structural Equation Modeling. The findings reveal that digital platform utilization (β = 0.163; p = 0.008) and ICT literacy (β = 0.161; p = 0.004) significantly and directly enhance global market access. Both constructs, digital platform utilization (β = 0.600; p < 0.001) and ICT literacy (β = 0.497; p < 0.001), also positively influence digital business readiness, which itself contributes to market access (β = 0.154; p = 0.037). Mediation analysis confirms that digital business readiness significantly mediates the relationship between digital capabilities and global market access, with indirect effects of 0.093 (p = 0.045) and 0.077 (p = 0.035) for digital platform utilization and ICT literacy, respectively. Furthermore, government support significantly moderates the effect of digital readiness on market access (β = 0.222; p = 0.002). The model demonstrates strong explanatory power (R² = 0.782 for global market access) and predictive relevance (Q² = 0.507). This study contributes to the digital transformation literature by positioning digital business readiness as a critical enabler of MSME internationalization and highlights the synergistic role of government interventions in amplifying internal digital capabilities.
The novelty lies in applying an integrated model to a semi-urban developing economy setting, offering insights for inclusive digital ecosystem design and policy formulation.
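The indirect effects reported in this abstract can be checked with the usual product-of-coefficients rule for simple mediation (indirect effect = path to the mediator × path from the mediator to the outcome). The sketch below uses only the rounded coefficients quoted in the abstract, and the variable names are illustrative. The products come out at roughly 0.092 and 0.077; the small gap from the reported 0.093 is what one would expect from rounding of the published coefficients.

```python
# Path coefficients as quoted in the abstract (rounded to three decimals).
paths_to_readiness = {
    "digital platform utilization": 0.600,
    "ICT literacy": 0.497,
}
readiness_to_access = 0.154
reported_indirect = {
    "digital platform utilization": 0.093,
    "ICT literacy": 0.077,
}


def indirect_effect(a: float, b: float) -> float:
    """Product-of-coefficients estimate of a simple mediated effect."""
    return a * b


for name, a in paths_to_readiness.items():
    estimate = indirect_effect(a, readiness_to_access)
    print(f"{name}: a*b = {estimate:.3f}, reported = {reported_indirect[name]:.3f}")
```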
Explaining Students' Digital Entrepreneurial Behavior: The Role of Social Media Adoption in an Integrated TPB–UTAUT Model
Lestari, Elissa Dwi; Kurniasari, Florentina; Natania, Davina; Kurniawan, Alvin Yuan; Budiyanto, Hendro
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.965

Abstract

Amidst digital transformation and the demographic bonus in Indonesia, the emergence of digital entrepreneurship among the younger generation has become a promising yet challenging phenomenon. The main objective of this study is to develop and empirically evaluate an integrated model that explains students' digital entrepreneurial behavior by integrating psychological and technological viewpoints and combining the Theory of Planned Behavior (TPB) and the Unified Theory of Acceptance and Use of Technology (UTAUT) approaches. TPB has been widely used to predict entrepreneurial intentions and behavior. However, TPB is not considered able to fully capture the role of technology adoption in the context of digital entrepreneurship. To bridge this gap, this study integrates the UTAUT approach, which focuses on technology acceptance factors. This integration addresses the shortcoming of TPB by incorporating the impact of digital technology adoption on entrepreneurship, and the shortcoming of UTAUT, which omits psychological motivation. Data from 322 student entrepreneurs who run social media-based enterprises were analyzed using PLS-SEM. The study found that the TPB-UTAUT framework explains 62.2% of the variance in social media adoption (R² = 0.622) and 62.6% of the variance in entrepreneurial behavior (R² = 0.626). Eight out of nine hypotheses were supported: attitudes (β = 0.330, p < 0.001) and perceived behavioral control (β = 0.189, p = 0.008) significantly influenced social media adoption, while attitudes (β = 0.155, p = 0.006), perceived behavioral control (β = 0.295, p < 0.001), performance expectancy (β = 0.149, p = 0.011), and social media adoption (β = 0.225, p = 0.001) directly enhanced entrepreneurial behavior. Effort expectancy influenced adoption (β = 0.183, p = 0.005) but not behavior (β = 0.101, p = 0.069).
The novelty of this study lies in demonstrating that among digital-native students, effort expectancy loses significance in predicting entrepreneurial behavior, indicating a generational shift in technology adoption dynamics. These insights offer theoretical enrichment and practical implications for designing digital entrepreneurship curricula and policies in developing countries.
Clustering Cadet Training Performance Using K-Means and Ward's Method: Evidence from OTMon Maritime Monitoring System
Wahdiana, Dian; Ekohariadi, Ekohariadi; Suhartini, Ratna
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.958

Abstract

This study investigates cadet performance segmentation during on-board maritime training using clustering analysis of data from the On Training Monitoring (OTMon) system. Grounded in the competency-based education framework and experiential learning theory, the research aims to identify behavioral patterns and competency levels among 80 maritime cadets over a twelve-month sea-based training program. The OTMon application continuously recorded task completion rates, feedback interactions, sign-on consistency, and report submissions. K-Means clustering and Principal Component Analysis (PCA) revealed three distinct cadet profiles: Cluster 1 (high-performing) with average task completion of 92.4% and feedback frequency of 15.2 times/month; Cluster 2 (administratively consistent) with 88.1% completion but only 6.3 feedback interactions/month; and Cluster 3 (at-risk) with 67.5% completion and 3.8 feedback interactions/month. Linear Discriminant Analysis (LDA) validated the clusters with 98.8% resubstitution accuracy and 97.6% cross-validation accuracy, supported by generalized squared distances above 9.5 between all cluster pairs, indicating strong separation. These findings demonstrate that unsupervised clustering can reliably distinguish high-performing cadets from those needing targeted intervention, enabling data-informed mentoring and adaptive learning strategies in maritime education. The contribution of this study lies in integrating digital monitoring data with both unsupervised and supervised machine learning methods to enhance competency assessment. The novelty is in applying maritime-specific learning analytics for real-time performance segmentation, offering a scalable diagnostic framework for improving supervision quality and supporting individualized cadet development in vocational training contexts.
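The cluster profiles reported in this abstract can illustrate the assignment step of K-Means, in which each observation is placed in the cluster with the nearest centroid. The sketch below is a simplification under stated assumptions: it uses only the two indicators whose cluster means are quoted (task completion in percent and feedback interactions per month), it does not standardize them, and the three cadets are hypothetical. The study itself used more indicators together with PCA.

```python
import math

# Cluster means quoted in the abstract: (task completion %, feedback per month).
centroids = {
    "Cluster 1 (high-performing)": (92.4, 15.2),
    "Cluster 2 (administratively consistent)": (88.1, 6.3),
    "Cluster 3 (at-risk)": (67.5, 3.8),
}


def assign_cluster(point, centroids):
    """K-Means assignment step: pick the centroid at the smallest Euclidean distance."""
    return min(centroids, key=lambda name: math.dist(point, centroids[name]))


# Hypothetical cadets, not taken from the OTMon data.
cadets = {"cadet_a": (90.0, 14.0), "cadet_b": (87.0, 6.0), "cadet_c": (70.0, 4.0)}
for cadet, features in cadets.items():
    print(cadet, "->", assign_cluster(features, centroids))
```

Because task completion spans a wider numeric range than feedback frequency, it dominates the unscaled distance, which is one reason a real pipeline would normally standardize the features first.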
Stacking Ensemble with SMOTE for Robust Agricultural Commodity Price Prediction under Imbalanced Data
Siagian, Yessica; Hutahaean, Jeperson; Mulyani, Neni
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.916

Abstract

The volatility of agricultural commodity prices presents a substantial obstacle in the agribusiness sector, especially in supporting timely and data-driven decision-making. This volatility is primarily caused by the imbalanced distribution of historical price data and the complex, often nonlinear nature of price patterns. To address this challenge, this study proposes a novel predictive modeling approach by integrating Stacking Ensemble Learning and Synthetic Minority Over-sampling Technique (SMOTE). The dataset used in this research consists of 5,558 records and 9 features, sourced from a publicly available Kaggle dataset. The target variable, daily price, was transformed into three classes (low, medium, and high) using a quartile-based discretization approach to enable multiclass classification. The main objective is to evaluate whether stacking combined with SMOTE can improve model performance compared to baseline models that use individual algorithms. A total of eight models were constructed and compared: four baseline models using SMOTE only, and four stacking models integrating SMOTE. The experimental results demonstrate that the proposed model, Decision Tree Regression with Stacking and SMOTE, achieved the highest performance, with 98.68% accuracy, an F1-score of 0.9868, Cohen’s Kappa of 0.9803, MCC of 0.9803, ROC-AUC of 0.9995, and a log loss of 0.0529. Other optimized models also performed well, such as Random Forest (98.37% accuracy) and Gradient Boosting (98.56%). In contrast, baseline models such as Linear Regression and Decision Tree without stacking achieved only around 67–68% accuracy, with log loss exceeding 0.97. The key contribution of this study is the empirical evidence that combining stacking and SMOTE significantly enhances classification accuracy and model robustness in imbalanced datasets.
The novelty lies in applying a deep learning-optimized stacking framework specifically for agricultural commodity price classification, along with a comprehensive multiclass evaluation, offering new insights for practical implementation in agricultural decision support systems.
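The quartile-based discretization mentioned in this abstract can be sketched as follows. The abstract does not state the exact cut points, so this example assumes that prices below the first quartile are "low", prices above the third quartile are "high", and everything in between is "medium". The price list and function name are hypothetical.

```python
import statistics
from collections import Counter


def quartile_labels(prices):
    """Map each price to 'low', 'medium', or 'high' using Q1 and Q3 as cut points.

    Assumption (not from the paper): low < Q1 <= medium <= Q3 < high.
    """
    q1, _median, q3 = statistics.quantiles(prices, n=4)
    labels = []
    for price in prices:
        if price < q1:
            labels.append("low")
        elif price > q3:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


# Hypothetical daily prices.
prices = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
labels = quartile_labels(prices)
print(Counter(labels))
```

Under this assumed scheme the middle class covers about half of the records, so the outer classes are under-represented. That is one way the imbalance addressed by SMOTE could arise, though the source does not confirm it.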
Air Pollution Forecasting in Almaty using Ensemble Machine Learning Models
Naizabayeva, Lyazat; Sembina, Gulbakyt; Aliman, Alibek; Satymbekov, Maxatbek; Barlykbay, Nazym; Seilova, Nurgul
Journal of Applied Data Sciences Vol 6, No 4: December 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i4.821

Abstract

This study develops an advanced forecasting methodology for air pollution levels in Almaty, Kazakhstan, focusing on fine Particulate Matter (PM2.5) and carbon monoxide concentrations. Air pollution poses significant risks to public health, and Almaty’s basin location exacerbates the problem. Addressing the limitations of traditional statistical forecasting methods, we propose an ensemble machine learning approach that integrates Seasonal-Trend decomposition with gradient boosting algorithms to capture complex temporal and nonlinear patterns. The objective is to develop and validate an effective methodology for forecasting atmospheric air pollution in Almaty using machine learning methods, in particular STL decomposition, XGBoost, LightGBM models, and their ensemble combination. The novelty lies in the integration of STL decomposition with an ensemble of gradient boosting models for high-accuracy air pollution forecasting in the complex urban environment of Almaty. The dataset includes hourly measurements from over 20 monitoring stations, enabling seasonal and spatial analysis. Rigorous preprocessing techniques were applied, including outlier removal, normalization, and time series decomposition into seasonal, trend, and residual components. Two gradient boosting models, XGBoost and LightGBM, were trained separately and combined into a weighted ensemble, with optimal weights determined through cross-validation. Figures and tables illustrate data preprocessing flow, model architectures, feature importance analysis, and evaluation of predictive performance. The ensemble outperformed individual models, achieving high accuracy with coefficient of determination values exceeding 0.98 for PM2.5 and 0.83 for carbon monoxide. The findings demonstrate that integrating Seasonal-Trend decomposition with ensemble learning provides a robust and effective approach to forecasting air pollution in complex urban environments. 
The methodology shows strong potential for practical application in real-time air quality monitoring and warning systems, aiding policymakers and public health authorities. Future research will expand the dataset by incorporating additional factors such as traffic flow, industrial emissions, and satellite remote sensing data to enhance predictive accuracy and model interpretability.
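The weighted ensemble described in this abstract combines two models' predictions and selects the weight that scores best on held-out data. The sketch below illustrates the idea with a plain grid search over a single validation set, which is a simplification of the cross-validation the authors mention. The prediction values are hypothetical stand-ins for XGBoost and LightGBM outputs, and the coefficient of determination is computed by hand.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def weighted_ensemble(pred_a, pred_b, w):
    """Blend two prediction lists with weight w on model A and 1 - w on model B."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


def best_weight(y_true, pred_a, pred_b, steps=20):
    """Grid-search the blending weight that maximizes R² on validation data."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda w: r2_score(y_true, weighted_ensemble(pred_a, pred_b, w)),
    )


# Hypothetical validation targets and predictions from two models.
y_val = [10.0, 20.0, 30.0, 40.0]
pred_model_a = [12.0, 22.0, 32.0, 42.0]  # consistently too high
pred_model_b = [8.0, 18.0, 28.0, 38.0]   # consistently too low

w = best_weight(y_val, pred_model_a, pred_model_b)
blend = weighted_ensemble(pred_model_a, pred_model_b, w)
print(w, r2_score(y_val, blend))
```

In this toy case the two models err in opposite directions, so an equal blend cancels the errors. Real forecasts would rarely combine this cleanly, but the same search applies.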