Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN: -     EISSN: 2723-6471     DOI: doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as for data descriptors publishing descriptions of a linked dataset.
Articles 518 Documents
Incorporate Transformer-Based Models for Anomaly Detection
Dewi, Deshinta Arrova; Singh, Harprith Kaur Rajinder; Periasamy, Jeyarani; Kurniawan, Tri Basuki; Henderi, Henderi; Hasibuan, M. Said; Nathan, Yogeswaran
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.762

Abstract

This paper explores the effectiveness of Transformer-based models, specifically the Time-Series Transformer (TST) and Temporal Fusion Transformer (TFT), for anomaly detection in streaming data. We review related work on anomaly detection models, highlighting traditional methods' limitations in speed, accuracy, and scalability. While LSTM Autoencoders are known for their ability to capture temporal patterns, they suffer from high memory consumption and slower inference times. Though efficient in terms of memory usage, the Matrix Profile provides lower performance in detecting anomalies. To address these challenges, we propose using Transformer-based models, which leverage the self-attention mechanism to capture long-range dependencies in data, process sequences in parallel, and achieve superior performance in both accuracy and efficiency. Our experiments show that TFT outperforms the other models with an F1-score of 0.92 and a Precision-Recall AUC of 0.71, demonstrating significant improvements in anomaly detection. The TST model also shows competitive performance with an F1-score of 0.88 and Precision-Recall AUC of 0.68, offering a more efficient alternative to LSTMs. The results underscore that Transformer models, particularly TST and TFT, provide a robust solution for anomaly detection in real-time applications, offering improved performance, faster inference times, and lower memory usage than traditional models. In conclusion, Transformer-based models stand out as the most effective and scalable solution for large-scale, real-time anomaly detection in streaming time-series data, paving the way for their broader application across various industries. Future work will further focus on optimizing these models and exploring hybrid approaches to enhance detection capabilities and real-time performance.
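The self-attention mechanism the abstract refers to can be illustrated with a minimal sketch. It assumes identity projections (queries, keys, and values are all the raw input vectors) and a single head; the TST and TFT models in the paper use learned projections and further components that are not reproduced here. The function names `softmax` and `self_attention` are illustrative only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq (an assumption).

    seq is a list of time steps, each a list of d floats.
    Returns (outputs, weights); weights[i][j] is how much step i attends to step j.
    """
    d = len(seq[0])
    weights = []
    for q in seq:
        # Every step is scored against every other step, so distant steps
        # are reachable in one hop rather than through a recurrent chain.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, seq)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical 3-step, 2-feature window of a time series.
window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(window)
```

Each row of `weights` sums to 1, and the rows are computed independently of one another, which is what allows the parallel sequence processing mentioned in the abstract.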
Designing a Culturally Adaptive Information Framework for Anxiety Disorders: A Mixed-Methods Thematic Analysis in Malaysia
Zailani, Achmad Udin; Wan Ahmad, Wan Nooraishya; Muh Tuah, Nooralisa; Tze Ping, Nicholas Pang Tze Ping Pang
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.771

Abstract

This study addresses critical gaps in Malaysia's mental health landscape, where only 28% of adults recognize symptoms due to cultural stigma and poor resource design, by developing a culturally adaptive framework for anxiety disorder resources. Our key contribution is a user-centered framework integrating visual-interactive tools with cultural adaptation strategies to improve accessibility and literacy. The objective was to investigate how information design can overcome barriers, using a mixed-methods approach with 12 anxiety disorder patients (screened via DASS-21). Findings revealed: (1) format preferences (infographics: 40%, videos: 35%, simulations: 25%), (2) accessibility barriers (technical language: 45%, lack of credible sources: 65%, insufficient examples: 30%), and (3) demand for demographic personalization (age-targeted content: 78%, mood-tracking tools: 62%). Quantitative results showed strong alignment between preferred formats and comprehension gains (infographics improved understanding by 40% vs. text). The novelty lies in merging cognitive load theory with Malay cultural values (familial collectivism, Islamic coping mechanisms) into actionable design principles. Our framework demonstrates that culturally tailored visual-interactive content increases engagement by 35-40% compared to generic materials, while simplified Malay language reduces stigma-related avoidance by 28%. These ideas translate into three evidence-based strategies: (a) minimalist visual formats to reduce cognitive load, (b) family-involved examples to respect collectivism, and (c) hybrid delivery (online/offline) for rural accessibility. The study provides policymakers with metrics-backed guidance, showing that SMS-based hybrid tools achieve 58% adherence in low-bandwidth areas versus 22% for chatbots. Future work should validate scalability in larger cohorts and test AR/VR adaptations (requested by 70% of youth participants). This research advances both mental health communication theory and practical interventions for Southeast Asia's multicultural contexts.
Detecting Gender-Based Violence Discourse Using Deep Learning: A CNN-LSTM Hybrid Model Approach
Kurniawan, Tri Basuki; Dewi, Deshinta Arrova; Henderi, Henderi; Hasibuan, M. Said; Zakaria, Mohd Zaki; Ismail, Abdul Azim Bin
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.761

Abstract

Gender-Based Violence (GBV) is a critical social issue impacting millions worldwide. Social media discussions offer valuable insights into public awareness, sentiment, and advocacy, yet manually analyzing such vast textual data is highly challenging. Traditional text classification methods often struggle with contextual understanding and multi-class categorization, making it difficult to accurately identify discussions on Sexual Violence, Physical Violence, and other topics. To address this, the present study proposes a hybrid deep learning approach combining Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. CNN is utilized for extracting key linguistic features, while LSTM enhances the classification process by maintaining sequential dependencies. This hybrid CNN+LSTM model is evaluated against standalone CNN and LSTM models to assess its performance in classifying GBV-related tweets. The dataset was sourced from Kaggle, containing real-world Twitter discussions on GBV. Experimental results demonstrate that the hybrid model surpasses both CNN and LSTM models, achieving an accuracy of 89.6%, precision of 88.4%, recall of 89.1%, and F1-score of 88.7%. Confusion matrix and ROC curve analyses further confirm the hybrid model’s superior performance, correctly identifying Sexual Violence (82%), Physical Violence (15%), and Other (3%) cases with reduced misclassification rates. These results suggest that combining CNN’s feature extraction with LSTM’s contextual learning provides a more balanced and effective classification model for GBV-related text. This work supports the development of AI-based tools for social media monitoring, policy-making, and advocacy, helping stakeholders better understand and respond to GBV discussions. Future research could explore transformer-based models like BERT and real-time classification applications to further improve performance.
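The reported metrics can be related to one another with the standard definitions: precision and recall come from true-positive, false-positive, and false-negative counts, and the F1-score is their harmonic mean. The sketch below uses hypothetical helper names and is not the authors' evaluation code; it shows that the reported precision (88.4%) and recall (89.1%) are consistent with the reported F1-score (88.7%).

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one class from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract: precision 88.4%, recall 89.1%.
print(round(f1_from(0.884, 0.891), 3))  # 0.887
```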
Navigating Heart Stroke Terrain: A Cutting-Edge Feed-Forward Neural Network Expedition
Praveen, S Phani; Mantena, Jeevana Sujitha; Sirisha, Uddagiri; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Onn, Choo Wou; Yorman, Yorman
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.763

Abstract

Heart stroke remains one of the leading causes of death worldwide, necessitating early and accurate prediction systems to enable timely medical intervention. While a variety of machine learning approaches have been employed to address this issue, including Logistic Regression, Decision Trees, Random Forests, Support Vector Machines, and K-Nearest Neighbors, these models often suffer from limitations such as overfitting, insufficient generalization, poor performance on imbalanced datasets, and inability to capture complex nonlinear patterns in clinical data. Additionally, many existing works do not comprehensively integrate both clinical and demographic features or lack rigorous evaluation metrics beyond accuracy alone. This study proposes a novel Feed-Forward Neural Network (FFNN) model for heart stroke prediction, designed to overcome the shortcomings of conventional models. Unlike shallow classifiers, the FFNN architecture employed here leverages multiple hidden layers and nonlinear activation functions to learn intricate relationships within the dataset. The dataset used comprises various attributes such as age, hypertension, heart disease, BMI, and smoking status, which were preprocessed through normalization, one-hot encoding, and imputation techniques to ensure data quality and model performance. Experiments were conducted using a stratified train-test split, and the model was trained using the Adam optimizer with carefully tuned hyperparameters. Comparative evaluations against baseline models (Logistic Regression, Random Forest, and SVM) were carried out using precision, recall, F1-score, and ROC-AUC as performance metrics. The proposed FFNN achieved the highest accuracy of 96.47%, along with substantial improvements in recall and F1-score, highlighting its superior capability in identifying potential stroke cases even in imbalanced datasets. 
This work bridges a significant gap in heart stroke prediction by demonstrating the effectiveness of deep learning models—specifically FFNNs—in extracting complex patterns from diverse patient data. It also sets the stage for further exploration of deep learning-based clinical decision support systems.
A Dual-Fusion Hybrid Model with Attention for Stunting Prediction among Children under Five Years
Hadikurniawati, Wiwien; Hartomo, Kristoko Dwi; Sembiring, Irwan; Arthur, Christian
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.831

Abstract

Malnutrition remains a persistent global health challenge, especially among children under five. Traditional assessment methods often rely on static anthropometric measures, which are limited in capturing complex growth patterns. This study aims to develop a robust classification model for predicting the nutritional status of children under five years old, addressing the critical public health challenge of stunting. The model contributes to the growing need for accurate, data-driven early detection systems in child health monitoring by introducing a hybrid framework that combines deep learning and classical machine learning techniques. The proposed approach integrates automatically extracted features from a One-Dimensional Convolutional Neural Network (1D-CNN) with classical anthropometric indicators. These combined features are processed through an additive attention mechanism, highlighting the most informative attributes. The attention-weighted representation is then classified using an ensemble stacking method that aggregates predictions from multiple base classifiers, including decision trees, nearest neighbor algorithms, support vector machines, etc. Synthetic Minority Over-sampling Technique (SMOTE) is applied to the training dataset to mitigate data imbalance, particularly the underrepresentation of severe and moderate malnutrition cases. The research utilizes a dataset comprising 2,789 records of children under five years old collected from community health posts in Indonesia. Data preprocessing included cleaning, normalization, and gender encoding. The model’s performance was evaluated using 5-fold cross-validation and measured by accuracy, precision, recall, and area under the curve metrics. The results show that the proposed model achieved an average accuracy of 99.70% and an area under the curve of 99.99%. 
An ablation study further demonstrated the significant contribution of each component, feature extraction, fusion mechanism, and ensemble classifier to the final performance. This approach reveals a robust and scalable solution for early nutritional status prediction in healthcare settings.
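The oversampling step named in the abstract, SMOTE, creates synthetic minority-class records by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal sketch of that idea only, with an assumed function name `smote_like` and assumed parameters `k` and `seed`; it is not the authors' pipeline, and a maintained library implementation would normally be used in practice.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours to choose from (assumed default).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # The new point lies on the segment between base and its neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class records with two normalized features each.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]
new_points = smote_like(minority, n_new=5)
```

As the abstract notes, oversampling is applied to the training data only; in a cross-validation setting that means resampling inside each training fold so that synthetic records never reach the evaluation fold.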
Integrating Moving Average Indicators with Long Short-Term Memory Model in Bitcoin Price Forecasting
Quang, Phung Duy; Duy, Nguyen Hoang; Khoai, Pham Quang; Duong, Bui Duc
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.766

Abstract

Bitcoin price forecasting remains a challenging task due to the market's high volatility and complex nonlinear dynamics. This study proposes a novel forecasting framework by integrating Long Short-Term Memory (LSTM) networks with Moving Average (MA) indicators—specifically Simple Moving Average (SMA), Exponential Moving Average (EMA), and Weighted Moving Average (WMA)—as auxiliary input features to enhance model accuracy. The objective is to examine the frequency-specific effectiveness of these hybrid models across daily and high-frequency datasets. Using historical Bitcoin data from Bitstamp between January 2021 and December 2024, we conducted experiments at four epoch levels (50, 100, 150, 200) to determine optimal model configurations. Empirical results reveal that, on daily data, LSTM combined with a 10-period WMA achieves the lowest Mean Absolute Percentage Error (MAPE) of 2.1661% at 150 epochs, while for high-frequency data, the combination with a 10-period SMA yields superior performance with a MAPE of 0.4895%. Furthermore, increasing epochs beyond the optimal point led to performance degradation, indicating overfitting. Compared to the standalone LSTM model, our integrated approach demonstrates significantly improved adaptability to short-term fluctuations and heightened forecasting precision. This research contributes a comprehensive comparative analysis of MA-enhanced deep learning models for cryptocurrency price prediction, and offers practical insights for algorithmic traders, financial analysts, and decision-support systems in volatile digital asset markets.
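The three moving-average indicators and the MAPE metric named in the abstract can be sketched as follows. The abstract does not state the exact conventions used, so this sketch assumes linear weights 1..n for the WMA (most recent price weighted highest), a smoothing factor of 2/(n+1) for the EMA seeded with the first price, and hypothetical function names throughout.

```python
def sma(values, n):
    """Simple moving average over a sliding window of n values."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]


def wma(values, n):
    """Weighted moving average with linear weights 1..n (assumed scheme)."""
    weights = range(1, n + 1)
    denom = n * (n + 1) / 2
    return [
        sum(w * v for w, v in zip(weights, values[i - n + 1:i + 1])) / denom
        for i in range(n - 1, len(values))
    ]


def ema(values, n):
    """Exponential moving average, alpha = 2 / (n + 1), seeded with values[0]."""
    alpha = 2 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
features = {"sma": sma(prices, 3), "wma": wma(prices, 3), "ema": ema(prices, 3)}
```

In a setup like the one described, such series would be aligned with the raw prices and supplied to the LSTM as additional input features, and MAPE would be computed on the held-out forecasts.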
Enhancing Aspect-Based Sentiment Analysis in Tourism Reviews Through Hybrid Data Augmentation
Iswari, Ni Made Satvika; Afriliana, Nunik
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.842

Abstract

The increasing reliance on online reviews in tourism has made User-Generated Content (UGC) an invaluable resource for understanding visitor perceptions. However, extracting meaningful insights from these reviews remains challenging due to their unstructured nature, aspect imbalance, and the prevalence of code-mixing between languages such as Indonesian and English—particularly in multicultural destinations like Bali. Aspect-Based Sentiment Analysis (ABSA) offers a promising solution by associating sentiment polarity with specific aspects of tourist experiences. Yet, its performance is often constrained by limited and imbalanced datasets, especially for underrepresented aspects such as sanitation and amenities. This study proposes a hybrid data augmentation framework that integrates three complementary strategies: generative augmentation using ChatGPT, semantic filtering via Sentence-BERT (SBERT), and domain refinement through Masked Language Modeling (MLM). The framework is designed to improve ABSA performance on multilingual tourism reviews by generating synthetic aspect-relevant data while preserving semantic integrity and contextual nuance. Using 398 reviews of Kuta Beach in Bali, we evaluate the effectiveness of the proposed approach across five tourism aspects: scenery, dusk, surf, amenities, and sanitation. Results show that the hybrid strategy reduces hallucination rates from 12% (using ChatGPT alone) to 3.8%, increases F1-scores for underrepresented aspects by up to 5.1%, and improves cross-lingual alignment (Cohen’s κ = 0.78). These improvements demonstrate the synergy between generative and semantic augmentation in addressing real-world ABSA challenges. The proposed method not only advances the state of multilingual ABSA but also offers practical implications for tourism analytics, allowing destination managers to better understand and respond to aspect-specific visitor feedback. 
The framework is extensible to other low-resource domains, where linguistic diversity and data scarcity present similar limitations.
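The semantic filtering step can be pictured as a similarity threshold on sentence embeddings: a generated review is kept only if its embedding stays close to that of the original review. The sketch below assumes the embeddings have already been computed (for example by an SBERT model) and uses made-up vectors, a hypothetical function name `filter_augmented`, and an arbitrary threshold of 0.8; none of these come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def filter_augmented(original_vec, candidates, threshold=0.8):
    """Keep candidate IDs whose embedding is similar enough to the original.

    candidates: dict mapping a candidate ID to its embedding vector.
    threshold: assumed cut-off; the abstract does not report the value used.
    """
    return [
        cid for cid, vec in candidates.items()
        if cosine(original_vec, vec) >= threshold
    ]


# Toy 2-dimensional "embeddings" standing in for real sentence vectors.
original = [1.0, 0.0]
generated = {"close_paraphrase": [0.9, 0.1], "off_topic": [0.0, 1.0]}
kept = filter_augmented(original, generated)
```

Here the off-topic candidate is orthogonal to the original and is dropped, which is the kind of check that could lower the share of hallucinated synthetic reviews.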
An IoT-Enabled Smart System Utilizing Linear Regression for Sheep Growth and Health Monitoring
Efendi, Syahril; Sihombing, Poltak; Mawengkang, Herman; Turnip, Arjon; Weber, Gerhard Wilhelm
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.901

Abstract

The global livestock industry faces significant pressures from climate change, land constraints, and rising consumer demand, necessitating greater efficiency and sustainability in production. To address these challenges, there is a critical need for accessible, data-driven tools; however, accessible and individualized tools for monitoring the growth and health of livestock like sheep remain underdeveloped, limiting farmers' ability to transition from reactive to proactive management. This study developed and validated an Internet of Things (IoT) smart system for monitoring sheep using an Arduino and ESP32 platform equipped with a DHT22 sensor for temperature and humidity and a load cell for weight. Weekly weight data from 15 sheep were collected over a six-month period. Simple linear regression was then applied to model the individual growth trajectory of each animal. The IoT system was successfully implemented and deployed in a farm setting. The primary finding was that individualized linear regression models provided a highly accurate method for tracking sheep growth, with R² values consistently exceeding 99% for most animals. The system effectively delivered real-time reports on growth trajectories and health-relevant environmental conditions (e.g., temperature and humidity) to a smartphone interface, confirming its practical utility. The primary implication of this research is a validated framework for practical and interpretable precision livestock farming. The system empowers farmers to shift from reactive to proactive management by using individualized growth curves as baselines for early problem detection. This dual-function system enhances productivity through precise growth tracking while supporting animal welfare via environmental monitoring, offering a valuable tool for modern, sustainable sheep farming.
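Fitting one growth line per animal, as described, amounts to ordinary least squares on (week, weight) pairs, with R² measuring how much of the weight variation the line explains. The sketch below uses a hypothetical function name and invented weights; it illustrates the technique rather than the authors' implementation.

```python
def fit_growth_line(weeks, weights):
    """Ordinary least-squares fit of weight = intercept + slope * week.

    Returns (slope, intercept, r_squared).
    """
    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(weights) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, weights))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(weeks, weights))
    ss_tot = sum((y - mean_y) ** 2 for y in weights)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented weekly weights (kg) for a single animal.
weeks = [0, 1, 2, 3]
weights = [20.0, 20.5, 21.0, 21.5]
slope, intercept, r2 = fit_growth_line(weeks, weights)
```

The fitted line then serves as the animal's baseline: a new weekly reading that falls well below `intercept + slope * week` is a candidate for the early problem detection the abstract describes.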
The Impact of Supplier-Customer Collaboration on Sustainability-Oriented Capability: The Mediating Role of Communication Effectiveness and Technology Adoption in Micro-Enterprises
Gronphet, Seree; Onputtha, Suraporn
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.725

Abstract

With many businesses transforming digitally to lead future trends driven by sustainability, supplier-customer collaboration is key in empowering MSMEs. Nevertheless, effective collaborative practices remain challenging for MSMEs due to resource constraints and operational limitations. This study examines the mediating role of communication effectiveness and technology adoption in the relationship between supplier-customer collaboration and sustainability capabilities. Data were collected from 400 MSMEs in Bangkok and its metropolitan area, Thailand, and analyzed using partial least squares-structural equation modeling (PLS-SEM). Results indicate that supplier-customer collaboration positively influences sustainability capability directly (β=0.345, p < 0.001) and indirectly through communication effectiveness (β=0.370, p < 0.001) and technology adoption (β=0.043, p < 0.001). Communication effectiveness emerged as the stronger mediator, providing evidence that technology adoption is less influential in resource-constrained environments. The study’s novelty lies in its focus on micro-enterprises’ unique constraints and the creation of a contextualized framework that prioritizes effective communication over technological investments, challenging conventional sustainability models derived from large organizations and offering context-specific policy recommendations for enhancing micro-enterprise sustainability in developing economies.
An Approach for Emotion Detection in Natural Arabic Audio Files Based on Acoustic and Lexical Features
Kaloub, Ashraf; Elgabar, Eltyeb Abed
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.617

Abstract

Emotion detection is crucial for enhancing human-machine interactions. This paper addresses the challenge of accurately recognizing emotional states from speech, particularly in distinguishing between emotions with similar acoustic characteristics, such as anger, happiness and surprise, which have high pitch and energy. While acoustic features convey significant information about emotional states, they are often inadequate for distinguishing between these emotions. This limitation highlights the need for improved performance in emotion detection systems. The main contribution of this work is the introduction of a multimodal approach that combines both acoustic and lexical features for emotion detection in natural Arabic audio files, focusing on four emotions: anger, happiness, sadness and neutral. To the best of our knowledge, this is the first study that employs such a combination in this context, building on our previous work that utilized only acoustic features. Several Machine Learning (ML) classifiers were applied, including Sequential Minimal Optimization (SMO), Random Forest (RF), K-Nearest Neighbors (KNN), and Simple Logistic (SL). Two types of experiments were executed: one using only lexical features and another combining various acoustic feature sets with lexical features. This approach enhances our previous experiments that used only acoustic features. The experimental results show that the SMO classifier achieved the highest performance, with an accuracy of 96.11% when using all acoustic features combined with a unigram model, outperforming the other classifiers. These results suggest that combining acoustic and lexical features enhances the performance of emotion detection models, particularly for complex emotions in natural Arabic audio datasets.