Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN : -     EISSN : 27236471     DOI : doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as for data descriptors publishing descriptions of a linked dataset.
Articles 518 Documents
Transformer Architectures for Automated Brain Stroke Screening from MRI Images Sukmana, Husni Teja; Hasibuan, Zainal Arifin; Rahman, Abdul Wahab Abdul; Bayuaji, Luhur; Masruroh, Siti Ummi
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.736

Abstract

Early and accurate detection of stroke is critical for timely medical intervention and improved patient outcomes. This study explores the application of deep learning models, particularly the Vision Transformer (ViT), for the automated classification of brain stroke from medical images. A curated dataset of brain scans was used to train and evaluate the ViT model, which was benchmarked against a widely used convolutional neural network (CNN), ResNet18. Both models were trained using transfer learning techniques under identical preprocessing and training configurations to ensure fair comparison. The results indicate that the ViT model significantly outperforms ResNet18 in terms of validation accuracy, class-wise precision, and recall, achieving a peak accuracy of 99.60%. Visual analyses, including confusion matrices and sample prediction comparisons, reveal that ViT is more robust in detecting subtle stroke patterns. However, ViT requires more computational resources, which may limit its deployment in real-time or low-resource settings. These findings suggest that transformer-based architectures are highly effective for medical image classification tasks, particularly in stroke diagnosis, and offer a viable alternative to traditional CNN-based approaches.
Unveiling Hybrid Model with Naive Bayes, Deep Learning, Logistic Regression for Predicting Customer Churn and Boost Retention Subramanian, Devibala; Ajitha, Ajitha; Maidin, Siti Sarah
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.675

Abstract

The telecommunications sector is rapidly evolving but is increasingly challenged by customer churn, where subscribers switch to competing service providers. This study introduces a hybrid model for churn prediction and customer retention by combining machine learning methods—Naive Bayes, Deep Learning, and Logistic Regression—with sentiment analysis on user-generated content (UGC). Data was gathered through two primary sources: survey responses and 352 social media comments from users aged 20–35. The survey data was enriched with features such as gender, age, subscription period, complaints, and retention efforts. The preprocessing steps included handling missing values, scaling features, and encoding categorical variables to ensure model robustness. Experimental results demonstrated that Logistic Regression achieved the highest accuracy (88.45%) and sensitivity (91.33%) in detecting potential churners. The PCA-based approach followed closely with an accuracy of 86.77% and a balanced sensitivity-specificity profile (89.95% and 83.58%, respectively), effectively capturing key churn indicators. Random Forest and Decision Tree classifiers yielded lower sensitivity but remained strong in specificity, indicating their suitability for identifying loyal customers. Attribute weight analysis across models revealed that subscription plan, age, and retention effort were consistently influential in churn prediction. Furthermore, the integration of sentiment analysis provided emotional context to churn behavior, with negative comments triggering alerts for proactive engagement. The study highlights the predictive strength of combining structured survey data and unstructured UGC through machine learning and sentiment analytics. It underscores the importance of personalized retention strategies based on model interpretability and correlation weight findings. 
This hybrid approach equips telecom companies with actionable insights to minimize churn and sustain customer loyalty in a competitive market.
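The sensitivity and specificity figures quoted in this abstract follow the usual confusion-matrix definitions. The sketch below illustrates those definitions only; the counts passed to the function are hypothetical and are not taken from the study. It also computes the plain mean of the sensitivity and specificity reported for the PCA-based approach, which comes out close to the reported 86.77% accuracy. That would be consistent with roughly balanced classes, although the abstract does not state the class distribution.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual churners that were flagged
    specificity = tn / (tn + fp)  # share of loyal customers correctly not flagged
    return sensitivity, specificity


# Hypothetical counts, chosen only for illustration (not from the study).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(sens, spec)

# Mean of the sensitivity and specificity reported for the PCA-based approach.
balanced = (89.95 + 83.58) / 2
print(balanced)
```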
Cross-Biome Biodiversity Assessment and Anomaly Detection Using AI-Enhanced Acoustic Monitoring Radif, Mustafa; Fadhil, Shumoos Aziz; Alrammahi, Atheer Hadi
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.741

Abstract

This study proposes a novel AI-powered eco-monitoring framework that integrates acoustic ecology, deep learning, and low-cost IoT devices to enable scalable, real-time biodiversity assessment and ecological anomaly detection across diverse environments. The primary objective is to automate species classification and environmental monitoring using passive audio data captured by solar-powered IoT sensors, thereby reducing reliance on manual ecological surveys. The framework comprises four modules: acoustic data acquisition, dual-representation preprocessing with the Short-Time Fourier Transform (STFT) and Mel-Frequency Cepstral Coefficients (MFCCs), species classification using CNN and CNN-LSTM models, and anomaly detection via autoencoders and one-class SVM. Field validation and multi-dataset testing were conducted across 250+ species from temperate forests, wetlands, and urban areas. The CNN-LSTM model achieved the highest performance with 93.7% accuracy, 93.0% precision, and a 92.5% F1-score, while anomaly detection reached 89.7% precision with an AUC of 0.94, effectively identifying irregularities such as invasive calls, mechanical noise, and species absence. A forest case study demonstrated the system’s ability to detect circadian acoustic patterns (e.g., dawn chorus of sparrows, nocturnal owl calls) and real-world disturbances with 91% expert validation agreement. The novelty of this work lies in its hybrid AI architecture with real-time unsupervised anomaly detection, cross-biome generalization capability, and deployment readiness on low-power edge devices like Raspberry Pi and Jetson Nano. Inference times as low as 18 ms per sample and bandwidth usage under 3 MB/hour make it feasible for continuous, remote deployment. The framework offers a robust and adaptable solution for conservation efforts, environmental policy, and climate resilience initiatives. 
Future directions include integrating multimodal data sources and transformer-based continual learning for broader ecological impact. These findings position the system as a scalable and intelligent tool for next-generation, AI-driven environmental monitoring.
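MFCC preprocessing of the kind named in this abstract relies on the mel frequency scale. As a small illustration, the sketch below uses one widely used textbook mapping between hertz and mels. The abstract does not say which variant of the mel scale the authors used, so the formula and the function names here are assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


# Under this variant, 1000 Hz maps to roughly 1000 mels.
print(hz_to_mel(1000.0))
print(mel_to_hz(hz_to_mel(4000.0)))
```

Equal steps in mels correspond to progressively wider steps in hertz at higher frequencies, which is why mel-spaced filter banks devote more resolution to low frequencies.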
Dispute on Security Framework Model of MFCC Mixed Methods in Speech Recognition System  Pratiwi, Heni Ispur; Kartowisastro, Iman Herwidiana; Soewito, Benfano; Budiharto, Widodo
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.689

Abstract

An audio recording system can be made vulnerable by unanticipated behavior of its own authorized users. The system drifts into a fuzzy condition in which its sensitivity to unauthorized access deteriorates, and system inclination may occur. One case arises when different users choose similar spoken keywords as their usernames or passwords; another arises when users are siblings or twins with nearly identical voices. These situations make the security system less sensitive, and in some cases the system may even block authorized users from gaining access. This paper proposes a method that addresses the problem by combining Mel Frequency Cepstral Coefficients (MFCC) with other methods that have also been applied to the specific objectives of many other studies. It shows that combining MFCC with interval scoring in a Fuzzy Relation helps resolve the fuzzy security issues described above. The Mel scale is used to extract features from the audio input; the output, called spectral centroids, represents the features of an individual's voice. Some spectral centroids come out nearly identical because of the inclinations mentioned. The paper applies the Fuzzy Relation method to the verification procedure through its interval scale on fuzzy features. The objective of the verification procedure is to obtain consistently measured scales so that the security guarantee remains valid. In-house experiments yielded the intervals [0.49, 1.18] for user A, [0.76, 1.07] for user B, and [0.44, 0.95] for user C, and these intervals are proposed as bounds that cap other users' login attempts on those accounts.
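One way to read the reported intervals is as per-user acceptance ranges for a verification score. The sketch below is only an illustration under that assumption: the abstract does not say how a score is computed or compared, so the single scalar score, the closed-interval test, and the name `matching_users` are all hypothetical.

```python
# Intervals as reported in the abstract for users A, B, and C.
INTERVALS = {"A": (0.49, 1.18), "B": (0.76, 1.07), "C": (0.44, 0.95)}


def matching_users(score, intervals=INTERVALS):
    """Return the users whose interval contains the score (closed bounds assumed)."""
    return [user for user, (low, high) in intervals.items() if low <= score <= high]


print(matching_users(1.10))  # inside A's interval only
print(matching_users(0.50))  # inside A's and C's intervals
print(matching_users(0.30))  # outside every interval
```

Because the three intervals overlap, a single score can fall inside more than one of them, so a scheme of this kind would presumably need additional evidence to separate such users.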
Human Capital and Sustainable Teacher Performance: Examining the Impact of Servant Leadership, Competence, and Professional Commitment in Catholic Education Budiyanto, Hendro; Djati, Sundring Pantja; Alirejo, Mohamad Subroto; Rini, Wahju Astjarjo
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.638

Abstract

This study examines the impact of servant leadership on the performance of Catholic religious teachers, with competence and professional commitment as mediating variables. Using Partial Least Squares Structural Equation Modeling (PLS-SEM), data were collected from 151 Catholic religious teachers in the Jakarta Archdiocese. The results show that servant leadership has a direct positive impact on teacher performance (β = 0.317, p < 0.001) and indirectly enhances performance through competence (β = 0.199, p = 0.008) and professional commitment (β = 0.186, p = 0.002). Competence (β = 0.357, p = 0.001) and professional commitment (β = 0.340, p = 0.002) significantly improve teacher performance. The structural model explains 74.9% of the variance in teacher performance, indicating strong predictive power. This study contributes to the literature by demonstrating the mediating role of competence and professional commitment in the relationship between servant leadership and performance, particularly in Catholic education. The findings provide practical implications for school administrators and policymakers to implement servant leadership strategies that enhance teacher competence and commitment. This research introduces a comprehensive approach to improving teacher effectiveness in religious education settings, emphasizing the importance of leadership styles that prioritize service, empowerment, and professional development.
HOG feature extraction in optimizing FK-NN and CNN for image identification of rice plant diseases Gama, Adie Wahyudi Oktavia; Gunawan, Putu Vina Junia Antarista; Darmaastawan, Kadek
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.722

Abstract

This study compares the performance of FK-NN and CNN models in identifying rice diseases from digital images, focusing on both effectiveness and efficiency. Additionally, this research utilizes HOG for feature extraction from the digital images. The stages include data collection, preprocessing, transformation, classification, and model evaluation. The results show that the FK-NN model achieves a higher accuracy of 86.26%, compared to the CNN model's accuracy of 71.37%. Furthermore, the precision value of the FK-NN model is also higher at 86.88%, compared to the CNN model’s precision of 72.74%. Similarly, the recall value for the FK-NN model is higher at 86.88%, compared to the CNN model’s 71.37%. The F1-score of the FK-NN model is likewise superior, with a value of 86.88%, compared to the CNN model’s F1-score of 71.37%. These findings suggest that the FK-NN model with HOG feature extraction is more effective. However, in terms of inference time, the CNN model is faster, taking 0.000282 seconds compared to FK-NN’s 0.002331 seconds. In conclusion, the FK-NN model with HOG feature extraction excels in identifying rice diseases, while the CNN model offers faster inference time in this study.
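For reference, the F1-score is the harmonic mean of precision and recall, and the inference-time comparison reduces to a simple ratio. The sketch below applies both to the figures quoted in the abstract. It assumes the single-pair formula; when scores are averaged across several classes, the averaged F1 need not equal the harmonic mean of the averaged precision and recall, and the abstract does not say which averaging was used.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# FK-NN precision and recall as reported in the abstract (in percent).
fknn_f1 = f1_score(86.88, 86.88)
print(fknn_f1)

# Ratio of the reported per-inference times: FK-NN seconds / CNN seconds.
speed_ratio = 0.002331 / 0.000282
print(speed_ratio)
```

On the reported timings, the CNN is roughly eight times faster per inference than FK-NN.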
Sentiment and Emotion Classification Model Using Hybrid Textual and Numerical Features: A Case Study of Mental Health Counseling Ramayanti, Indri; Hermawan, Latius; Syakurah, Rizma Adlia; Stiawan, Deris; Meilinda, Meilinda; Negara, Edi Surya; Fahmi, Muhammad; Ghiffari, Ahmad; Rizqie, Muhammad Qurhanul
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.632

Abstract

Mental health issues among individuals, particularly in counseling contexts, require practical tools to understand and address emotional states. This study explores the application of machine learning models for emotion detection in mental health counseling conversations, focusing on four algorithms: Bernoulli Naive Bayes, Decision Tree, Logistic Regression, and Random Forest. The dataset, derived from transcribed counseling sessions, underwent preprocessing, including stemming, stopword removal, and TF-IDF vectorization to create structured inputs for classification. Emotional categories such as "Depresi" (Depression), "Kecewa" (Disappointed), "Senang" (Happy), "Bingung" (Confused) and "Stres" (Stress) were analyzed to evaluate model performance. Results indicated that Logistic Regression achieved the highest accuracy at 82%, showcasing its reliability and scalability, followed closely by Random Forest with 81%, demonstrating robustness in handling complex data structures. Bernoulli Naive Bayes performed competitively at 80%, excelling in computational efficiency, while Decision Tree recorded the lowest accuracy at 70%, reflecting its limitations in managing overlapping features and high-dimensional data. These findings highlight the potential of machine learning in addressing the increasing demand for scalable mental health support tools. The study underscores the importance of model selection, balanced datasets, and feature engineering to improve classification accuracy. Future work includes developing AI-driven chatbots for real-time emotion detection and integrating multimodal data to enhance interpretability. This research contributes to advancing automated solutions for mental health care, offering new pathways for timely and personalized interventions.
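The TF-IDF vectorization step mentioned in this abstract can be illustrated with a minimal pure-Python sketch. It uses the plain textbook weighting (term frequency times log of inverse document frequency), which may differ from the smoothed variants that libraries apply and from whatever the authors used. The three tokenized utterances are invented for illustration and are not from the study's dataset.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already-tokenized utterances (illustration only).
docs = [
    ["saya", "merasa", "stres"],
    ["saya", "merasa", "senang"],
    ["saya", "bingung"],
]
vectors = tfidf(docs)
print(vectors[0])
```

A term that occurs in every document, such as "saya" here, receives weight zero, while a term confined to one document receives the highest weight in it. This is what lets a downstream classifier focus on the emotionally distinctive words.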
Factors Affecting the Intention to Buy Electric Vehicles Through the Integration of Technology Acceptance Model and Prior Experience Saleh, Hendra Noor; Maupa, Haris; Cokki, Cokki; Sadat, Andi Muhammad
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.730

Abstract

To enhance the adoption of electric vehicles (EVs), governments have implemented regulatory policies, such as providing incentives. However, this approach is temporary and relies on the active involvement of manufacturers to better understand the driving factors behind EV adoption. While previous studies, largely based on behavioral theory, emphasize psychological and environmental factors, individual subjective factors also play a crucial role. This study introduces a novel approach by integrating variables from the Technology Acceptance Model (TAM)—perceived usefulness and perceived ease of use—with consumer experience variables, namely technology discomfort and customer experience. The goal is to improve TAM's explanatory power regarding the intention to buy EVs from the consumer perspective. The research targeted residents of Jabodetabek (Jakarta, Bogor, Depok, Tangerang, Bekasi) aged 17 and older, all of whom had prior experience with Battery Electric Vehicles (BEVs). Data was collected from 330 respondents through an online survey. Structural Equation Modeling (SEM) with AMOS was used for the analysis. The results indicated that perceived usefulness, perceived ease of use, and customer experience significantly influenced intention to buy, while perceived usefulness did not significantly affect customer experience. Customer experience mediated the relationship between perceived ease of use and intention to buy, but did not mediate the effect of perceived usefulness. Additionally, technology discomfort negatively impacted perceived usefulness and ease of use, although it did not significantly affect customer experience. These findings suggest that while government incentives remain important, a market-driven approach that focuses on improving consumer perceptions and experiences is critical for accelerating EV adoption.
A Study to Detect Multi-word Expression from Text Using Deep Learning Models Jun Meng, Wong; Yu Jie, Tan; Tong Ming, Lim
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.716

Abstract

Detecting Multi-word Expressions (MWEs) is a crucial task in Natural Language Processing (NLP) for applications in machine translation, sentiment analysis, and information retrieval. This study evaluates the performance of several deep learning models on MWE detection using two samples of different sizes drawn from the corpus of a major consumer electronics retailer. The samples are limited to 10,000 and 15,000 rows, with each row containing 15-20 English words. Preprocessing steps include removing special symbols and emojis, converting text to lowercase, and applying the spaCy NLP library for tokenization and part-of-speech (POS) tagging. Syntactic rules are then used to identify MWEs such as verb-noun combinations and phrasal verbs, with BIO tags (B-MWE, I-MWE, O) to mark MWE boundaries. We investigated transformer-based models such as BERT, BERT-CRF, LSTM-CRF, and RoBERTa-CRF using the sample of 10,000 rows; BERT, BERT-BiLSTM, BiLSTM-GloVe, and BiLSTM-GloVe-BiGRU were evaluated on the sample of 15,000 rows. Results demonstrated that the transformer-based model RoBERTa-CRF excels on the smaller sample, achieving the best performance by leveraging contextual embeddings and sequential dependency modeling. On the larger sample, the BERT-BiLSTM model emerged as the most effective, showcasing the advantage of combining dynamic embeddings with sequential learning. In contrast, models utilizing static embeddings, such as GloVe, displayed moderate performance, highlighting their limitations in capturing contextual nuances. Comparative analysis across both samples reveals that transformer-based models like RoBERTa-CRF performed best on the smaller dataset, whereas hybrid models that integrate sequential architectures, like BERT-BiLSTM, demonstrated superior performance as dataset size increased. These findings highlight the importance of model selection based on dataset scale to optimize MWE detection. 
This study underscores the importance of integrating contextual and sequential deep learning techniques to improve MWE detection and provides a basis for developing more robust and scalable systems for diverse linguistic tasks.
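The BIO labeling scheme mentioned in this abstract can be sketched in a few lines. The function below assumes the MWE spans have already been identified (in the study, by syntactic rules over spaCy output) and only shows how spans become B-MWE, I-MWE, and O tags. The function name, the span convention (start inclusive, end exclusive), and the example sentence are illustrative assumptions.

```python
def bio_tags(tokens, mwe_spans):
    """Tag tokens with B-MWE / I-MWE / O given (start, end) spans, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in mwe_spans:
        tags[start] = "B-MWE"            # first token of the expression
        for i in range(start + 1, end):
            tags[i] = "I-MWE"            # remaining tokens inside the expression
    return tags


# Hypothetical example: the phrasal verb "turn off" spans tokens 1 and 2.
tokens = ["please", "turn", "off", "the", "phone"]
print(bio_tags(tokens, [(1, 3)]))
# ['O', 'B-MWE', 'I-MWE', 'O', 'O']
```

A sequence labeler such as the CRF-topped models listed above is then trained to predict one such tag per token.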
Implementation of Naïve Bayes Gaussian Algorithm for Real-Time Classification of Broiler Cage Conditions Rosmasari, Rosmasari; Prafanto, Anton; Firdaus, Muhammad Bambang
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.694

Abstract

Monitoring large-scale broiler farms poses considerable challenges due to the variable nature of environmental conditions, which have a direct impact on poultry health and productivity. This study proposes a real-time classification system for broiler house conditions, utilizing the Naïve Bayes Gaussian algorithm in conjunction with Internet of Things (IoT) technology. The system has been developed to address the limitations of manual monitoring by automating the collection of temperature, humidity, and ammonia data through BME-680 and MICS-5524 sensors, which are strategically positioned 30 cm from the floor to optimize ammonia detection. Using a dataset of 865 records, divided into 75% for training (648 records) and 25% for testing (217 records), the model attained an accuracy of 82.03%, a precision of 75.67%, a recall of 82.67%, and an F1-score of 77.67%. A comparative analysis demonstrated advantages over alternative classification methods, with Decision Trees achieving 79.5% accuracy and Support Vector Machines reaching 80.8%. The innovation lies in the integration of automated condition classification into an IoT system, enabling rapid responses to environmental changes with processing times of approximately 500 milliseconds from sensing to classification. The system correctly classified 178 of the 217 test samples and misclassified the remaining 39. The strategic placement of sensors at a height of 30 cm optimizes ammonia detection while ensuring accurate temperature and humidity readings. The system provides historical data, enabling farms to analyze long-term environmental trends and thereby support data-driven decision-making strategies to enhance broiler welfare and operational efficiency. 
Usability testing with five poultry farm operators confirmed the dashboard's intuitive design, though recommendations for visual alerts for critical ammonia levels were suggested for future iterations.
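The split sizes and the headline accuracy in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the counts stated in the abstract (865 records, 217 test records, 178 correctly classified); the variable names are illustrative.

```python
total_records = 865
test_records = 217
train_records = total_records - test_records   # training portion of the split

correct = 178
misclassified = test_records - correct          # remaining test samples

accuracy = correct / test_records
print(train_records, misclassified)
print(f"{accuracy:.2%}")
```

The result agrees with the 82.03% accuracy reported for the Gaussian Naive Bayes model, and the training count matches the 648 records stated for the 75% portion.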