Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN: -     EISSN: 2723-6471     DOI: doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as for data descriptors publishing descriptions of a linked dataset.
Articles 55 Documents
Issue "Vol 6, No 3: September 2025": 55 Documents
The Influence of Logistics Technology Innovation on the Efficiency of Operations in Small and Medium-Sized Businesses in Thailand
Inmor, Sureerut; Rangsom, Kritiya; Šírová, Eva; Wongpun, Sukontip
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.684

Abstract

Logistics technology innovation includes technology for moving materials and products, such as robotics and automated logistics systems; technology for transmitting information, which enables real-time data exchange to optimize material movement; and technology that assists decision-making, such as artificial intelligence. These technologies draw on digital transformation, automation, and enhanced decision-making tools to increase the efficiency of supply chain operations. This study aimed to examine how environmental factors (legal regulations, market competition, and stakeholder involvement) influence the operational efficiency of small and medium-sized enterprises in Thailand, with logistics technology innovation serving as a mediating factor, and to propose strategic guidelines for improving business performance through innovation. Data were collected from 400 small and medium-sized businesses in the Eastern Special Development Zone, which comprises Chachoengsao, Chonburi, and Rayong provinces. A purposive sampling method was used to select enterprises in logistics-related industries, followed by convenience sampling for survey distribution. The analysis was carried out using structural equation modeling. The findings revealed that environmental variables have a considerable impact on operational efficiency, with logistics technology innovation serving as a mediating variable. The direct effect of environmental factors on innovation technology was strong (β = 0.73), while innovation technology had a significant positive effect on operational efficiency (β = 0.37). Product movement technologies, including robots and automated vehicles, had the greatest influence (β = 0.62), followed by digital data transmission technologies (β = 0.34) and decision support systems (β = 0.06). These results imply that small and medium-sized businesses should emphasize logistics automation, artificial intelligence-driven decision-making, and digital data sharing platforms to increase efficiency. This study offers important insights for corporate executives and politicians in creating a favorable climate.
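The mediation structure described above can be illustrated with a small calculation. The abstract reports the two path coefficients but not the indirect effect itself; the sketch below assumes the common product-of-coefficients approach, and the function and variable names are hypothetical.

```python
# Path coefficients as reported in the abstract.
beta_env_to_innovation = 0.73  # environmental factors -> logistics technology innovation
beta_innovation_to_eff = 0.37  # logistics technology innovation -> operational efficiency


def indirect_effect(a: float, b: float) -> float:
    """Product-of-coefficients estimate of a mediated (indirect) effect.

    This is an assumed estimator for illustration; the abstract does not
    state how (or whether) the indirect effect was computed.
    """
    return a * b


effect = indirect_effect(beta_env_to_innovation, beta_innovation_to_eff)
print(f"estimated indirect effect: {effect:.2f}")
```

Under this assumption, the effect of environmental factors on operational efficiency that passes through innovation is simply the product of the two reported paths.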
A New Data Preprocessing Framework to Enhance the Accuracy of Herbal Plants Classification Using Deep Learning
Kunlerd, Attapol; Ritthiron, Atipat; Nabumroong, Boonlueo; Luangmaneerote, Sakchan; Chaiwachirakhampon, Anyawee; Kaewyotha, Jakkrit
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.733

Abstract

This research addresses the problem of herbal plant classification, which plays a key role in Thai pharmacy and traditional medicine. Classification is limited by the similar physical characteristics of plants and by reliance on specialists, which hinders the use of herbal plants by the general public at the local level. To solve this problem, this research presents a new preprocessing framework called P4, which integrates seven techniques: Image Cropping, Resizing, Normalization (0–1), Data Augmentation, Label Noise, Label Cleaning, and Dataset Quality Score (DQS). The distinguishing feature of P4 is its combination of intentional mislabeling with a label cleaning process, together with quantitative data quality assessment and additional expert review, in order to filter out potentially inaccurate data before it is fed to the deep learning model. In the experiment, a dataset of 4,211 herbal images covering 30 herbal plant species was used, and P4 was compared with three techniques proposed in previous research (P1–P3) across five deep learning architectures: DenseNet201, EfficientNetB7, ViT, Swin Transformer, and ConvNeXt. The experimental results showed that P4 combined with the DenseNet201 model provided the highest performance in herbal plant classification, with an Accuracy of 92%, Precision of 92%, Recall of 91%, and a training time of only 22.92 minutes. This resulted from the higher-quality and more balanced data produced by P4. Combined with the structural capability of DenseNet201, which supports feature reuse from previous layers, it increased robustness to mislabeled data and accurately distinguished plants with similar characteristics. The results of this experiment can be applied as a guideline for future Thai traditional medicine support systems and herbal plant learning systems.
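Two of the preprocessing steps named above, Normalization (0–1) and Label Noise, can be sketched in a few lines. This is a minimal illustration only: the abstract does not describe how P4 implements these steps, so the function names, the 8-bit pixel assumption, and the noise rate below are all hypothetical.

```python
import random


def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0-255, an assumption) into the 0-1 range."""
    return [p / 255.0 for p in pixels]


def inject_label_noise(labels, classes, rate, seed=0):
    """Intentionally mislabel a fraction `rate` of the samples.

    Each selected sample receives a label drawn from the other classes,
    so exactly int(len(labels) * rate) labels end up changed.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = int(len(labels) * rate)
    for i in rng.sample(range(len(labels)), n_flip):
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy


# Hypothetical toy data: 20 samples across three placeholder classes.
clean_labels = ["a"] * 10 + ["b"] * 10
noisy_labels = inject_label_noise(clean_labels, ["a", "b", "c"], rate=0.2)
changed = sum(1 for x, y in zip(clean_labels, noisy_labels) if x != y)
scaled = normalize_pixels([0, 128, 255])
print(changed, scaled)
```

A label cleaning step would then try to recover the flipped labels, which gives a controlled way to check how well the cleaning works before it is trusted on real annotation errors.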
Modelling and Investigation of Solar Photovoltaic-Based Converter Configurations with Data Science Approach
S., Prakash; S., Lakshmi; S., Priya; Batumalay, Malathy
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.715

Abstract

Renewable energy sources, such as solar photovoltaic (PV) systems, typically produce low-voltage outputs, necessitating the use of high-gain direct current (DC) converters for efficient energy conversion. This study proposes a high-gain DC-DC converter for PV applications, designed with two MOSFET switches, two inductors, and two capacitors, offering a compact and efficient configuration. The converter achieves a high voltage gain of 6.8 and maintains a conversion efficiency of 97.7%, making it suitable for high-power applications. A data science-driven approach was employed to analyze the converter’s performance, integrating conventional simulation with machine learning techniques. Simulation results, conducted using MATLAB, confirmed the converter's superior performance, achieving an input ripple of 0.05% and an output ripple of 0.01%. Machine learning models, including Linear Regression, Decision Tree, Ridge Regression, and Support Vector Machine (SVM), provided deeper insights into the converter's behavior. Linear Regression accurately predicted output voltage, Ridge Regression minimized overfitting, and the Decision Tree model identified Duty Ratio and Input Voltage as the most critical factors affecting efficiency. SVM effectively classified operating conditions into high, moderate, and low efficiency. The Zero-Voltage Switching (ZVS) technique minimized switching losses, enhancing overall efficiency. This study demonstrates that integrating data science techniques with conventional analysis enhances the understanding and optimization of high-gain converters. The proposed converter provides a scalable and efficient solution for PV applications, offering insights for further optimization as part of process innovation.
Cross-Biome Biodiversity Assessment and Anomaly Detection Using AI-Enhanced Acoustic Monitoring
Radif, Mustafa; Fadhil, Shumoos Aziz; Alrammahi, Atheer Hadi
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.741

Abstract

This study proposes a novel AI-powered eco-monitoring framework that integrates acoustic ecology, deep learning, and low-cost IoT devices to enable scalable, real-time biodiversity assessment and ecological anomaly detection across diverse environments. The primary objective is to automate species classification and environmental monitoring using passive audio data captured by solar-powered IoT sensors, thereby reducing reliance on manual ecological surveys. The framework comprises four modules: acoustic data acquisition, dual-representation preprocessing using the Short-Time Fourier Transform (STFT) and Mel-Frequency Cepstral Coefficients (MFCCs), species classification using CNN and CNN-LSTM models, and anomaly detection via autoencoders and one-class SVM. Field validation and multi-dataset testing were conducted across 250+ species from temperate forests, wetlands, and urban areas. The CNN-LSTM model achieved the highest performance with 93.7% accuracy, 93.0% precision, and a 92.5% F1-score, while anomaly detection reached 89.7% precision with an AUC of 0.94, effectively identifying irregularities such as invasive calls, mechanical noise, and species absence. A forest case study demonstrated the system’s ability to detect circadian acoustic patterns (e.g., dawn chorus of sparrows, nocturnal owl calls) and real-world disturbances with 91% expert validation agreement. The novelty of this work lies in its hybrid AI architecture with real-time unsupervised anomaly detection, cross-biome generalization capability, and deployment readiness on low-power edge devices like Raspberry Pi and Jetson Nano. Inference times as low as 18 ms per sample and bandwidth usage under 3 MB/hour make it feasible for continuous, remote deployment. The framework offers a robust and adaptable solution for conservation efforts, environmental policy, and climate resilience initiatives. Future directions include integrating multimodal data sources and transformer-based continual learning for broader ecological impact. These findings position the system as a scalable and intelligent tool for next-generation, AI-driven environmental monitoring.
Dispute on Security Framework Model of MFCC Mixed Methods in Speech Recognition System
Pratiwi, Heni Ispur; Kartowisastro, Iman Herwidiana; Soewito, Benfano; Budiharto, Widodo
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.689

Abstract

Authorized users of an audio recording system can behave in unanticipated ways that make the system vulnerable. The system then enters a fuzzy state in which its sensitivity to unauthorized access deteriorates, and the system may become biased. One case is when separate users choose similar spoken keywords as their usernames or passwords; another is when the users are siblings or twins with nearly identical voices. Such situations make the security system less sensitive, and in some cases the system may even block authorized users from gaining access. This paper proposes resolving the situation by combining the Mel Frequency Cepstral Coefficient (MFCC) with other methods that have also been applied to many other research objectives. It shows that combining MFCC with interval scoring in a Fuzzy Relation helps resolve the fuzzy security issues described above. The Mel scale extracts features from audio input data; these features, called spectral centroids, characterize an individual's voice. Some spectral centroids are nearly identical because of the similarities mentioned. The Fuzzy Relation method meets the needs of the verification procedure through its interval scale over any fuzzy features. The objective of the verification procedure is to obtain consistently measured scales so that the security guarantee remains valid. The in-house experiments yielded an interval of [0.49, 1.18] for user A, [0.76, 1.07] for user B, and [0.44, 0.95] for user C, and these intervals are proposed as bounds on the login attempts accepted for each user's account.
HOG feature extraction in optimizing FK-NN and CNN for image identification of rice plant diseases
Gama, Adie Wahyudi Oktavia; Gunawan, Putu Vina Junia Antarista; Darmaastawan, Kadek
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.722

Abstract

This study compares the performance of FK-NN and CNN models in identifying rice diseases from digital images, focusing on both effectiveness and efficiency. Additionally, this research utilizes HOG for feature extraction from the digital images. The stages include data collection, preprocessing, transformation, classification, and model evaluation. The results show that the FK-NN model achieves a higher accuracy of 86.26%, compared to the CNN model's accuracy of 71.37%. Furthermore, the precision value of the FK-NN model is also higher at 86.88%, compared to the CNN model’s precision of 72.74%. Similarly, the recall value for the FK-NN model is higher at 86.88%, compared to the CNN model’s 71.37%. The F1-score of the FK-NN model is likewise superior, with a value of 86.88%, compared to the CNN model’s F1-score of 71.37%. These findings suggest that the FK-NN model with HOG feature extraction is more effective. However, in terms of inference time, the CNN model is faster, taking 0.000282 seconds compared to FK-NN’s 0.002331 seconds. In conclusion, the FK-NN model with HOG feature extraction excels in identifying rice diseases, while the CNN model offers faster inference time in this study.
Sentiment and Emotion Classification Model Using Hybrid Textual and Numerical Features: A Case Study of Mental Health Counseling
Ramayanti, Indri; Hermawan, Latius; Syakurah, Rizma Adlia; Stiawan, Deris; Meilinda, Meilinda; Negara, Edi Surya; Fahmi, Muhammad; Ghiffari, Ahmad; Rizqie, Muhammad Qurhanul
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.632

Abstract

Mental health issues among individuals, particularly in counseling contexts, require practical tools to understand and address emotional states. This study explores the application of machine learning models for emotion detection in mental health counseling conversations, focusing on four algorithms: Bernoulli Naive Bayes, Decision Tree, Logistic Regression, and Random Forest. The dataset, derived from transcribed counseling sessions, underwent preprocessing, including stemming, stopword removal, and TF-IDF vectorization to create structured inputs for classification. Emotional categories such as "Depresi" (Depression), "Kecewa" (Disappointed), "Senang" (Happy), "Bingung" (Confused) and "Stres" (Stress) were analyzed to evaluate model performance. Results indicated that Logistic Regression achieved the highest accuracy at 82%, showcasing its reliability and scalability, followed closely by Random Forest with 81%, demonstrating robustness in handling complex data structures. Bernoulli Naive Bayes performed competitively at 80%, excelling in computational efficiency, while Decision Tree recorded the lowest accuracy at 70%, reflecting its limitations in managing overlapping features and high-dimensional data. These findings highlight the potential of machine learning in addressing the increasing demand for scalable mental health support tools. The study underscores the importance of model selection, balanced datasets, and feature engineering to improve classification accuracy. Future work includes developing AI-driven chatbots for real-time emotion detection and integrating multimodal data to enhance interpretability. This research contributes to advancing automated solutions for mental health care, offering new pathways for timely and personalized interventions.
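The TF-IDF vectorization step mentioned above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: it uses the plain textbook weighting (term frequency times log(N / document frequency)), which may differ from the variant the authors used, and the tokenized sentences are made-up placeholders rather than data from the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    Library implementations often add smoothing; none is applied here.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-preprocessed utterances.
docs = [["i", "feel", "sad"], ["i", "feel", "happy"]]
weights = tf_idf(docs)
print(weights[0])
```

Terms that occur in every document ("i", "feel") receive a weight of zero, while the emotion-bearing terms keep a positive weight, which is why the representation suits classifiers such as Logistic Regression.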
A Study to Detect Multi-word Expression from Text Using Deep Learning Models
Jun Meng, Wong; Yu Jie, Tan; Tong Ming, Lim
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.716

Abstract

Detecting Multi-word Expressions (MWEs) is a crucial task in Natural Language Processing (NLP) for applications in machine translation, sentiment analysis, and information retrieval. This study evaluates the performance of several deep learning models on MWE detection using two samples of different sizes from the corpus of a major consumer electronics retailer. The samples are limited to 10,000 and 15,000 rows, with each row containing 15-20 English words. Preprocessing steps include removing special symbols and emojis, converting text to lowercase, and applying the spaCy NLP library for tokenization and part-of-speech (POS) tagging. Syntactic rules are then used to identify MWEs such as verb-noun combinations and phrasal verbs, with BIO tags (B-MWE, I-MWE, O) marking MWE boundaries. We investigated models such as BERT, BERT-CRF, LSTM-CRF and RoBERTa-CRF using the sample of 10,000 rows; BERT, BERT-BiLSTM, BiLSTM-GloVe, and BiLSTM-GloVe-BiGRU used the sample of 15,000 rows. Results demonstrated that the transformer-based model RoBERTa-CRF excels on the smaller sample, achieving the best performance by leveraging contextual embeddings and sequential dependency modeling. On the larger sample, the BERT-BiLSTM model emerged as the most effective, showcasing the advantage of combining dynamic embeddings with sequential learning. In contrast, models utilizing static embeddings, such as GloVe, displayed moderate performance, highlighting their limitations in capturing contextual nuances. Comparative analysis across both samples reveals that transformer-based models like RoBERTa-CRF performed best on the smaller dataset, whereas hybrid models that integrate sequential architectures, like BERT-BiLSTM, demonstrated superior performance as dataset size increased. These findings highlight the importance of model selection based on dataset scale to optimize MWE detection. This study underscores the importance of integrating contextual and sequential deep learning techniques to improve MWE detection and provides a basis for developing more robust and scalable systems for diverse linguistic tasks.
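The BIO tagging scheme described in the abstract (B-MWE, I-MWE, O) can be illustrated as follows. This is a minimal sketch: the helper name, the span format, and the example sentence are assumptions for illustration, and in the study the spans come from spaCy-based syntactic rules rather than being supplied by hand.

```python
def bio_tags(tokens, mwe_spans):
    """Assign B-MWE / I-MWE / O tags to tokens.

    `mwe_spans` is a list of (start, end) token indices, end exclusive,
    each marking one multi-word expression.
    """
    tags = ["O"] * len(tokens)
    for start, end in mwe_spans:
        tags[start] = "B-MWE"          # first token of the expression
        for i in range(start + 1, end):
            tags[i] = "I-MWE"          # remaining tokens inside the expression
    return tags


# Hypothetical example: the phrasal verb "turn off" spans tokens 1-2.
tokens = ["please", "turn", "off", "the", "phone"]
tags = bio_tags(tokens, [(1, 3)])
print(list(zip(tokens, tags)))
```

Sequence models such as BERT-CRF or BERT-BiLSTM are then trained to predict one of these three tags per token, which turns MWE detection into a standard token classification task.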
Implementation of Naïve Bayes Gaussian Algorithm for Real-Time Classification of Broiler Cage Conditions
Rosmasari, Rosmasari; Prafanto, Anton; Firdaus, Muhammad Bambang
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.694

Abstract

Monitoring large-scale broiler farms poses considerable challenges due to the variable nature of environmental conditions, which have a direct impact on poultry health and productivity. This study proposes a real-time classification system for broiler house conditions that combines the Naïve Bayes Gaussian algorithm with Internet of Things (IoT) technology. The system addresses the limitations of manual monitoring by automating the collection of temperature, humidity, and ammonia data through BME-680 and MICS-5524 sensors, which are positioned 30 cm from the floor to optimize ammonia detection while ensuring accurate temperature and humidity readings. Using a dataset of 865 records, divided into 75% for training (648 records) and 25% for testing (217 records), the model attained an accuracy of 82.03%, a precision of 75.67%, a recall of 82.67%, and an F1-score of 77.67%. A comparative analysis showed an advantage over alternative classification methods, with Decision Trees achieving 79.5% accuracy and Support Vector Machines reaching 80.8%. The innovation lies in the integration of automated condition classification into an IoT system, enabling rapid responses to environmental changes with processing times of approximately 500 milliseconds from sensing to classification. The system correctly classified 178 of the 217 test samples and misclassified 39. The system also provides historical data, enabling farms to analyze long-term environmental trends and thereby support data-driven decision-making strategies to enhance broiler welfare and operational efficiency. Usability testing with five poultry farm operators confirmed the dashboard's intuitive design, though the operators recommended adding visual alerts for critical ammonia levels in future iterations.
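The figures reported in the abstract are mutually consistent, which a short calculation can confirm. The sketch below uses only the counts stated above; the variable names are illustrative.

```python
# Counts as reported in the abstract.
total_records = 865
train_records = 648
test_records = 217
correct = 178
misclassified = 39

# The split and the test outcomes should each add up.
assert train_records + test_records == total_records
assert correct + misclassified == test_records

accuracy = correct / test_records
train_share = train_records / total_records
print(f"accuracy = {accuracy:.2%}, train share = {train_share:.1%}")
# prints: accuracy = 82.03%, train share = 74.9%
```

The 178 correct predictions out of 217 test samples reproduce the reported 82.03% accuracy, and 648 of 865 records is the nearest whole-record split to the stated 75%.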
Development of a Self-Identity Construction Model for Private Vocational College Students Using Data Science Techniques
Chen, Mei; Sangsawang, Thosporn
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.731

Abstract

This study aimed to synthesize theories of self-identity learning to develop a self-identity development model for private vocational college students in Yunnan Province, China, identify key influencing factors, and evaluate the model's effectiveness. Using purposive sampling, the study involved 17 experts and 1,004 first-year students. Data were collected through a semi-structured questionnaire via Delphi Technique, supported by consultations via email, WeChat video, and in-person interviews. The model’s validity was assessed based on satisfaction levels from students, teachers, and stakeholders. Statistical analyses included weight calculations, means, standard deviations, coefficients of variation, and path analysis. The results showed strong expert consensus, with an average score of M = 4.5008 and CV = 0.1181, forming a model of 27 first-level and 21 second-level indicators. The "career development expectation evaluation" held the highest weight at 26.86% in the initial assessment, while "dynamic feedback loop development" recorded the highest importance at 0.442 in the practical development phase. Practical testing demonstrated significant effectiveness, with satisfaction means ranging from M = 4.059 to 4.341. Regression analysis confirmed significant mutual influences among the model's five modules. Overall, the model effectively addresses the urgent need for personalized development strategies for private vocational college students in Yunnan Province.