Contact Name
Husni Teja Sukmana
Contact Email
husni@bright-journal.org
Phone
+62895422720524
Journal Mail Official
jads@bright-journal.org
Editorial Address
Gedung FST UIN Jakarta, Jl. Lkr. Kampus UIN, Cemp. Putih, Kec. Ciputat Tim., Kota Tangerang Selatan, Banten 15412
Location
Kota Adm. Jakarta Pusat,
DKI Jakarta
INDONESIA
Journal of Applied Data Sciences
Published by Bright Publisher
ISSN : -     EISSN : 27236471     DOI : doi.org/10.47738/jads
One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes applied to collect, treat and analyze data will help to render scientific research results reproducible and thus more accountable. The datasets themselves should also be accessible to other researchers, so that research publications, dataset descriptions, and the actual datasets can be linked. The journal Data provides a forum to publish methodical papers on processes applied to data collection, treatment and analysis, as well as for data descriptors publishing descriptions of a linked dataset.
Articles 588 Documents
Applying Transfer Learning on Various GNN Model Training in Indoor Positioning System Tasks Kevin Wijaya; Hanif Muhammad Sangga Buana; Gede Putra Kusuma
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1150

Abstract

Determining location and orientation has always been a fundamental challenge, driving advances from maps and compasses to modern global navigation satellite systems (GNSS). However, GNSS performs poorly indoors due to signal attenuation and lack of elevation accuracy, necessitating the development of indoor positioning systems (IPS). Various technologies such as Wi-Fi, Bluetooth Low Energy (BLE), and RFID have been deployed, typically relying on received signal strength (RSS) and fingerprinting to improve accuracy. While previous research focused on training a single model for an entire building, this study explores the creation of floor-specific models by applying transfer learning to various GNN models. This is done to address the substantial signal distortion between floors. Using the UTSIndoorLoc dataset, we evaluate Graph Attention Network (GAT), GraphSAGE, and Graph Convolutional Network (GraphConv) for predicting two-dimensional indoor positions based on RSSI fingerprints. We propose two transfer learning training methods, Schema A and Schema B. Schema A trains the base model iteratively through each floor, while Schema B trains the base model on a unified dataset. Schema B with GraphConv achieved the best results, with a mean positioning error of 6.2176 meters, whilst Schema A achieved a best-case mean positioning error of 6.3900 meters. Both outperform the standard unified model, which has a mean positioning error of 8.0808 meters.
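The abstract reports results as a mean positioning error. As a minimal sketch, assuming this metric is the average Euclidean distance between predicted and true two-dimensional positions (the abstract does not spell out its definition), it could be computed as below. The function name and the sample coordinates are illustrative assumptions, not taken from the paper; only the 8.0808 m and 6.2176 m figures come from the abstract.

```python
import math


def mean_positioning_error(predicted, actual):
    """Average Euclidean distance between predicted and true 2-D positions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    distances = [math.dist(p, a) for p, a in zip(predicted, actual)]
    return sum(distances) / len(distances)


# Made-up coordinates in meters, for illustration only.
predicted = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
actual = [(3.0, 4.0), (3.0, 4.0), (10.0, 16.0)]
error = mean_positioning_error(predicted, actual)

# Relative reduction implied by the reported figures:
# unified model 8.0808 m versus Schema B with GraphConv 6.2176 m.
relative_reduction = (8.0808 - 6.2176) / 8.0808
```

With the reported figures, the relative reduction works out to roughly 23% of the unified model's error.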
Development of Color Segmentation and Texture Analysis Algorithms for Early Detection of Green Vegetable Deterioration in Retail Environments Dinul Akhiyar; Iskandar Fitri; Gunadi Widi Nurcahyo
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1094

Abstract

Vegetable deterioration in retail environments is often accelerated by improper storage conditions, leading to quality degradation, economic losses, and reduced consumer trust. Early detection of deterioration is therefore essential to enable timely preventive actions before visible spoilage becomes severe. This study proposes an integrated image-based framework for early detection of spinach leaf deterioration by combining K-Means++ for robust color segmentation, Gray Level Co-occurrence Matrix (GLCM) for texture feature extraction, and Convolutional Neural Network (CNN) for classification. K-Means++ improves segmentation stability through optimized centroid initialization, GLCM captures subtle texture variations associated with early spoilage, and CNN enables accurate classification by learning complex visual patterns from segmented images. The dataset consists of 642 spinach leaf images captured under controlled lighting for initial calibration and under varying lighting conditions to simulate real-world retail environments. Experimental results show that the standard K-Means algorithm achieved an average classification accuracy of 77%, while the proposed K-Means++ segmentation improved accuracy to 81.86%. Furthermore, CNN-based validation achieved the highest classification accuracy of 94.82%, demonstrating strong generalization capability. The novelty of this work lies in the optimized integration of K-Means++ segmentation under lighting variability, selective GLCM feature utilization validated through ablation analysis, and end-to-end CNN-based validation with real-time deployment feasibility. The proposed framework offers a practical, scalable, and non-destructive solution for automated freshness monitoring in retail environments and can be extended to other leafy vegetables.
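To illustrate the texture step, here is a minimal pure-Python sketch of a Gray Level Co-occurrence Matrix and one feature commonly derived from it, contrast. The pixel offset, the number of gray levels, and the choice of contrast as the feature are assumptions for illustration; the abstract does not state which GLCM settings or features the authors used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Weighted sum of squared gray-level differences, normalized by pair count."""
    total = sum(sum(row) for row in matrix)
    return sum(
        ((i - j) ** 2) * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# Tiny made-up 4x4 patch with gray levels 0..3, for illustration only.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(patch, levels=4)
texture_contrast = contrast(counts)
```

A smooth region concentrates counts on the diagonal and yields low contrast, while abrupt gray-level changes push counts off the diagonal and raise it.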
Improved Hybrid GoogLeNet-Based Deep Learning Optimization for Standardized Straw Mushroom Quality Classification in Indonesia Bayu Priyatna; Titik Khawa Abdurahman; Muhammad Fahmi Miskon; April Lia Hananto; Agustia Tia Hananto; Aviv Yuniar Rahman
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1206

Abstract

Deep learning plays a crucial role in modern computer vision due to its ability to automatically extract hierarchical features from large-scale image data. Among various architectures, Convolutional Neural Networks (CNNs) have been extensively utilized for image pattern interpretation, including in agricultural product inspection. Straw mushrooms (Volvariella volvacea) are important agro-industrial commodities in Indonesia; however, their quality assessment still relies on subjective manual evaluation based on the Indonesian National Standard (SNI:01-6945-2003), leading to inconsistency in grading results. To address this limitation, this research proposes an Improved Hybrid GoogLeNet model integrated with a YOLO-based detection framework and hybrid preprocessing to enhance feature clarity and classification robustness. The system is capable of conducting object detection, 3-class morphological quality classification (Pure White, Oval, and Black Spot/Defect), and automatic diameter measurement using calibrated pixel-to-centimeter conversion. Performance evaluation is carried out by benchmarking the proposed model against several popular deep learning architectures including YOLOv5, LeNet, AlexNet, VGGNet, and ResNet. Experimental results demonstrate that the Improved Hybrid GoogLeNet achieves the highest performance with precision of 97.99%, recall of 96.07%, and F1-score of 96.98%, along with low misclassification rates across all classes. These results indicate that the proposed method provides accurate, reliable, and efficient quality assessment that supports standardized automated grading in industrial applications. Therefore, this study contributes to the advancement of intelligent computer vision solutions for digital transformation in the Indonesian mushroom agro-industry.
A Hybrid Method for Low-Resource Named Entity Recognition Do Minh Duc; Quan Xuan Truong; Viet Tran Hong; Le Hoang Anh; Mac Thi Minh Tra; Nguyen Van Thuy; Le Hai Ha; Vinh Nguyen Van
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1161

Abstract

Named Entity Recognition (NER) is a critical component of Natural Language Processing with diverse applications in information extraction and conversational AI. However, NER in specific domains for low-resource languages faces challenges such as limited annotated data and heterogeneous label sets. This study addresses these issues by proposing a hybrid neurosymbolic framework that integrates rule-based processing with deep learning models for Vietnamese NER. The core idea involves a two-stage pipeline: first, a rule-based component reduces label complexity by grouping relational and special categories; second, pre-trained language models are fine-tuned for high-precision extraction. A post-processing module is then utilized to restore fine-grained labels, preserving expressiveness for application-level usability. To mitigate data scarcity, a scalable data augmentation strategy leveraging Large Language Models (LLMs) is introduced to expand the label set without full re-annotation—a significant novelty of this work. The effectiveness of this method was evaluated across five specific-domain datasets, including logistics, wildlife, and healthcare. Experimental results demonstrate substantial improvements over strong RoBERTa-based baselines. Specifically, the proposed system achieved F1 scores of 90% in Customer Service (up from 83%), 84% in GAM (up from 73%), 83% in AI Fluent (up from 80%), 94% in PhoNER_Covid19 (up from 91%), and 60% in Rare Wildlife (up from 36%). These findings confirm that the hybrid approach effectively captures the linguistic complexity of Vietnamese and contextual nuances in specialized domains, offering a robust contribution to low-resource NER research.
A Hybrid YOLO–CNN Model for Automatic Detection and Severity Assessment of Atopic Dermatitis in Infant Images Debi Setiawan; Ramalia Noratama Putri; Sara Herlina; Achmad Nizar Hidayanto; Yuda Irawan; Naohiro Hohashi
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1212

Abstract

Atopic dermatitis is one of the most common skin diseases affecting infants and children worldwide and has a particularly high prevalence in tropical countries. Traditional diagnosis methods, which still rely on physical examinations and laboratory tests, often face challenges such as delays, high costs, and limited facilities, thereby necessitating an artificial intelligence–based system that is more efficient and accurate. This study aims to develop a hybrid YOLO–CNN model for the automatic detection and severity classification of atopic dermatitis in infants. The dataset comprises 2,000 infant skin images, including lesions categorized as mild, moderate, and severe, obtained from an online repository and field observations conducted in three villages. The labeling process was performed by a specialist doctor to ensure clinical validity. In the first step, YOLO was used to detect the lesion area in real time by generating a bounding box. This produced a region of interest (ROI), which was subsequently analyzed by a CNN model employing transfer learning in the second step to determine the severity level. Experimental results indicate that YOLO achieved high detection performance, with an mAP@0.5 of 91.2% and an F1-score of 90.2%, while the CNN model attained an average accuracy of 85% and a macro-F1 score of 85% in classification. The visualization of predictions indicates that most lesions were detected with confidence levels ≥0.9, confirming the model’s consistency. These findings highlight the potential of the hybrid YOLO–CNN framework as a supportive system for digital clinical diagnosis, applicable to both mobile applications and teledermatology services, particularly in regions with limited medical personnel. Future research should employ larger, multi-center datasets and integrate explainable AI approaches to promote broader clinical adoption.
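The "0.5" in mAP@0.5 refers to an intersection-over-union (IoU) threshold: a predicted bounding box counts as a correct detection only if its IoU with a ground-truth box is at least 0.5. The sketch below shows that overlap test for a single pair of boxes. The (x1, y1, x2, y2) box format and the sample coordinates are assumptions for illustration, not details from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes get an empty intersection.
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Made-up pixel coordinates, for illustration only.
predicted_box = (10, 10, 50, 50)
true_box = (20, 20, 60, 60)
score = iou(predicted_box, true_box)
is_match = score >= 0.5  # threshold used by mAP@0.5
```

In this made-up case the boxes overlap by less than half of their union, so the prediction would not count as a match at the 0.5 threshold.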
Pattern Recognition of Puta Dino Fabric Using Web-Based Convolutional Neural Network Method Luther Alexander Latumakulita; Silviani Esther Rumagit; Hence Beedwel Lumentut; Frangky Jessy Paat; Jaidun Ramadhan Kaplale; Enny Itje Sela
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1103

Abstract

This study aims to develop an intelligent system capable of recognizing traditional woven motifs of Puta Dino, a culturally significant textile from Tidore Island. These motifs are visually complex, poorly documented, and hard for the public to distinguish, highlighting the need for a digital tool to support cultural preservation and accurate identification. This research is the first to build a structured Puta Dino motif database and provide an integrated model designed for real-world use. The approach captured primary images of eight validated motifs and applied systematic preprocessing, including normalization and data augmentation, to enhance variability and strengthen the dataset. A lightweight deep learning model based on a convolutional neural network was designed to balance accuracy and computational efficiency. The system was evaluated through cross-validation and independent test data, as well as multiple real-world trials using a web interface. These trials involved different image capture scenarios, including long-distance, moderate-distance, close, and angled views, and cases where the fabric surface was folded. The model architecture and the system interface are illustrated in the relevant figures, and the tables report the system's training performance, motif classification accuracy, and results under real-world conditions. The system demonstrated excellent classification accuracy in controlled test conditions. It also performed competently in real-world use, accurately classifying most motifs under various conditions. The data also point to specific issues with motif recognition in extreme distortion cases, which reflect the typical issues of laboratory-to-field model deployment. The outcomes clearly demonstrate both the possibilities and the limitations of current recognition methods for culturally significant textiles.
The study concludes by exploring the possibilities of expanding the dataset and increasing the depth of learning through more sophisticated techniques, as well as enhancing accessibility to promote sustained community and cultural engagement.
Comparing Pre-Norm and Post-Norm Transformers in Preserving Gender Information for Indonesian–English Translation through Attention-Based Signal Reinforcement Andik Wijanarko; Rinaldi Munir; Masayu Leylia Khodra; Dessi Puji Lestari
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1257

Abstract

Gender realization in Indonesian–English machine translation remains challenging due to the absence of grammatical gender in Indonesian, which often leads to unstable or ambiguous gender representations in English outputs. While Transformer-based models have demonstrated strong general translation performance, their ability to preserve gender information across encoding layers remains inconsistent and poorly understood, particularly with respect to architectural normalization strategies. This study presents a comparative analysis of Pre-Norm and Post-Norm Transformer architectures in preserving gender information, and examines the role of attention-based signal reinforcement in mitigating representational degradation. The reinforcement mechanism is introduced prior to standard encoder processing to strengthen gender-relevant token interactions without modifying the overall model structure. Four controlled configurations (Post-Norm, Pre-Norm, Post-Norm with attention-based reinforcement, and Pre-Norm with attention-based reinforcement) are trained under identical random seeds on both unbalanced and balanced datasets. Evaluation is performed on gender-ambiguous test sentences without explicit gender annotations to assess generalization. Gender preservation is assessed at the output level using gender-specific accuracy and BLEU score, and at the representation level using cosine similarity between gender cue embeddings and English gendered pronouns. The results show that Post-Norm Transformers fail to maintain stable gender representations, yielding near-random gender accuracy (~50%) and negligible BLEU scores. Pre-Norm architectures improve training stability but achieve limited gender accuracy (around 30%).
Incorporating attention-based signal reinforcement substantially enhances gender preservation, with accuracy rising to over 50% and reaching up to 56% under balanced training conditions, accompanied by a consistent increase in cosine similarity values (exceeding 0.35) between gender cues and corresponding pronouns. These findings indicate that normalization strategy and attention-based reinforcement jointly determine the stability of gender representations in Transformer-based machine translation.
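The representation-level measure named in the abstract is cosine similarity between a gender cue embedding and an English pronoun embedding. A minimal sketch of that computation follows. The three-dimensional vectors are made up for illustration, and how the authors extract the embeddings from the model is not described in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, for illustration only.
gender_cue_embedding = [1.0, 2.0, 2.0]
pronoun_embedding = [2.0, 1.0, 2.0]
similarity = cosine_similarity(gender_cue_embedding, pronoun_embedding)
```

Values near 1 indicate that the two embeddings point in nearly the same direction, and values near 0 indicate little alignment. The abstract's threshold of 0.35 would be read on this scale.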
Determinants of Student Behavior to Use Financial Technology (Fintech) Banking Services - Integrated Theory of System Acceptance and Psychological Behavioral Theory Fachrurrozie Fachrurrozie; Indah Anisykurlillah; Hasan Mukhibad; Kuat Waluyo Jati; Ahmad Nurkhin
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1138

Abstract

Fintech provides technology-based banking and financial services; therefore, the analysis of Fintech usage behavior should be viewed in the context of system acceptance and psychological behavioral theory. We employ the System Acceptance Theory approach, specifically the Theory of Acceptance Model, and the psychological behavioral theory, the Theory of Reasoned Action, to explain behavioral intention to use Fintech and incorporate risk factors. This study aims to prove the influence of perceived ease of use, perceived usefulness, subjective norms, attitude, and perceived risk on the intensity of Generation Z's intention to use Fintech. Moreover, this research demonstrates the influence of intention to use Fintech on Fintech usage behavior. This research employed a survey approach with 350 students in Indonesia, who are part of Generation Z, and analyzed the data using Structural Equation Modeling with Partial Least Squares. We report that perceived ease of use and perceived usefulness are vital factors in increasing the intention to use Fintech. Attitude is a factor that encourages students to use Fintech, and conversely, perceived risk is a vital factor in decreasing intention to use Fintech. We were unable to find evidence of a relationship between subjective norms and intention to use Fintech. Ultimately, behavioral intention to use Fintech is crucial for increasing student adoption of Fintech. This study recommends that financial institutions offer Fintech services that enhance usability and convenience and mitigate the risks associated with Fintech use.
Adaptive Integration of Optuna Optimization and Stacking Ensemble Learning for Automated Work Competency Classification Mutiana Pratiwi; Sarjon Defit; Muhammad Tajuddin
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1228

Abstract

Artificial intelligence and machine learning are increasingly used to automate analytical and decision processes, including the evaluation of human competencies. However, traditional models often face challenges in accuracy and generalization when applied to linguistic data from interviews. This study aims to develop a model that integrates Optuna optimization and stacking ensemble learning to enhance the accuracy and interpretability of competency classification. Interview transcript data were processed using natural language processing techniques such as cleaning, tokenization, case folding, stopword removal, and stemming to ensure textual consistency. The text was then transformed into numerical representations using term frequency inverse document frequency weighting. To handle class imbalance, the synthetic minority oversampling technique was employed. Optuna was applied to optimize the hyperparameters of base models, including support vector classifier, Naïve Bayes, random forest, gradient boosting, and XGBoost. These optimized models were combined through a stacking ensemble to form the final classifier. The proposed model achieved an accuracy of 94 percent and a precision of 95 percent with macro and weighted F1 scores of 0.94. The results demonstrate stable and balanced performance across all competency categories, including analytical thinking, initiating action, problem solving, and work standards. Comparative analysis with previous studies in sentiment analysis, medical diagnosis, and financial forecasting confirmed that the integration of Optuna and stacking produces more robust and generalizable outcomes. The integration of Optuna optimization and stacking ensemble learning effectively improves classification performance while maintaining interpretability. The model demonstrates strong potential for automated competency evaluation in recruitment and human resource analytics. 
This framework can be extended to other linguistic datasets to support transparent and data-driven decision-making in artificial intelligence applications.
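The abstract's text-to-numbers step, term frequency-inverse document frequency weighting, can be sketched in a few lines. This sketch assumes one common variant (relative term frequency multiplied by the natural log of the document count over the document frequency). The authors' exact formula and tooling are not given in the abstract, and the tokenized sample documents are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up, already preprocessed interview snippets, for illustration only.
docs = [
    ["solves", "problems", "quickly"],
    ["takes", "initiative", "quickly"],
    ["solves", "problems", "carefully"],
]
weights = tf_idf(docs)
```

Terms that occur in only one document, such as "takes" here, receive the highest weights, while a term present in every document would receive a weight of zero under this variant.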
An Integrated Text Analytics and Ensemble Machine Learning Framework for Fake Review Detection in Online Marketplaces Eka Praja Wiyata Mandala; Sarjon Defit; Gunadi Widi Nurcahyo
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1143

Abstract

The increasing prevalence of fake reviews on e-commerce platforms undermines consumer trust and affects purchasing decisions, particularly for local products with limited visibility, such as those from West Sumatra, Indonesia. This study proposes a hybrid approach combining text analytics and machine learning to enhance the detection of fake reviews. Four classification models (Naive Bayes, Random Forest, Logistic Regression, and K-Nearest Neighbor) were tested on a dataset of 1,500 labeled product reviews. Among these models, Random Forest had the highest initial accuracy of 0.8533. To improve on it, we created an enhanced algorithm called EKAHypeRFor (Enhanced Knowledge Augmentation of Hyperparameter Random Forest). This method uses simple feature engineering and careful hyperparameter tuning with RandomizedSearchCV. The enhanced model reached an accuracy of 0.8778, which is 2.45 percentage points higher than the original. It also includes a real-time review sorting tool, making it easy to use on online shopping sites. Tests using a confusion matrix and feature importance showed that the model works well and is easy to understand. This method is simple, fast, and accurate, helping to make online product reviews more trustworthy for small and medium businesses in the area.
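The reported improvement can be read in two ways, and a short calculation makes the difference explicit. Only the two accuracy values come from the abstract; the variable names are illustrative.

```python
baseline_accuracy = 0.8533  # Random Forest before tuning, from the abstract
enhanced_accuracy = 0.8778  # EKAHypeRFor, from the abstract

# Absolute gain, expressed in percentage points.
gain_points = (enhanced_accuracy - baseline_accuracy) * 100

# Relative gain, expressed as a percentage of the baseline accuracy.
relative_gain_percent = (enhanced_accuracy - baseline_accuracy) / baseline_accuracy * 100
```

The absolute gain is 2.45 percentage points, which matches the figure quoted in the abstract, while the gain relative to the baseline is closer to 2.9%.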