Journal : JOIV : International Journal on Informatics Visualization

Multi-Head Attention in Residual Networks to Improve Coral Reef Structure Classification
Nuranti, Eka Qadri; Intizhami, Naili Suri; Tassakka, Muhammad Irpan Sejati; Areni, Intan Sari; Al Ghozy, Osama Iyad; Jefri, Muhammad Rivaldi
JOIV : International Journal on Informatics Visualization Vol 8, No 2 (2024)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.8.2.2392

Abstract

Residual Networks (ResNet) mark a crucial advancement in convolutional neural network architecture, effectively tackling challenges like vanishing gradients for improved pattern detection in various image classification tasks. This study introduces a novel adaptation of the ResNet50 architecture that integrates a multi-head attention (MHA) mechanism, termed MHA-ResNet50, for classifying coral reef structures in images. Strategic modifications are applied to the input of each stage, leading to the development of an MHA block, which is augmented by separable convolution. Following multiscale gate principles, the MHA block is inserted at various stages of the ResNet50 identity blocks, before the features pass through the fully connected layers. Furthermore, we used stratified K-fold cross-validation so that each fold has a comparable proportion of each class. We evaluated the MHA-ResNet50 model under several MHA-block placement scenarios and observed improvements in the accuracy of coral reef structure predictions. The best results were achieved with four attention blocks (MHA-ResNet50-4), which yielded an accuracy of 85.23% on a coral structure dataset of only 409 images. The proposed model thus enhances ResNet50 by integrating multi-head attention, separable convolution, and multiscale gate principles, and it adapts to small datasets while predicting coral reef structures more accurately. Future work will examine the model design more closely and use larger datasets with more classes to improve generalization.
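The stratified K-fold idea mentioned in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function name `stratified_folds`, the class names, and the per-class counts are hypothetical, chosen only so that the total matches the 409 images reported.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so its per-fold counts differ by at most one.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical class names and sizes (assumption); only the total of 409 is from the abstract.
labels = ["branching"] * 150 + ["massive"] * 139 + ["tabular"] * 120
folds = stratified_folds(labels, k=5)
for number, fold in enumerate(folds):
    print(number, len(fold), dict(Counter(labels[i] for i in fold)))
```

Each fold can then serve once as the validation set while the remaining folds are used for training. In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be preferred over a hand-written split.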
Fermented and Unfermented Cocoa Beans for Quality Identification Using Image Features
Basri, Basri; Indrabayu, Indrabayu; Achmad, Andani; Areni, Intan Sari
JOIV : International Journal on Informatics Visualization Vol 8, No 3 (2024)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.8.3.2578

Abstract

Fermented cocoa beans are one of the high-quality requirements of the cocoa processing industry. On an automated industrial scale, early identification of cocoa bean quality is essential. This study aims to identify cocoa bean quality based on the characteristics of fermented and unfermented beans. The analysis uses static images captured with a camera at distances of 5 cm, 10 cm, and 15 cm for both classes, with 500 images each. Features are extracted with the Histogram of Oriented Gradients (HOG) method and classified with a Support Vector Machine (SVM). Images of both classes were also analyzed after a color transformation to show the dominant color pattern on the skin of the cocoa beans. The results showed that fermented cocoa beans tend to have a darker color pattern and a coarser texture than unfermented beans. Performance analysis using the Receiver Operating Characteristic (ROC) on both classes showed 100% accuracy at distances of 5 cm and 15 cm; however, when Precision, Recall, and F1-Score are considered together, the best performance was obtained at 15 cm. According to the literature review conducted, these results improve on previous work and enable further research on conveyor models with real-time video data for automation systems.
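As a rough illustration of the HOG features named in the abstract, the sketch below computes a magnitude-weighted histogram of unsigned gradient orientations for a single image cell. It is a simplification under stated assumptions, not the authors' pipeline: the function name `cell_orientation_histogram` and the 9-bin layout are illustrative choices, and a full HOG descriptor would also interpolate votes between bins and normalize over blocks of cells before the vectors are passed to an SVM.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees)."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    # Border pixels are skipped so that central differences stay inside the cell.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against an angle that rounds to exactly 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Synthetic 8x8 cell (assumption): intensity increases left to right, so all
# gradient energy falls into the first orientation bin.
ramp = [[10 * x for x in range(8)] for _ in range(8)]
print(cell_orientation_histogram(ramp))
```

Concatenating such histograms over all cells of an image gives a fixed-length feature vector, which is the kind of input a classifier such as an SVM expects.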
Hybrid Deep Learning Approach For Stress Detection Model Through Speech Signal
Chyan, Phie; Achmad, Andani; Nurtanio, Ingrid; Areni, Intan Sari
JOIV : International Journal on Informatics Visualization Vol 7, No 4 (2023)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.7.4.2026

Abstract

Stress is a psychological condition that requires proper treatment due to its potential long-term effects on health and cognitive faculties. This is particularly pertinent when considering pre- and early-school-age children, where stress can yield a range of adverse effects. Furthermore, detection in children requires a particular approach different from that for adults because of their physical and cognitive limitations. Traditional approaches, such as psychological assessments or the measurement of biosignal parameters, prove ineffective in this context. Speech offers a way to detect stress without causing discomfort to the subject and without requiring a certain level of cognitive ability. Therefore, this study introduces a hybrid deep learning approach that combines supervised and unsupervised learning in a stress detection model. Using CNN and GSOM algorithms, the model predicts the stress state of the subject and provides positional data point analysis in the form of a cluster map that indicates the degree of stress. The results showed an average accuracy and F1 score of 94.7% and 95%, respectively, on the children's voice dataset. For comparison with the state of the art, the model was also tested on the open-source DAIC-WOZ dataset, obtaining average accuracy and F1 scores of 89% and 88%, respectively. The cluster map generated by GSOM further underscored the model's ability to identify stress and quantify the degree experienced by the subjects based on their speech patterns.
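The unsupervised half of the hybrid model relies on a growing self-organizing map (GSOM). The sketch below shows one simplified training step under stated assumptions: the function names, the learning rate, and the toy two-dimensional vectors are hypothetical, and neighborhood updates and the actual insertion of new nodes are omitted. The growth threshold follows the commonly cited GSOM form GT = -D ln(SF), where D is the input dimension and SF is the spread factor; the abstract does not state which settings the authors used.

```python
import math


def best_matching_unit(nodes, sample):
    """Return the index of the node whose weight vector is closest to the sample."""
    return min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))


def growth_threshold(dimension, spread_factor):
    """GSOM growth threshold GT = -D * ln(SF), for 0 < SF < 1."""
    return -dimension * math.log(spread_factor)


def train_step(nodes, errors, sample, learning_rate=0.3):
    """Accumulate the winner's quantization error, then move it toward the sample."""
    winner = best_matching_unit(nodes, sample)
    errors[winner] += math.dist(nodes[winner], sample)
    nodes[winner] = [
        w + learning_rate * (s - w) for w, s in zip(nodes[winner], sample)
    ]
    return winner


# Toy two-node map and a single sample (assumptions for illustration only).
nodes = [[0.0, 0.0], [1.0, 1.0]]
errors = [0.0, 0.0]
winner = train_step(nodes, errors, [0.9, 1.0])
should_grow = errors[winner] > growth_threshold(dimension=2, spread_factor=0.5)
print(winner, nodes[winner], should_grow)
```

In a full GSOM, a boundary node whose accumulated error exceeds the threshold spawns new neighboring nodes, which is how the map grows to fit the data and produces a cluster map like the one the abstract describes.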