Siregar, Tarq Hilmar
Unknown Affiliation

Published: 1 Document
Articles

Found 1 Document

Attention Augmented Deep Learning Model for Enhanced Feature Extraction in Cacao Disease Recognition
Robet, Robet; Perangin Angin, Johanes Terang Kita; Siregar, Tarq Hilmar
Sinkron: Jurnal dan Penelitian Teknik Informatika, Vol. 9 No. 4 (2025): Research Articles, October 2025
Publisher : Politeknik Ganesha Medan

DOI: 10.33395/sinkron.v9i4.15249

Abstract

Accurate cacao disease recognition is critical for safeguarding yields and reducing losses. Prior cacao studies primarily rely on handcrafted descriptors (e.g., color histograms, LBP, GLCM) or standard CNN/transfer-learning pipelines, are often limited to ≤ 3 classes and a single plant organ, and rarely include explicit channel-spatial attention or comprehensive multiclass evaluation. To the best of our knowledge, no prior work integrates Squeeze-and-Excitation (SE) and the Convolutional Block Attention Module (CBAM) on a ResNeXt-50 backbone for six-class cacao disease classification, accompanied by a standardized ablation study and t-SNE-based interpretability. We propose a six-class classifier (five diseases plus healthy) built on ResNeXt-50 enhanced with SE (channel recalibration) and CBAM (channel-spatial emphasis) to highlight lesion-relevant cues. The dataset comprises labeled leaf and pod images from public sources collected under field-like conditions; preprocessing includes resizing to 224×224, normalization, and augmentation (flips, small rotations, color jitter, random resized crops). Trained with Adam and early stopping, ResNeXt-50+SE+CBAM attains 97% test accuracy and 0.97 macro-F1, surpassing the ResNeXt-50 baseline (94% accuracy, 0.95 macro-F1) as well as SE-only and CBAM-only variants. Confusion matrix and t-SNE analyses show fewer confusions among visually similar classes and clearer separability, while the ablation study confirms the complementary benefits of SE and CBAM. On a desktop-hosted, web-based setup, batch-1 inference at 224×224 takes 7.46 ms/image (about 134 FPS), demonstrating real-time capability. The findings support deployment as browser-based decision-support tools for farmers and integration into continuous field-monitoring systems.
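
The sketch below illustrates, in PyTorch, the kind of attention-augmented architecture the abstract describes: a ResNeXt-50 backbone whose final feature map is passed through SE channel recalibration and CBAM channel-spatial attention before a six-class head. Module placement, the reduction ratio, the 7×7 spatial kernel, and the SE-style channel branch inside CBAM are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of a ResNeXt-50 + SE + CBAM classifier for six cacao classes.
# Hyperparameters and module placement are assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision.models import resnext50_32x4d

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global average pool, bottleneck MLP, sigmoid channel gates."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))        # (B, C) channel descriptors
        return x * w[:, :, None, None]         # recalibrate channels

class CBAM(nn.Module):
    """Channel attention (SE-style here, a simplification) followed by 7x7 spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.channel = SEBlock(channels, reduction)
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2), nn.Sigmoid())

    def forward(self, x):
        x = self.channel(x)
        # Concatenate channel-wise mean and max maps, then weight spatial locations.
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(s)

class ResNeXtSECBAM(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        # weights=None keeps this self-contained; ImageNet weights could be loaded instead.
        backbone = resnext50_32x4d(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # up to last conv stage
        self.se = SEBlock(2048)
        self.cbam = CBAM(2048)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(2048, num_classes))

    def forward(self, x):
        f = self.cbam(self.se(self.features(x)))   # attention-refined features
        return self.head(f)

model = ResNeXtSECBAM()
logits = model(torch.randn(1, 3, 224, 224))        # batch-1 inference at 224x224
print(logits.shape)                                 # torch.Size([1, 6])
```

Applying SE and CBAM only to the last feature stage keeps the added cost negligible relative to the backbone, which is consistent with the near-real-time inference figure reported in the abstract; other placements (e.g., after every residual stage) are equally plausible readings of the description.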