A Novel Encoder Decoder Architecture with Vision Transformer for Medical Image Segmentation
Bala, Saroj; Arora, Kumud; R, Jeevitha; Chowdhury, Rini; Kumar, Prashant; Nageswari, C. Shobana
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 7 No 1 (2025): January
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v7i1.571

Abstract

Brain tumor image segmentation is one of the most critical tasks in medical imaging for diagnosis, treatment planning, and prognosis. Traditional methods for brain tumor image segmentation are mostly based on Convolutional Neural Networks (CNNs), which have proven very powerful but remain limited in capturing long-range dependencies and complex spatial hierarchies in MRI images. Variability in tumor shape, size, and location can further degrade performance and lead to suboptimal results. To address these limitations, a new encoder-decoder architecture, VisionTranscoder, built on the Vision Transformer (ViT), is proposed to enhance brain tumor detection and classification. The proposed VisionTranscoder exploits the transformer's ability to model global context through self-attention. Capturing both local and global features gives a more complete interpretation of the intricate patterns in medical images and supports classification. VisionTranscoder uses a Vision Transformer as its encoder, processing images as sequences of patches to capture global dependencies that often fall outside the receptive field of traditional CNNs. The decoder then reconstructs the segmentation map at high fidelity through upsampling and skip connections, which preserve detailed spatial information. The risk of overfitting is reduced substantially by the architecture's design, advanced regularization techniques, and extensive data augmentation. The dataset contains 7,023 human brain MRI images divided into four classes: glioma, meningioma, no tumor, and pituitary. Images from the 'no tumor' class, indicating an MRI scan without any detectable tumor, were taken from the Br35H dataset. The results show the effectiveness of VisionTranscoder across a wide set of brain MRI scans, with an accuracy of 98.5% and a loss of 0.05. This performance underlines its ability to accurately segment and classify brain tumors without overfitting.
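
The abstract above describes the encoder as processing images as sequences of patches. A minimal sketch of that tokenization step follows, using only the Python standard library. It is not the authors' implementation: the function name `image_to_patch_sequence`, the single-channel list-of-rows image format, and the toy 4x4 image with 2x2 patches are all assumptions chosen for illustration. A real Vision Transformer would additionally project each flattened patch through a learned linear layer and add position embeddings.

```python
from typing import List

Image = List[List[float]]  # hypothetical format: single-channel image as rows of pixels


def image_to_patch_sequence(image: Image, patch_size: int) -> List[List[float]]:
    """Split a 2D image into non-overlapping square patches (row-major order)
    and flatten each patch into one vector, i.e. one input token."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    tokens: List[List[float]] = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch row by row into a single vector.
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            tokens.append(patch)
    return tokens


# Toy 4x4 "image" with pixel values 0..15, split into 2x2 patches (assumed sizes).
toy = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = image_to_patch_sequence(toy, patch_size=2)
print(len(tokens), "tokens of length", len(tokens[0]))
```

Each token covers one spatial region, so self-attention over the token sequence can relate distant regions of the scan directly. This is the global-context property the abstract contrasts with CNNs.
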
Improving Kidney Stone Detection with YOLOV10 and Channel Attention Mechanisms in Medical Imaging
Bala, Saroj; Arora, Kumud; V, Satheeswaran; S, Mohan; J, Deepika; K, Sangamithrai; Doss, Amala Nirmal
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 7 No 3 (2025): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v7i3.868

Abstract

Accurate and timely detection of kidney stones is crucial for effective medical intervention and treatment planning. However, existing detection methods often struggle with challenges related to sensitivity, precision, and the ability to process complex and variable medical images. In this study, an advanced kidney stone detection system is developed using the latest object detection algorithm, You Only Look Once version 10 (YOLOv10), integrated with channel attention mechanisms to enhance model performance. This combination aims to improve detection accuracy by enabling the network to focus more precisely on critical regions in medical images, particularly in Computed Tomography (CT) scans, where kidney stones may appear in varying shapes, sizes, and intensities. The proposed system begins with data augmentation techniques, such as rotation, scaling, and contrast adjustments, to enhance the model’s generalization ability across different image conditions and patient profiles. YOLOv10 was selected due to its lightweight architecture, high detection speed, and enhanced performance in small object detection tasks. To further improve feature extraction, channel attention mechanisms such as Squeeze-and-Excitation (SE) blocks or Efficient Channel Attention (ECA) modules are incorporated. These modules enable the network to selectively focus on the most informative feature channels associated with kidney stone regions, while suppressing irrelevant background information, thereby improving the distinction between stones and surrounding tissues. The model is trained and fine-tuned using a diverse CT scan dataset containing various types and sizes of kidney stones. Evaluation results demonstrate that the proposed model achieves a high detection accuracy of 93.7% with a very low loss of 0.18. It exhibits stability without issues like overfitting, underfitting, or local minima entrapment, making it a highly reliable tool for clinical applications.
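
The abstract above names Squeeze-and-Excitation (SE) blocks as one channel attention option. The sketch below illustrates the general SE idea in plain Python: squeeze each channel to one number by global average pooling, pass the result through a small bottleneck, and rescale each channel by the resulting gate. It is not the authors' model and is not tied to any YOLOv10 code. The function name `se_channel_attention`, the hand-picked weights, the omitted bias terms, and the 2-channel toy feature map are all assumptions for illustration.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # hypothetical layout: [channel][row][col]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def se_channel_attention(
    features: FeatureMap,
    w_reduce: List[List[float]],  # shape [hidden][channels], assumed weights
    w_expand: List[List[float]],  # shape [channels][hidden], assumed weights
) -> Tuple[FeatureMap, List[float]]:
    """Apply SE-style channel gating; returns (rescaled features, per-channel gates)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]
    return scaled, gates


# Toy feature map: 2 channels of size 2x2 (assumed values).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w_reduce = [[0.5, 0.5]]      # 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates
scaled, gates = se_channel_attention(features, w_reduce, w_expand)
print("channel gates:", gates)
```

With these toy weights, the first channel receives a gate above 0.5 and the second a gate below 0.5, so the second channel is suppressed relative to the first. In a trained network the weights are learned, and the gates emphasize the channels most informative for the target regions.
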