Articles

Found 3 Documents

A detailed analysis of deep learning-based techniques for automated radiology report generation
Dhamanskar, Prajakta; Thacker, Chintan
International Journal of Electrical and Computer Engineering (IJECE) Vol 14, No 5: October 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v14i5.pp5906-5915

Abstract

The automated creation of medical reports from chest X-ray images has the potential to significantly reduce workloads for healthcare providers and accelerate patient care, especially in environments with limited resources. This study provides an extensive overview of deep learning-based techniques for automatically generating radiology reports from chest X-ray images. By examining recent research, we delve into various deep learning architectures and techniques used for this task, including transformer-based approaches, attention mechanisms, sequence-to-sequence models, adversarial training methods, and hybrid models. We also discuss the datasets used for training and evaluation, as well as future directions and open research problems in this area. Our review further emphasizes the significance of deep learning in revolutionizing radiology reporting and highlights the need for additional research to address challenges such as data accessibility, image quality variability, interpretation of complex findings, and contextual integration. The objective of this research is to present a comparative analysis of cutting-edge methods for automated medical report generation to enhance patient outcomes and healthcare delivery.

A comprehensive survey on automatic image captioning-deep learning techniques, datasets and evaluation parameters
Chauhan, Harshil; Thacker, Chintan
International Journal of Electrical and Computer Engineering (IJECE) Vol 15, No 3: June 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v15i3.pp3257-3266

Abstract

Automatic image captioning is a pivotal intersection of computer vision and natural language processing, aiming to generate descriptive textual content from visual inputs. This comprehensive survey explores the evolution and state-of-the-art advancements in image caption generation, focusing on deep learning techniques, benchmark datasets, and evaluation parameters. We begin by tracing the progression from early approaches to contemporary deep learning methodologies, emphasizing encoder-decoder based models and transformer-based models. We then systematically review the datasets that have been instrumental in training and benchmarking image captioning models, including MSCOCO, Flickr30k, Flickr8k, and PASCAL 1k, discussing image count, types of scenes, and sources. Furthermore, we delve into the evaluation metrics employed to assess model performance, such as bilingual evaluation understudy (BLEU), metric for evaluation of translation with explicit ordering (METEOR), recall-oriented understudy for gisting evaluation (ROUGE), and consensus-based image description evaluation (CIDEr), analyzing their domains, bases, and measurement criteria. Through this survey, we aim to provide a detailed understanding of the current landscape, identify challenges, and propose future research directions in automatic image captioning.
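As an illustration of what an n-gram overlap metric such as BLEU measures, the following is a minimal Python sketch of clipped (modified) n-gram precision, the quantity BLEU is built on. It is not taken from the survey; the function names `ngrams` and `modified_precision` are hypothetical, and a full BLEU score would additionally combine several n-gram orders with a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate caption against references.

    Each candidate n-gram is credited at most as many times as it occurs
    in the single reference where it is most frequent.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0  # candidate shorter than n tokens

    # Maximum count of each n-gram over all references.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)

    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


if __name__ == "__main__":
    candidate = "the the the the the the the".split()
    references = ["the cat is on the mat".split()]
    # "the" occurs twice in the reference, so only 2 of 7 matches count.
    print(modified_precision(candidate, references, 1))
```

Clipping is what stops a degenerate caption that repeats one common word from scoring perfectly, which is why the example candidate scores 2/7 rather than 7/7.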

FedBrain-3DMRI: Federated Self-Supervised Learning for 3D Brain Tumor Segmentation using SCAFFOLD Algorithm
Chaudhary, Neeshu; Thacker, Chintan
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 8 No 2 (2026): April
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v8i2.1596

Abstract

Brain tumor segmentation separates tumor regions from healthy brain tissue in medical images, a step that is necessary for accurate diagnosis and treatment planning. Building robust deep learning models for this task is often hard because labeled medical data are scarce and strict privacy rules prevent data from being pooled in one place. Federated Learning (FL) keeps patient data private by keeping it local, but its performance often drops when data from different hospitals differ substantially in quality, imaging protocols, and distribution. Our research seeks to create a privacy-preserving federated learning framework that manages significant data heterogeneity while maintaining high segmentation accuracy across institutions. We propose a new two-stage FL framework that allows multiple institutions to collaborate while preserving privacy and handling complex non-IID data distributions. First, we use a Federated Masked Autoencoder (MAE) for self-supervised pre-training, which lets the model learn strong anatomical features from unlabeled MRI scans. Second, the model is fine-tuned using an Attention ResUNet3D architecture for accurate tumor segmentation. We use the SCAFFOLD optimization algorithm to keep training stable across all clients, even when scanners vary from site to site, thereby directly addressing client drift. We also use foreground-biased sampling and Test-Time Augmentation (TTA) to improve segmentation accuracy in difficult, uneven tumor sub-regions. We ran extensive experiments on the BraTS 2024 dataset in simulated federated settings with 10, 50, and 100 clients, obtaining Dice coefficients of 0.826, 0.824, and 0.818, respectively. These results show that the proposed method scales well and can be used in a clinical setting.
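For readers unfamiliar with the Dice coefficient reported above, the following is a minimal Python sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. It is an illustration only: the abstract does not say how the authors aggregate Dice across tumor sub-regions or cases, the function name `dice_coefficient` is hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flattened binary masks of equal length.

    Returns 2 * |pred AND target| / (|pred| + |target|).
    Assumption: two empty masks count as a perfect match (1.0).
    """
    pred = [bool(v) for v in pred]
    target = [bool(v) for v in target]
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 1, 0]
    # One shared foreground voxel, two foreground voxels in each mask.
    print(dice_coefficient(predicted, ground_truth))
```

A 3D volume can be flattened voxel by voxel before calling the function; a score of 1.0 means perfect overlap and 0.0 means none.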