Automated ICD Medical Code Generation for Radiology Reports using BioClinicalBERT with Multi-Head Attention Network
D., Sasikala; N., Sarrvesh; J., Sabarinath; S., Theetchenya; S., Kalavathi
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 7 No 3 (2025): July
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v7i3.775

Abstract

International Classification of Diseases (ICD) coding plays a pivotal role in healthcare systems by providing a standard method for classifying medical diagnoses, treatments, and procedures. However, manually assigning ICD codes to clinical records is both time-consuming and error-prone, particularly given the large volume of medical terminology and the periodic revisions to the coding system. This work introduces a Hierarchical Multi-Head Attention Network (HMHAN) that aims to automate ICD coding by combining domain-specific embeddings with an attention mechanism. The proposed method uses BioClinicalBERT to extract features from clinical text, followed by a two-level attention mechanism that learns hierarchical dependencies between labels. BioClinicalBERT is pre-trained on large biomedical and clinical corpora, which enables it to capture the complex contextual relationships specific to medical language. The multi-head attention mechanism lets the model focus on different parts of the input text simultaneously, learning intricate associations between medical terms and the corresponding ICD codes at various levels. To address class imbalance, the method uses SMOTE (Synthetic Minority Oversampling Technique) based multi-label resampling. SMOTE generates synthetic examples for underrepresented classes, allowing the model to learn better from imbalanced data without overfitting. The study uses de-identified radiology reports and their corresponding ICD codes from the MIMIC-IV dataset. Model performance is assessed with F1 score, Hamming loss, and ROC-AUC. The model achieves an F1 score of 0.91, a Hamming loss of 0.07, and a ROC-AUC of 0.92, which suggests that automating the ICD coding process is a promising research direction. By automating ICD code generation, the system is intended to improve the efficiency of healthcare workflows in advanced clinical care.
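
The abstract reports an F1 score and a Hamming loss for a multi-label task, where each report can carry several ICD codes at once. As a reading aid, the sketch below shows how these two metrics can be computed from binary label matrices. It is not the authors' evaluation code: the function names and the toy data are hypothetical, and micro-averaging for F1 is an assumption, since the abstract does not say which averaging was used.

```python
from typing import List

Matrix = List[List[int]]  # rows = reports, columns = ICD codes (0 or 1)


def hamming_loss(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of (report, code) slots where the prediction is wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true: Matrix, y_pred: Matrix) -> float:
    """F1 computed from counts pooled over all codes (assumed averaging)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 3 reports, 4 candidate ICD codes.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 1], [1, 1, 0, 1]]

print(f"Hamming loss: {hamming_loss(y_true, y_pred):.3f}")
print(f"Micro F1:     {micro_f1(y_true, y_pred):.3f}")
```

In this toy example, 2 of the 12 label slots are wrong (one missed code and one spurious code), so a lower Hamming loss and a higher F1 both indicate closer agreement with the reference codes.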