Contact Name
Huzain Azis
Contact Email
huzain.azis@umi.ac.id
Phone
+628114484875
Journal Mail Official
ijaimi.journal@gmail.com
Editorial Address
Jln. Paccerakkang Daya No.140, Kel. Berua Kec. Biringkanaya, Makassar, Sulawesi Selatan, Indonesia
Location
INDONESIA
International Journal of Artificial Intelligence in Medical Issues
Published by Yocto Brain
ISSN : -     EISSN : 3025-4167     DOI : https://doi.org/10.56705
Core Subject : Health, Science
The International Journal of Artificial Intelligence in Medical Issues (IJAIMI) is a premier, peer-reviewed academic journal dedicated to the integration and advancement of artificial intelligence (AI) in the medical field. The journal aims to serve as a global platform for researchers, clinicians, engineers, and other professionals to share their findings, methodologies, and innovations related to AI applications in medical diagnostics, treatment, patient care, and health systems.
Articles 48 Documents
Hybrid Feature Benchmark for Blood Cell Classification Using ResNet50 and EfficientNetV2 Features with SVM and ANN Classifiers via Unsupervised Segmentation
Ahmad Kholish Fauzan Shobiry; Rahma Puspitasari
International Journal of Artificial Intelligence in Medical Issues Vol. 3 No. 2 (2025)
Publisher : Yocto Brain

DOI: 10.56705/ijaimi.v3i2.364

Abstract

Automated blood cell classification supports hematological diagnosis by providing objective and efficient analysis, but end-to-end deep learning models often require substantial computational resources that limit deployment on low-resource clinical devices. This study evaluates whether frozen deep features extracted from EfficientNetV2B0 or ResNet50 provide better separability for the eight BloodMNIST classes, and examines which classical classifier offers the most practical balance of accuracy, model size, and training time. The BloodMNIST dataset, consisting of 11,959 training images, 1,712 validation images, and 3,421 test images, is processed using data augmentation and Otsu-based unsupervised segmentation before the resulting masks are replicated into three channels and passed into pretrained ImageNet CNNs used strictly as frozen feature extractors. The extracted features are classified using Support Vector Machine with grid search, K-Nearest Neighbor, Artificial Neural Network, and Random Forest, with performance assessed through accuracy, precision, recall, and F1-score. EfficientNetV2 with Support Vector Machine achieves the highest performance, reaching 76.8% test accuracy, 75.3% precision, 72.6% recall, and a 73.6% F1-score, while EfficientNetV2 with Artificial Neural Network provides a comparable 76.2% accuracy and a 73.0% F1-score with a compact 2 MB model size. These findings highlight a clear trade-off between accuracy, model size, and computational cost, demonstrating that hybrid deep-feature pipelines offer lightweight and effective solutions for blood cell classification in resource-constrained clinical settings.
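The Otsu step named in the abstract above can be illustrated with a minimal sketch. It assumes 8-bit grayscale values in a flat list and uses only the Python standard library; the function names `otsu_threshold` and `mask_to_three_channels` and the toy image are hypothetical and are not taken from the paper, whose actual implementation is not described in this listing.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of the gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the lower class yet
        if w1 == 0:
            break     # no pixels left for the upper class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def mask_to_three_channels(pixels, threshold):
    """Binarize, then copy the mask value into three identical channels."""
    return [(int(p > threshold),) * 3 for p in pixels]


# Hypothetical toy image: 50 dark pixels and 50 bright pixels.
toy_pixels = [10] * 50 + [200] * 50
toy_threshold = otsu_threshold(toy_pixels)
toy_mask = mask_to_three_channels(toy_pixels, toy_threshold)
```

Whether the bright or the dark class corresponds to the cell depends on the staining and imaging setup, so the `p > threshold` direction here is an assumption. The three-channel copy only matches the input shape that ImageNet-pretrained networks expect; it adds no information.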
Explainable Machine Learning for Predicting the Mental Health Impact of AI and Digital Platform Usage among Students
Agus Halid; Dwi Amalia Purnamasari; Ade Chandra Saputra; Nicodemus Mardanus Setiohardjo
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/pxn6qg39

Abstract

The increasing use of artificial intelligence and digital platforms among students has created new opportunities for learning support, academic assistance, and digital interaction. However, intensive platform usage may also be associated with mental health concerns, sleep disruption, and negative effects on students’ daily life. This study aims to develop and evaluate machine learning models for predicting the overall impact of AI and digital platform usage among students by integrating demographic, behavioral, sleep-related, and mental health-related variables. The dataset consisted of 1,705 student records with features including age, gender, academic level, country, average daily usage hours, most-used platform, sleep hours per night, and mental health score. The target variable was Overall_Impact, categorized into Negative, Neutral, and Positive classes. Six supervised machine learning algorithms were evaluated: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, and Gradient Boosting. Model performance was assessed using accuracy, precision, recall, F1-score, Cohen’s Kappa, MAE, RMSE, ROC-AUC, and confusion matrix. The results showed that Random Forest achieved the best performance, with an accuracy of 99.71%, F1-macro of 99.52%, Cohen’s Kappa of 0.9950, and ROC-AUC of 0.9994 on the testing set. Feature importance analysis revealed that Mental_Health_Score, Sleep_Hours_Per_Night, and Avg_Daily_Usage_Hours were the most influential predictors. The findings indicate that machine learning can effectively predict the impact of digital platform usage and provide useful insights for AI-driven health informatics and student well-being monitoring. However, further validation using longitudinal and clinically grounded datasets is recommended.
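Cohen's Kappa, one of the metrics listed in the abstract above, corrects observed agreement for the agreement expected by chance. The sketch below shows the standard definition in plain Python; the label lists are hypothetical toy data using the three class names from the abstract, not values from the study.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(y_true)
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal class frequencies.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels, for illustration only.
toy_true = ["Negative", "Negative", "Neutral", "Positive", "Positive", "Positive"]
toy_pred = ["Negative", "Neutral", "Neutral", "Positive", "Positive", "Positive"]
toy_kappa = cohen_kappa(toy_true, toy_pred)  # equals 17/23, about 0.739
```

The function is undefined when chance agreement is exactly 1 (every label and every prediction in a single class); a production version would need to handle that case.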
Leakage-Aware and Explainable Machine Learning for Healthcare Claim Fraud Detection Using Imbalanced Medical Insurance Data
Dian Hafidh Zulfikar; Ery Setiyawan Jullev Atmadji; Bagus Satrio Wahyu Poetro
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/z3207345

Abstract

Healthcare insurance fraud is a critical challenge in health systems because fraudulent claims may cause financial losses, increase administrative burden, and reduce trust in healthcare services. This study proposes an explainable machine learning approach for detecting fraudulent healthcare insurance claims using imbalanced medical claim data. The dataset consisted of 10,000 healthcare insurance claim records with 20 attributes, including patient information, provider characteristics, claim-related financial variables, medical codes, temporal features, and fraud labels. Fraudulent claims represented only 8.29% of the dataset, indicating a clear class imbalance problem. Several machine learning models were evaluated, including Logistic Regression, Decision Tree, Random Forest, Extra Trees, and AdaBoost, under different imbalance handling strategies, namely baseline learning, class weighting, and SMOTE. In addition, two feature scenarios were compared: a full-feature scenario and a leakage-aware scenario that excluded potentially post-decision variables such as claim status and approved amount. The experimental results showed that the best full-feature model was Logistic Regression without additional imbalance handling, achieving an accuracy of 0.9900, precision of 0.9740, recall of 0.9036, F1-score of 0.9375, ROC-AUC of 0.9989, and PR-AUC of 0.9896. The model correctly detected 150 out of 166 fraudulent claims in the test set. However, the best leakage-aware model achieved a lower F1-score of 0.6983, indicating that potentially leaked variables may substantially affect model performance. Feature importance analysis showed that claim amount, approved amount, claim submission delay, claim status, and provider-related variables were among the most influential predictors. 
These findings demonstrate that explainable machine learning can support healthcare claim fraud detection, but careful attention must be given to class imbalance, data leakage, and operational deployment context.
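The full-feature metrics reported in this abstract can be cross-checked from confusion-matrix counts. In the sketch below, the 150 detected and 16 missed fraudulent claims come from the abstract; the 4 false positives and 1,830 true negatives are assumptions chosen because they reproduce the reported precision and accuracy (they imply a 2,000-record test set, which the abstract does not state).

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# tp and fn are stated in the abstract (150 of 166 fraudulent claims detected).
# fp and tn are ASSUMED values that are consistent with the reported figures.
fraud_metrics = binary_metrics(tp=150, fp=4, fn=16, tn=1830)
fraud_rounded = {name: round(value, 4) for name, value in fraud_metrics.items()}
```

Under these assumed counts the rounded values agree with the reported precision of 0.9740, recall of 0.9036, F1-score of 0.9375, and accuracy of 0.9900.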
Confidence-Aware Depression Severity Detection in Low-Resource Urdu Social Media Text: A Multilingual Machine Learning Approach
Ahmad Naswin; Yuli Praptomo Pamungkas Hari Sungkowo
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/62qxjt74

Abstract

Depression is a major mental health concern that requires early identification and timely intervention. Social media has become an important source of user-generated text that may reflect emotional distress, hopelessness, social withdrawal, and suicidal ideation. However, most existing depression detection studies focus on English or high-resource languages, while research on low-resource languages such as Urdu remains limited. This study investigates depression severity classification in Urdu social media text using multilingual and confidence-aware natural language processing approaches. The dataset consists of 4,000 Twitter/X posts collected between January 2024 and April 2025, annotated into four severity classes: none, mild, moderate, and severe. Each post is represented in three parallel textual forms: native Urdu script, Roman Urdu transliteration, and English translation. The dataset also includes label confidence scores, human verification indicators, cultural markers, and depression-related keywords. Several text representation scenarios were evaluated, including Urdu text, Roman Urdu text, English text, and combined multilingual features. Baseline machine learning models were developed using TF-IDF features with Logistic Regression, Linear Support Vector Machine, and Multinomial Naive Bayes. Confidence-aware learning was examined by incorporating label confidence scores as sample weights and by evaluating a high-confidence subset. The experimental results showed that all baseline models achieved perfect classification performance, with accuracy, macro F1-score, weighted F1-score, and Cohen’s Kappa values of 1.000 across the evaluated scenarios. These results indicate that the dataset contains highly separable linguistic patterns among depression severity classes. However, further inspection suggests that repeated or highly similar textual patterns may contribute to overly optimistic performance. 
Therefore, stricter validation using duplicate-free splitting, external datasets, and transformer-based models is recommended for future work. This study provides a preliminary benchmark for multilingual depression severity classification in low-resource Urdu text and highlights the potential of AI-driven mental health informatics as a supportive early-warning tool rather than a clinical diagnostic system.
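One simple way to realize the duplicate-free splitting recommended above is to assign each post to train or test by hashing its normalized text, so that identical posts can never land on both sides. The sketch below is an illustration under that assumption, not the authors' procedure; `normalize` and `duplicate_free_split` are hypothetical names, and the English toy posts stand in for real data.

```python
import hashlib
import unicodedata


def normalize(text):
    """Unicode-normalize, case-fold, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).casefold()
    return " ".join(text.split())


def duplicate_free_split(texts, test_fraction=0.2):
    """Split indices so that texts equal after normalization share one side."""
    train_idx, test_idx = [], []
    cutoff = int(test_fraction * 2**32)
    for i, text in enumerate(texts):
        digest = hashlib.sha256(normalize(text).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big")  # deterministic 32-bit bucket
        (test_idx if bucket < cutoff else train_idx).append(i)
    return train_idx, test_idx


# Hypothetical toy posts; items 0, 1, and 3 are duplicates after normalization.
toy_posts = ["I feel sad", "i  feel SAD", "fine today", "I feel sad ", "great"]
toy_train, toy_test = duplicate_free_split(toy_posts)
```

This only catches exact duplicates after normalization. The "highly similar textual patterns" mentioned in the abstract would need a near-duplicate measure (for example, shared n-gram overlap) before grouping, and the realized test share only approximates `test_fraction`.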
Gender-Aware Prediction of Liver Disease Using Machine Learning and Clinical Laboratory Data
Umar Zaky; Muhammad Habibi; Adri Priadana; Thomas Edyson Tarigan
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/wtsdw234

Abstract

Liver disease is a major health problem that may progress silently and lead to severe clinical complications if not detected early. Machine learning offers a promising approach for supporting early screening by identifying predictive patterns from clinical and biochemical patient data. This study developed an explainable gender-aware machine learning framework for liver disease prediction using demographic information and clinical biomarkers. The dataset consisted of 570 patient records after duplicate removal, including age, gender, total bilirubin, direct bilirubin, alkaline phosphatase, SGPT, SGOT, total protein, albumin, albumin/globulin ratio, and liver disease status. Several machine learning algorithms were evaluated under three experimental scenarios: original data, class-weighted learning, and SMOTENC-based oversampling. Model performance was assessed using accuracy, precision, recall, specificity, F1-score, and ROC-AUC. The experimental results showed that Gradient Boosting combined with SMOTENC achieved the best F1-score, with an accuracy of 0.7632, precision of 0.7935, recall of 0.9012, specificity of 0.4242, F1-score of 0.8439, and ROC-AUC of 0.7759. The model correctly identified 73 of 81 liver disease cases in the testing set, indicating strong sensitivity for early screening. Gender-based evaluation showed comparable F1-scores for male and female patients, with values of 0.8430 and 0.8462, respectively. Feature importance analysis identified SGOT, alkaline phosphatase, age, and direct bilirubin as the most influential predictors. These findings suggest that an explainable and gender-aware machine learning approach can support liver disease risk prediction using routinely available clinical biomarkers, although further validation using larger and more balanced datasets is required.
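The reported specificity of 0.4242 is easier to read once the underlying counts are recovered. The sketch below searches for integer false-positive and true-negative counts that reproduce the reported precision and specificity when rounded to four decimals, given the 73 true positives and 8 false negatives stated in the abstract. The function name and the search bound are hypothetical, and the recovered counts are an inference from the published figures, not numbers stated by the authors.

```python
def consistent_counts(tp, precision, specificity, max_negatives=500, digits=4):
    """List (fp, tn) pairs whose rounded precision and specificity match."""
    matches = []
    for fp in range(max_negatives + 1):
        if abs(round(tp / (tp + fp), digits) - precision) > 1e-9:
            continue
        for tn in range(max_negatives + 1 - fp):
            if fp + tn == 0:
                continue  # specificity is undefined without negatives
            if abs(round(tn / (tn + fp), digits) - specificity) <= 1e-9:
                matches.append((fp, tn))
    return matches


TP, FN = 73, 8  # stated in the abstract: 73 of 81 cases identified
candidates = consistent_counts(TP, precision=0.7935, specificity=0.4242)

# Check the single candidate against the other reported metrics.
fp, tn = candidates[0]
implied_accuracy = round((TP + tn) / (TP + FN + fp + tn), 4)
implied_f1 = round(2 * TP / (2 * TP + fp + FN), 4)
```

Within the search bound, the only consistent pair is 19 false positives and 14 true negatives, which also reproduces the reported accuracy of 0.7632 and F1-score of 0.8439. Read this way, the model flags 19 of the 33 non-disease cases in the test set, which is the cost side of its high sensitivity.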
Depression Risk Prediction Among Teenagers Using Explainable Machine Learning and Imbalanced Behavioral Data
Rudi Setiawan; Effan Najwaini; Rezania Agramanisti Azdy; Rasmiati Rasyid
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/0w9q4238

Abstract

Adolescent depression has become an important public health concern, particularly in relation to increasing digital media exposure, lifestyle changes, and psychosocial pressure. This study proposes an explainable machine learning framework for predicting depression risk among teenagers using social media usage, lifestyle behavior, and psychosocial indicators. The dataset consisted of 1,200 records with 13 variables, including age, gender, daily social media hours, platform usage, sleep hours, screen time before sleep, academic performance, physical activity, social interaction level, stress level, anxiety level, addiction level, and depression label. The target variable was highly imbalanced, with 1,169 samples categorized as non-depression and only 31 samples categorized as depression risk. Several machine learning models were evaluated, including Logistic Regression, Random Forest, Support Vector Machine, and Gradient Boosting. The experiments compared two feature settings, namely behavioral-only features and full features, combined with three imbalance handling strategies: no imbalance treatment, class weighting, and SMOTE. Model performance was evaluated using accuracy, precision, recall, F1-score, balanced accuracy, ROC-AUC, PR-AUC, Cohen’s Kappa, MAE, and RMSE. The results showed that the full-feature setting substantially outperformed the behavioral-only setting. The best performance was achieved by Random Forest using full features without imbalance handling, producing perfect classification results with accuracy, precision, recall, F1-score, ROC-AUC, and PR-AUC of 1.0000. Permutation importance analysis identified sleep hours, stress level, anxiety level, and daily social media hours as the most influential predictors. These findings indicate that teenage depression risk in this dataset is strongly associated with sleep behavior and psychosocial conditions, in addition to social media exposure. 
Although the model achieved excellent performance, the result should be interpreted cautiously due to the small number of positive depression-risk samples and the possibility of highly separable label patterns. Therefore, the proposed approach should be positioned as an early risk screening framework rather than a clinical diagnostic tool.
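The class counts in this abstract (1,169 versus 31) show why plain accuracy is uninformative here and what class weighting does. The sketch below computes the accuracy of a trivial majority-class predictor and inverse-frequency class weights. The weighting formula, n_samples / (n_classes * class_count), is the common "balanced" heuristic; the abstract does not say which weighting scheme the authors used, so treat it as an assumption.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Class counts as reported in the abstract.
class_counts = {"non_depression": 1169, "depression_risk": 31}

weights = balanced_class_weights(class_counts)

# A model that always predicts the majority class:
majority_accuracy = max(class_counts.values()) / sum(class_counts.values())
# Its recall is 1.0 on the majority class and 0.0 on the minority class.
majority_balanced_accuracy = (1.0 + 0.0) / 2
```

A predictor that never flags depression risk would already reach about 0.974 accuracy on the full dataset while having a balanced accuracy of 0.5, which is why balanced accuracy and PR-AUC are the more informative of the metrics listed above. Under the assumed heuristic, each depression-risk sample would be weighted about 19.4 and each non-depression sample about 0.51.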
A Comparative Study of Machine Learning Models for Stress Level Classification Using Social Media and Lifestyle Data
M. Ikbal Siami; Aris Wahyu Murdiyanto; Sumiyatun
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/r4d62a66

Abstract

The increasing use of social media and digital platforms has raised concerns regarding its potential relationship with sleep patterns, lifestyle behaviors, productivity, and psychological well-being. Stress is a common health-related issue that may be influenced by daily behavioral patterns, including screen time, social media usage, sleep duration, physical activity, and work or study habits. This study aims to develop and evaluate machine learning models for predicting stress levels based on non-invasive digital behavior and lifestyle indicators. The dataset used in this study consisted of 11,000 records with three stress level categories: Low, Medium, and High. The predictor variables included age, daily screen time, social media usage duration, sleep hours, exercise duration, study or work hours, productivity score, and the most frequently used social media platform. Several machine learning algorithms were evaluated, including Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, and Gradient Boosting. Model performance was assessed using accuracy, precision, recall, F1-score, confusion matrix analysis, and 5-fold stratified cross-validation. The experimental results showed that the overall classification performance was modest. The Decision Tree model achieved the best testing performance with an accuracy and macro F1-score of 0.3400, while Gradient Boosting achieved the highest cross-validation performance with a mean accuracy of 0.3480 and a mean macro F1-score of 0.3467. Feature importance analysis using Random Forest indicated that productivity score, sleep hours, study or work hours, social media hours, and daily screen time were the most influential variables. These findings suggest that digital behavior and lifestyle indicators may provide useful exploratory insights for stress-related analysis, although their predictive power remains limited. 
Therefore, the proposed approach is better suited as an exploratory digital well-being assessment framework than as a clinical diagnostic tool.
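The 5-fold stratified cross-validation mentioned in this abstract keeps each fold's class proportions close to those of the whole dataset. A minimal sketch of the fold assignment follows; it deals each class's indices round-robin across folds and omits the shuffling that library implementations normally apply. The function name and the toy labels are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, preserving class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal this class's samples across the folds in turn.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Hypothetical toy labels: 10 samples for each of the three stress levels.
toy_labels = ["Low", "Medium", "High"] * 10
toy_folds = stratified_folds(toy_labels, k=5)
```

Each fold serves once as the validation set while the remaining four are used for training. For scale, uniform random guessing over three equally frequent classes would score about 0.333 accuracy; the abstract does not state the class distribution, so that comparison is only indicative.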
Machine Learning-Based Clustering of Viruses Using Taxonomic and Genomic Features for Health Informatics Applications
Adityo Permana Wibowo; Made Leo Radhitya; Edi Faizal; Ika Arfiani
International Journal of Artificial Intelligence in Medical Issues Vol. 4 No. 1 (2026)
Publisher : Yocto Brain

DOI: 10.56705/qstvhw47

Abstract

Viruses remain a major concern in global public health due to their potential to cause outbreaks, epidemics, and pandemics. The rapid organization and analysis of virus-related data are important for supporting computational virology, health informatics, and pandemic preparedness. This study proposes an unsupervised machine learning approach to cluster viruses based on taxonomic and genomic characteristics. The dataset consisted of 70 virus records with attributes including family, genus, genome type, strand type, and envelope status. Since the dataset did not contain predefined epidemiological labels or risk categories, the analysis was designed as an exploratory clustering task rather than a supervised prediction task. Data preprocessing was performed by removing duplicates, handling missing values, standardizing categorical attributes, and transforming selected features using One-Hot Encoding. Three clustering algorithms were evaluated, namely K-Means, Agglomerative Clustering, and DBSCAN. The clustering performance was assessed using Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Score, while Principal Component Analysis was applied for two-dimensional visualization. The results showed that K-Means with 10 clusters achieved a Silhouette Score of 0.7725 and a Davies-Bouldin Index of 0.8186. Agglomerative Clustering obtained the highest Silhouette Score of 0.7754, while DBSCAN produced fewer clusters with lower overall performance. Several biologically meaningful groups were identified, including clusters representing Flaviviridae, Coronaviridae, Herpesviridae, Poxviridae, and enveloped RNA viruses. However, a large proportion of records contained unknown values, which influenced the formation of a dominant incomplete-data cluster. These findings indicate that taxonomic and genomic features can support machine learning-based virus grouping, although data completeness remains a critical factor. 
This study provides an initial computational framework for AI-driven viral data exploration and may serve as a foundation for future viral risk stratification using enriched epidemiological and clinical features.
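The evaluation described in this abstract, One-Hot Encoding of categorical attributes followed by a Silhouette Score, can be sketched in plain Python. The records, field names, and cluster labels below are hypothetical toy values, Euclidean distance is an assumed choice, and the code is not the authors' implementation.

```python
import math


def one_hot(records, fields):
    """Encode categorical fields as 0/1 vectors, one column per (field, value)."""
    vocabulary = sorted({(f, r[f]) for r in records for f in fields})
    return [[1.0 if r[f] == v else 0.0 for f, v in vocabulary] for r in records]


def silhouette_score(points, labels):
    """Mean silhouette; samples in singleton clusters score 0."""
    scores = []
    for i, point in enumerate(points):
        distances = {}  # cluster label -> distances from point i
        for j, other in enumerate(points):
            if i != j:
                distances.setdefault(labels[j], []).append(math.dist(point, other))
        own = distances.pop(labels[i], None)
        if not own or not distances:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in distances.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy records and cluster assignment, for illustration only.
toy_records = [
    {"genome_type": "RNA", "envelope": "yes"},
    {"genome_type": "RNA", "envelope": "yes"},
    {"genome_type": "DNA", "envelope": "no"},
    {"genome_type": "DNA", "envelope": "no"},
]
toy_points = one_hot(toy_records, ["genome_type", "envelope"])
toy_score = silhouette_score(toy_points, [0, 0, 1, 1])
```

Identical one-hot rows sit at distance zero from each other, so a cluster made of records that share the same values, including a shared "unknown" placeholder, contributes silhouette values near 1. That is one mechanism by which the dominant incomplete-data cluster noted in the abstract could raise the overall score.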