Articles

SciBERT Optimisation for Named Entity Recognition on NCBI Disease Corpus with Hyperparameter Tuning Salam, Abu; Sidiq, Syaiful Rizal
Journal of Applied Informatics and Computing Vol. 9 No. 2 (2025): April 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i2.9283

Abstract

Named Entity Recognition (NER) in the biomedical domain faces complex challenges due to the variety of medical terms and their context of use. Transformer-based models, such as SciBERT, have proven to be effective in natural language processing (NLP) tasks in scientific domains. However, the performance of these models is highly dependent on proper hyperparameter selection. Therefore, the aim of this study is to analyse the impact of hyperparameter tuning on the performance of SciBERT in NER tasks on the NCBI Disease Corpus dataset. The methods used in this study include training the baseline SciBERT model without tuning, followed by hyperparameter optimisation using grid search, random search, and Bayesian optimisation. The models are evaluated with precision, recall, and F1-score. The experimental results show that, of the three methods, grid search and random search produced the best performance, with a precision, recall, and F1-score of 0.82, an improvement over the baseline, which achieved a precision and recall of 0.72 and an F1-score of 0.68. This study confirms that proper hyperparameter tuning can improve model accuracy and efficiency in medical entity extraction tasks. These results contribute to the development of optimisation methods in biomedical text processing, particularly in improving the effectiveness of the SciBERT Transformer model for NER.
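
The search strategies named in the abstract above differ mainly in how they pick candidate configurations. As a rough illustration, the sketch below contrasts exhaustive grid search with random search over a small space. The search space, the `toy_validation_f1` objective, and every value in them are assumptions made for illustration; none of them come from the study, and a real run would replace the objective with fine-tuning SciBERT and measuring validation F1.

```python
import itertools
import random

# Hypothetical search space; the abstract does not list the values the study used.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 5e-5],
    "batch_size": [8, 16, 32],
    "epochs": [2, 3, 4],
}


def toy_validation_f1(config):
    """Synthetic stand-in for 'fine-tune the model and return validation F1'.

    Purely illustrative: the score peaks at lr=3e-5, batch_size=16, epochs=3.
    """
    penalty = (
        abs(config["learning_rate"] - 3e-5) * 1e4
        + abs(config["batch_size"] - 16) / 100
        + abs(config["epochs"] - 3) / 20
    )
    return 1.0 - penalty


def grid_search(space, objective):
    """Evaluate every combination in the space and keep the best one."""
    keys = list(space)
    best_config, best_score = None, float("-inf")
    for values in itertools.product(*(space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def random_search(space, objective, n_trials, seed=0):
    """Evaluate n_trials configurations sampled uniformly from the space."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {k: rng.choice(v) for k, v in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

Grid search evaluates all 36 combinations of this space, while random search evaluates only the `n_trials` it samples, so it may miss the best configuration when the budget is small. Bayesian optimisation, not shown here, would instead choose each new trial based on the scores of earlier ones.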
Pendampingan ibu - ibu PKK tentang Deteksi Kanker Serviks Melalui Software Aplikasi [Assisting PKK Mothers with Cervical Cancer Detection through a Software Application] Hidayat, Erwin Yudi; Astuti, Yani Parti; Salam, Abu; Nugraha, Adhitya; Paramita, Cinantya; Octaviani, Dhita Aulia
Community : Jurnal Pengabdian Pada Masyarakat Vol. 4 No. 1 (2024): Maret : Jurnal Pengabdian Pada Masyarakat
Publisher : LPPM Sekolah Tinggi Ilmu Ekonomi - Studi Ekonomi Modern

DOI: 10.51903/community.v4i1.495

Abstract

Cervical cancer, which occurs in the cervix, is one of the cancers that mothers fear. Because of its hidden location, women may be unable to tell early on whether they have it. Meanwhile, the disease ranks among the top three causes of death in Indonesia. Women use various means to find out early whether they are affected: some search for symptoms on social media, some take herbal remedies as a preventive measure, and many take other actions. Artificial intelligence technology can help address these issues. The community service team is therefore developing an application that can detect cervical cancer, so that women can find out early whether they have cervical cancer or are at risk of developing it. Although the application is not 100% accurate, it can at least make women aware of a possible detection so that they can promptly seek treatment or prevention. The application is expected to benefit the PKK mothers of Perum Kandri Persona Asri RT 04 RW 04, who are the participants in this community service activity.
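Pendampingan Pola Hidup Bersih dan Sehat (PHBS) pada Siswa MI Miftahul Hidayah dengan Sosialisasi Aplikasi Digital [Assistance with Clean and Healthy Living Behavior (PHBS) for Students of MI Miftahul Hidayah through the Introduction of a Digital Application] Rakasiwi, Sindhu; Salam, Abu; Subhiyakto, Egia Rosi; Dewi, Ika Novita; Octaviani, Dhita Aulia; Zeniarja, Junta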
Community : Jurnal Pengabdian Pada Masyarakat Vol. 4 No. 1 (2024): Maret : Jurnal Pengabdian Pada Masyarakat
Publisher : LPPM Sekolah Tinggi Ilmu Ekonomi - Studi Ekonomi Modern

DOI: 10.51903/community.v4i1.496

Abstract

The Clean and Healthy Living Behavior (Perilaku Hidup Bersih dan Sehat, PHBS) program is essential for encouraging healthy lifestyles that protect, maintain, and improve health. Many diseases can be avoided when people adopt a healthy lifestyle. PHBS is well suited to school-age children, who, owing to several factors, are among the groups at risk of health problems. Technology in education has been shown to change classroom interaction and learning significantly, making it more efficient and more accessible, and helping build the skills needed in today's digital era and in the future. Digital applications, as one product of technology, are widely used in both health and education, two fields that are interrelated and complement each other. Communicating health issues requires education as its channel and, conversely, education cannot run smoothly in an unhealthy environment. Technology therefore plays a very important role in both fields. For these reasons, students, especially those in primary schools and equivalent institutions, need to be taught about PHBS. Beyond receiving this knowledge, students also need assistance while practicing the PHBS material, and technology should be incorporated in the form of a digital application so that learning is more enjoyable and effective; before that, teachers need to be introduced to the application and trained in its use.
On these grounds, the team took the initiative to hold a community service activity on the theme of Assistance with Clean and Healthy Living Behavior (PHBS) for Students through the Introduction of a Digital Application, at the designated location, MI Miftahul Hidayah, so that PHBS becomes a daily habit for students and they pass these good habits on to those around them.
Peningkatan Kesadaran Kanker Usus pada Siswa SMP Ibu Kartini melalui Aplikasi Mobile [Raising Bowel Cancer Awareness among Students of SMP Ibu Kartini through a Mobile Application] Dewi, Ika Novita; Utomo, Danang Wahyu; Salam, Abu; Luthfiarta, Ardytha; Octaviani, Dhita Aulia; Dzaki, Azmi Abiyyu; Haresta, Alif Agsakli
ABDIMASKU : JURNAL PENGABDIAN MASYARAKAT Vol 8, No 2 (2025): MEI 2025
Publisher : LPPM UNIVERSITAS DIAN NUSWANTORO

DOI: 10.62411/ja.v8i2.2987

Abstract

Bowel cancer is one of the diseases that can be prevented through good health awareness and early detection. However, the lack of health education among adolescents is a challenge for prevention efforts. This community partnership program (program kemitraan masyarakat, PKM) aims to educate students of SMP Ibu Kartini Semarang and improve their understanding of clean and healthy living behavior (PHBS), risk factors, and early detection of bowel cancer. The program also introduces the Oncodoc mobile application as a tool for independent early cancer detection. Activities include health education sessions, a demonstration of the Oncodoc mobile application, and an evaluation of participants' understanding through discussion and question-and-answer sessions. The evaluation shows that, after taking part in the program, students' understanding of bowel cancer risk factors, the importance of a healthy lifestyle, and the benefits of early detection improved significantly. Students also showed interest in using technology as an aid to maintaining their health. These findings indicate that technology-based education can be an effective method for raising health awareness among adolescents. Similar programs are therefore recommended for other schools, with additional follow-up sessions to ensure the application is used optimally in support of health education.
Optimization of Biobert Model for Medical Entity Recognition Through Bilstm and CNN-Char Integration Salam, Abu; Prinantyo, Gilang Djati
INOVTEK Polbeng - Seri Informatika Vol. 10 No. 2 (2025): July
Publisher : P3M Politeknik Negeri Bengkalis

DOI: 10.35314/bypwas91

Abstract

Biomedical Named Entity Recognition (NER) is essential for extracting structured information from medical texts. However, existing models like BioBERT face challenges when dealing with complex biomedical entities, particularly those with intricate morphological structures. This research enhances the BioBERT model by integrating BiLSTM and character-level CNN (CNN-Char), aiming to improve the recognition of Chemical and Disease entities. The proposed models were trained and evaluated on the BC5CDR dataset sourced from the official BioCreative V CDR Corpus. The modified model achieved an F1-score of 0.8678, an improvement over the standard BioBERT model, which scored 0.8597. This increase is primarily observed in the recognition of complex entity structures, particularly those requiring character-level representation. Despite this improvement, the model is limited to Chemical and Disease entities and may not generalise to other biomedical categories. Future work should focus on expanding the entity types and exploring other model architectures, such as SciBERT or BioALBERT, to further enhance performance.
The Development of a Deployment System Architecture for a Flask-Based Chatbot Using an LSTM NLP Model for Customer Service Question & Answer Mukti, David Ramantya; Salam, Abu
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9305

Abstract

In the past two decades, the rapid growth of e-commerce has significantly transformed global business practices. E-commerce has not only revolutionized the retail industry but also positively impacted businesses and consumer experiences. The ease of online shopping enables users to select products at more competitive prices. Amidst these changes, human-computer interactions have increasingly evolved toward natural conversations through Natural Language Processing (NLP). This study aims to develop a chatbot utilizing Long Short-Term Memory (LSTM) technology as a medium for e-commerce customer service. The dataset used for chatbot development is in JSON format and consists of 580 entries spanning 38 categories or classes. Data processing involves several preprocessing stages, including case folding, lemmatization, tokenization, and padding. The model is developed using a bidirectional LSTM and GRU architecture, followed by regularization techniques to enhance performance. Evaluation results show the model achieves 90% training accuracy and 63% validation accuracy with an F1-score of 62%. While there are indications of overfitting, the observed differences are not statistically significant, indicating the model remains capable of providing reliable responses. Additionally, the model is integrated into a Flask-based web application with an interactive interface to facilitate user access. This study demonstrates that LSTM is effective in addressing vanishing gradient problems.
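
The preprocessing stages listed in the abstract above can be pictured with a minimal pure-Python sketch of case folding, tokenization, and padding. The regular-expression tokenizer, the special token ids, the post-padding direction, and the example sentences are all assumptions for illustration, not details from the study; lemmatization is left out because it normally relies on an external NLP library.

```python
import re

PAD_ID = 0  # id used to fill sequences up to max_len (assumed convention)
UNK_ID = 1  # id for tokens not present in the vocabulary (assumed convention)


def case_fold(text):
    """Lowercase the text so 'ORDER' and 'order' map to the same token."""
    return text.lower()


def tokenize(text):
    """Split already case-folded text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text)


def build_vocab(token_lists):
    """Assign integer ids to tokens in order of first appearance."""
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode_and_pad(tokens, vocab, max_len):
    """Map tokens to ids, truncate to max_len, then pad on the right."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


def preprocess(texts, max_len):
    """Run case folding, tokenization, and padding over a list of texts."""
    token_lists = [tokenize(case_fold(t)) for t in texts]
    vocab = build_vocab(token_lists)
    sequences = [encode_and_pad(toks, vocab, max_len) for toks in token_lists]
    return sequences, vocab
```

Fixed-length integer sequences of this kind are what a recurrent model such as an LSTM or GRU typically consumes through an embedding layer.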
A Comparative Performance of SMOTE, ADASYN and Random Oversampling in Machine Learning Models on Prostate Cancer Dataset Putra, Aditya Herdiansyah; Salam, Abu
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9308

Abstract

Class imbalance in medical datasets, including prostate cancer, can affect the performance of machine learning models in detecting minority cases. This study compares three oversampling techniques - SMOTE, ADASYN, and Random Oversampling - to address data imbalance in prostate cancer classification. These techniques are applied to Random Forest (RF), Decision Tree (DT), and LightGBM (LGBM), which are evaluated using accuracy, precision, recall, F1-score, and ROC-AUC. To improve the reliability of the evaluation, K-Fold Cross Validation was used to reduce the risk of overfitting and ensure stable results. The findings show that oversampling techniques improve model performance compared to the baseline. Random Oversampling gave the best performance for Random Forest, with accuracy 0.85, recall 0.888, precision 0.873, F1-score 0.879, and ROC-AUC 0.838. SMOTE produced the highest Decision Tree performance, with accuracy 0.80, recall 0.838, precision 0.843, F1-score 0.839, and ROC-AUC 0.788. ADASYN provided the most improvement for LightGBM, achieving accuracy 0.89, recall 0.919, precision 0.913, F1-score 0.913, and ROC-AUC 0.879. These results confirm that oversampling improves prostate cancer classification performance when the resampling technique is matched to the model's characteristics.
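
Of the three techniques compared in the abstract above, Random Oversampling is the simplest: it duplicates minority-class rows until every class matches the largest one. The function below is a minimal sketch of that idea only; the function name and data layout are assumptions, and it is not the study's implementation. SMOTE and ADASYN are not shown; they create new synthetic minority samples by interpolating between neighbours rather than copying rows.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class. Original rows are kept in place."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # Candidate rows to copy from: the original members of this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels
```

When oversampling is combined with K-Fold Cross Validation, it is usually applied to the training portion of each fold only, so that copies of a row never appear in both the training and validation splits.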
Comparison of Data Normalization Techniques on KNN Classification Performance for Pima Indians Diabetes Dataset Dimas Pratama, Yohanes; Salam, Abu
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9353

Abstract

This study compares data normalization techniques for the K-Nearest Neighbors (KNN) model in diabetes classification using the Pima Indians Diabetes dataset. The three normalization techniques evaluated are Min-Max Scaling, Z-Score Scaling, and Decimal Scaling. Preprocessing included handling missing values and removing duplicates, followed by feature selection using the Random Forest method, which removed SkinThickness, Insulin, Pregnancies, and BloodPressure. The evaluation was carried out using accuracy, precision, recall, F1-Score, specificity, and ROC AUC metrics. The results show that Min-Max Scaling provides a significant improvement in all metrics, with the highest accuracy of 0.8117 and ROC AUC of 0.8050. Z-Score Scaling provides good results, but not as good as Min-Max Scaling. Decimal Scaling shows the lowest performance. Statistical tests using Paired T-Test show significant differences between Min-Max Scaling and without normalization on all metrics (P-Value <0.05), while Z-Score Scaling and Decimal Scaling are only significant on some metrics, with P-Values of 0.08363 and 0.43839 respectively for accuracy and ROC AUC. Overall, Min-Max Scaling proved to be the best normalization method for improving KNN performance in diabetes classification.
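
For reference, the three normalization techniques compared in the abstract above can be written out for a single feature column as follows. This is a sketch using the standard textbook definitions, not code from the study; the choice of the population standard deviation for Z-Score Scaling and the handling of constant columns are assumptions.

```python
import math


def min_max_scale(values):
    """Rescale to [0, 1]: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center to mean 0 and unit variance: (v - mean) / std (population std)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def decimal_scale(values):
    """Divide by 10**j, with j the smallest integer such that max(|v|) / 10**j < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]
```

Because KNN relies on distances between feature vectors, features with large numeric ranges dominate the distance unless they are rescaled, which is why the choice of normalization can change the results.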
Enhancing Medical Named Entity Recognition with Ensemble Voting of BERT-Based Models on BC5CDR Maulana, Fadhli Faqih; Salam, Abu
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9549

Abstract

The rapid development in biotechnology and medical research has resulted in a large amount of scientific literature containing critical information about various medical entities. However, the primary challenge in managing this data is the vast volume of unstructured text, which requires Natural Language Processing (NLP) techniques for automatic information extraction. One of the main applications in NLP is Named Entity Recognition (NER), which aims to identify important entities in the text, such as disease names, drugs, and proteins. This study aims to enhance the performance of medical NER by applying ensemble voting to three BERT-based models: BioBERT, TinyBERT, and ClinicalBERT. The results show that the ensemble voting technique provides the best performance in medical entity extraction, with improvements in precision (0.9494), recall (0.9483), and F1-score (0.9488) compared to individual models, especially when handling less common medical entities. This approach is expected to contribute to the development of automated systems for analyzing and searching information in medical literature.
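
One common way to combine several NER models is hard (majority) voting on the label of each token. The abstract above does not say which voting scheme was used, so the sketch below is an assumption: it takes one label sequence per model, all aligned to the same word-level tokens, and breaks ties by following a model priority order. The function names and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def vote_token(labels, priority):
    """Return the most frequent label; on a tie, follow the first model in
    `priority` whose label is among the tied labels."""
    counts = Counter(labels)
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}
    for model_index in priority:
        if labels[model_index] in tied:
            return labels[model_index]


def ensemble_vote(predictions, priority=None):
    """Combine per-model label sequences token by token.

    predictions: list of label lists, one per model, all of equal length.
    priority: model indices used for tie-breaking (defaults to list order).
    """
    if priority is None:
        priority = list(range(len(predictions)))
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same tokens")
    return [
        vote_token([p[i] for p in predictions], priority)
        for i in range(length)
    ]
```

Voting token by token can produce label sequences that break the BIO scheme (for example an `I-` tag with no preceding `B-` tag), so a practical pipeline may need a repair step after voting.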
Enhancing Liver Cirrhosis Staging Accuracy using Optuna-Optimized TabNet Arifin, Muhammad Farhan; Dewi, Ika Novita; Salam, Abu; Utomo, Danang Wahyu; Rakasiwi, Sindhu
Journal of Applied Informatics and Computing Vol. 9 No. 5 (2025): October 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i5.11011

Abstract

Liver cirrhosis is a progressive chronic disease whose early detection poses a clinical challenge, making accurate severity staging crucial for patient management. This research proposes and evaluates a TabNet deep learning model, specifically designed for tabular data, to address this challenge. In the initial evaluation, a baseline TabNet model with its default configuration achieved an accuracy of 65.11% on a public clinical dataset. To enhance performance, hyperparameter optimization using Optuna was implemented, which increased the accuracy to 70.37%, with precision, recall, and F1-score each reaching 70%. The model's discriminative ability in multiclass classification was also validated through AUC metric evaluation. In addition to accuracy improvements, the model's interpretability was validated through the identification of key predictive features such as Prothrombin and Hepatomegaly, which align with clinical indicators. This study demonstrates that Optuna-optimized TabNet is an effective and interpretable approach with significant potential for integration into clinical decision support systems to support a more precise diagnosis of liver cirrhosis.