Suprapto Suprapto
Universitas Gadjah Mada

Published : 2 Documents


Hepatitis Diagnosis Using Case-Based Reasoning with Gradient Descent as Feature Weighting Method
Yufika Sari Bagi; Suprapto Suprapto
Journal of Information Systems Engineering and Business Intelligence Vol. 4 No. 1 (2018): April
Publisher : Universitas Airlangga

DOI: 10.20473/jisebi.4.1.25-31

Abstract

Retrieval is the stage of a case-based reasoning system that finds a solution to a new problem, or case, by measuring the similarity between the new case and the old cases in the case base. Some similarity measures use feature weights, which indicate the importance of each feature in a case. Feature weights can be obtained from a domain expert or computed by a feature weighting method, either locally or globally. Gradient descent is a feature weighting method that computes a global weight for each feature. This research implemented gradient descent to obtain feature weights in case-based reasoning for hepatitis diagnosis, with similarity measured by weighted Euclidean distance. Four variations of the split between case base and test data were used: 50% of the data as case base and 50% as test data in the first variation, 60% and 40% in the second, 70% and 30% in the third, and 80% and 20% in the fourth. Each variation was run under four scenarios for selecting the test data: in the first scenario the test data were taken from the end of the data, in the second from the beginning, in the third half from the beginning and half from the end, and in the fourth from the middle. The results showed that the accuracy of the system reached 100% in scenario 1 of variation 4. Across all four variations and four scenarios, the average accuracy of the system was 77.55%, the average recall was 69.74%, and the average precision was 78.39%. Accuracy was also influenced by the size of the case base and by the scenario used to select cases for it, because the more cases the case base holds, the greater the chance that the system finds a similar case.

IndoBERTSkill: pretrained domain-specific language model for recognition Indonesian skill
Meilany Nonsi Tentua; Suprapto Suprapto; Afiahayati Afiahayati
International Journal of Advances in Intelligent Informatics Vol 12, No 2 (2026): May 2026
Publisher : Universitas Ahmad Dahlan


Abstract

Pre-trained language models for Indonesian are already available for natural language processing tasks. However, these models were trained on Indonesian text whose structure differs from that of job descriptions. As a result, their effectiveness for skill recognition is limited. IndoBERTSkill is a novel pre-trained domain-specific language model for recognizing skills in Indonesian-language text. It is built on the Bidirectional Encoder Representations from Transformers (BERT) architecture. IndoBERTSkill was trained on an extensive collection of texts from the Indonesian Wikipedia, the English Wikipedia, and Indonesian job descriptions from a job portal. Its performance was evaluated through two main approaches: (1) language modeling via Masked Language Model (MLM) prediction, and (2) fine-tuning on a custom annotated dataset (NERSkill) for Named Entity Recognition (NER) tasks. The fine-tuning process involved training a classification layer on top of the IndoBERTSkill model, using BIO tagging to identify hard skill, soft skill, and technology entities. The skill recognition model derived from IndoBERTSkill achieves the highest F1-score among the pre-trained language models compared, at 87%, demonstrating robustness and strong generalizability for skill entity recognition in Indonesian job descriptions. IndoBERTSkill provides a valuable resource for developing Indonesian natural language processing applications that require skill recognition. This could increase the accuracy and efficiency of skill recognition across various domains, including job matching, education, and training.
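
To illustrate the BIO tagging scheme mentioned in this abstract, the sketch below groups token-level tags into entity spans. It is only an illustration under stated assumptions: the label names `TECH` and `SOFT`, the example sentence, and the `bio_to_spans` helper are hypothetical and are not taken from the NERSkill dataset or the IndoBERTSkill code.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans.
    B-X starts an entity of type X, I-X continues it, O is outside any entity."""
    spans, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_type:
            current_tokens.append(token)          # continue the open entity
            continue
        if current_type is not None:              # close the open entity
            spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        if prefix in ("B", "I"):                  # a stray I- tag starts a new entity
            current_type, current_tokens = label, [token]
    if current_type is not None:                  # flush an entity at the end
        spans.append((current_type, " ".join(current_tokens)))
    return spans

# Hypothetical job-description fragment with hypothetical labels.
tokens = ["Menguasai", "Python", "dan", "komunikasi", "tim", "yang", "baik"]
tags = ["O", "B-TECH", "O", "B-SOFT", "I-SOFT", "O", "O"]

print(bio_to_spans(tokens, tags))
# [('TECH', 'Python'), ('SOFT', 'komunikasi tim')]
```

A token classification layer of the kind described would predict one such tag per token; the decoding step shown here is what turns those predictions into skill entities.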