Journal : Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)

BERT Model Fine-tuned for Scientific Document Classification and Recommendation
Antariksa, Muhammad Deagama Surya; Sugiharto, Aris; Surarso, Bayu
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 9 No 4 (2025): August 2025
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v9i4.6789

Abstract

The growing volume of academic documents calls for efficient and accurate classification and recommendation systems that help readers retrieve relevant information. The system presented here is built on the "bert-base-uncased" model from Hugging Face, fine-tuned to improve classification accuracy and the relevance of document recommendations. The dataset consists of 2,000 academic documents in computer science; the title, abstract, and keywords of each document are combined into a single input for the model. Document similarity is measured with cosine similarity, so recommendations are based on semantic proximity. Unlike traditional approaches, which rely primarily on word frequency or surface-level matching, the proposed method uses BERT's contextual embeddings to capture deeper semantic meaning and relationships between documents, which allows for more accurate classification and more context-aware recommendations. The best model configuration (learning rate 3e-5, batch size 32, AdamW optimizer) achieved 89.5% training accuracy and an F1-score of 0.8947, while testing yielded 91% accuracy and a 90% F1-score. The recommendation system consistently produced Precision@k values above 92% for k between 5 and 30, with Recall@k reaching 1.0 as k increased. These results indicate that the system classifies complex academic texts reliably and recommends contextually relevant documents. The integrated approach shows strong potential for improving academic document retrieval and supports the development of semantically aware information management systems.
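
The recommendation and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_similarity`, `recommend`, `precision_at_k`, `recall_at_k`) are hypothetical, the three-dimensional vectors are toy stand-ins for the embeddings a fine-tuned BERT model would produce, and the abstract does not say how relevance was defined, so the relevant set here is simply assumed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(query_id, embeddings, k):
    """Return the ids of the k documents most similar to the query document."""
    query = embeddings[query_id]
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in embeddings.items()
        if doc_id != query_id  # never recommend the query itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant (k > 0)."""
    hits = sum(1 for doc_id in recommended[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant documents found in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in recommended[:k] if doc_id in relevant)
    return hits / len(relevant)


# Toy embeddings (assumption): real ones would come from the fine-tuned model.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.8, 0.2, 0.1],
    "doc_d": [0.0, 1.0, 0.0],
    "doc_e": [0.0, 0.1, 1.0],
}
# Assumed ground truth: documents sharing the query's topic.
relevant = {"doc_b", "doc_c"}

ranking = recommend("doc_a", embeddings, k=4)
for k in (1, 2, 4):
    p = precision_at_k(ranking, relevant, k)
    r = recall_at_k(ranking, relevant, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

In this toy setup, recall can only rise or stay flat as k grows, which matches the reported trend of Recall@k approaching 1.0; precision depends on how many of the top-k results are relevant and may fall once the relevant documents are exhausted.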