Articles

Found 14 Documents
Academic expert finding using BERT pre-trained language model
Ilma Alpha Mannix; Evi Yulianti
International Journal of Advances in Intelligent Informatics Vol 10, No 2 (2024): May 2024
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/ijain.v10i2.1497

Abstract

Academic expert finding has numerous advantages, such as finding paper reviewers, supporting research collaboration, and enhancing knowledge transfer. For research collaboration in particular, researchers tend to seek collaborators who share similar backgrounds or the same native language. Despite its importance, academic expert finding remains relatively unexplored in the context of the Indonesian language. Recent studies have primarily relied on static word embedding techniques such as Word2Vec to match documents with relevant expertise areas. However, Word2Vec cannot capture the varying meanings of words in different contexts. To address this research gap, this study employs Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art contextual embedding model, and examines its effectiveness on the task of academic expert finding. The proposed model consists of three BERT variants, namely IndoBERT (Indonesian BERT), mBERT (Multilingual BERT), and SciBERT (Scientific BERT), which are compared against a static embedding baseline using Word2Vec. Two approaches were employed to rank experts with the BERT variants: feature-based and fine-tuning. We found that the IndoBERT model outperforms the baseline by 6–9% with the feature-based approach and by 10–18% with the fine-tuning approach. Our results also show that the fine-tuning approach performs better than the feature-based approach, with an improvement of 1–5%. In conclusion, by using IndoBERT, this research demonstrates improved effectiveness in academic expert finding in the context of the Indonesian language.
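The feature-based approach described in the abstract can be sketched as follows: each of an expert's documents is embedded once (in the paper, with frozen BERT representations; here, toy vectors stand in for real embeddings), and experts are ranked by the similarity of their best-matching document to the query. This is a minimal illustration of that ranking step, not the paper's actual implementation; the aggregation by maximum similarity is an assumption for the sketch.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_experts(query_vec, expert_docs):
    """Score each expert by the maximum similarity of any of their
    document embeddings to the query embedding, then sort descending."""
    scores = {
        expert: max(cosine(query_vec, d) for d in docs)
        for expert, docs in expert_docs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy vectors in place of real BERT embeddings:
ranking = rank_experts(
    [1.0, 0.0],
    {"A": [[1.0, 0.0], [0.0, 1.0]], "B": [[0.0, 1.0]]},
)
```

In the fine-tuning approach, by contrast, the BERT weights themselves are updated on the relevance task rather than used as fixed feature extractors.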
Analisis Tren Penjualan dan Prediksi Produk CV. Sentosa Menggunakan Regresi Linier [Sales Trend Analysis and Product Prediction for CV. Sentosa Using Linear Regression]
Dona Marcelina; Indah Pratiwi Putri; Evi Yulianti; Agustina Heryati
JSAI (Journal Scientific and Applied Informatics) Vol 8 No 1 (2025): January
Publisher : Fakultas Teknik Universitas Muhammadiyah Bengkulu

DOI: 10.36085/jsai.v8i1.7649

Abstract

This study analyzed sales trends and forecasted the sales of CV Sentosa's products, namely Ater 360 New (X1), Bon Bon (X2), Mini Peanut Crackers (X3), and Marie Susu Int (X4), during the period from January 2019 to August 2023. Monthly sales data were processed using exploratory data analysis (EDA) and linear regression to predict sales trends. The linear regression analysis indicated that X2 and X3 experienced sales growth with a slope of m=0.01, representing an average increase of 0.01 units per month. Conversely, X4 showed a slight decline with m=−0.01, while X1 remained stable with m=−0.00, indicating minimal changes in sales volume. The accuracy evaluation of the predictions based on MAE, MSE, and RMSE showed that X2 had the best performance, with MAE 0.14, MSE 0.03, and RMSE 0.19, followed by X1 and X3, which had similar prediction errors. Although X4 initially showed significant growth, its model exhibited higher prediction errors (MAE 0.17, MSE 0.04, RMSE 0.21). This study highlights X2 and X3 as promising products due to their consistent growth trends and accurate predictions, and provides a strong foundation for CV Sentosa in formulating more effective marketing strategies and product development in the future.
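The core of the method above is an ordinary least-squares fit of monthly sales against time, from which the slope m and the MAE/MSE/RMSE error metrics are read off. A minimal stdlib sketch of that pipeline (the paper's preprocessing and actual data are not reproduced here):

```python
def linear_fit(y):
    """Ordinary least-squares fit y = m*x + b over x = 0..n-1,
    where x is the month index and y the monthly sales."""
    n = len(y)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(y) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (yi - my) for x, yi in zip(xs, y))
    m = sxy / sxx
    b = my - m * mx
    return m, b

def errors(y, m, b):
    """MAE, MSE and RMSE of the fitted line against the observations."""
    preds = [m * x + b for x in range(len(y))]
    mae = sum(abs(p - a) for p, a in zip(preds, y)) / len(y)
    mse = sum((p - a) ** 2 for p, a in zip(preds, y)) / len(y)
    return mae, mse, mse ** 0.5
```

A product growing by 0.01 units per month, as reported for X2 and X3, would yield m ≈ 0.01 from `linear_fit`.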
From Text to Truth: Leveraging IndoBERT and Machine Learning Models for Hoax Detection in Indonesian News
Muhammad Yusuf Ridho; Evi Yulianti
Jurnal Ilmiah Teknik Elektro Komputer dan Informatika Vol. 10 No. 3 (2024): September
Publisher : Universitas Ahmad Dahlan

DOI: 10.26555/jiteki.v10i3.29450

Abstract

In the era of online technology and information exchange, deceptive content poses a serious threat to public trust and social harmony on a global scale. Detection mechanisms that can identify such content are essential for safeguarding the public effectively. This study is dedicated to creating a machine learning system that automatically spots deceptive content in the Indonesian language by utilizing IndoBERT, a model specifically tailored to the intricacies of Indonesian. IndoBERT was selected for its capacity to grasp the linguistic nuances of Indonesian text, which are often challenging for other models built on the BERT framework. The key focus of this study is an assessment of the IndoBERT model against approaches used in past research on fake news detection, such as CNN and LSTM, and various classification models such as Logistic Regression and Naïve Bayes, among others. To address the imbalance between hoax and valid labels in fake news detection, we employed the SMOTE oversampling technique for data augmentation and balancing. The dataset consists of publicly available Indonesian-language news articles categorized as either hoax or valid through a three-judge voting system. IndoBERT Large achieved an accuracy of 98% on the oversampled dataset, outperforming the 92% obtained on the original dataset. The SMOTE oversampling technique helped balance the data and enhanced the model's performance. These outcomes highlight IndoBERT's capabilities in detecting fake news and pave the way for its potential integration into real-world scenarios.
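The SMOTE technique mentioned above balances the classes by synthesizing new minority-class samples as interpolations between a real sample and one of its nearest neighbours. In practice one would use an existing implementation such as `imblearn.over_sampling.SMOTE`; the sketch below reimplements only the core interpolation idea, with `k` and the neighbour selection simplified for illustration.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples: pick a base sample,
    pick one of its k nearest neighbours (Euclidean), and interpolate
    a random fraction of the way between them."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        nb = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([x + lam * (y - x) for x, y in zip(base, nb)])
    return synthetic
```

Because every synthetic point lies on a segment between two real minority samples, the augmented data stays inside the minority class's region of feature space rather than duplicating existing rows.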
Expertise Retrieval Using Adjusted TF-IDF and Keyword Mapping to ACM Classification Terms
Aini, Lyla Ruslana; Evi Yulianti
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 9 No 3 (2025): June 2025
Publisher : Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v9i3.6397

Abstract

In an era of collaboration, knowing someone's expertise is increasingly necessary, yet recognizing individuals' proficiency can be challenging because it requires considerable manual effort. This study explores the expertise of lecturers from the Faculty of Computer Science, Universitas Indonesia (Fasilkom UI), based on their scientific publications. The data were obtained by scraping the Sinta journal website, which aggregates the Scopus, Garuda, and Google Scholar data sources. The approach used was keyword extraction with an adjusted TF-IDF. The resulting keywords were then mapped to ACM classification classes using cosine similarity computed with various embedding models, including BERT, multilingual BERT, FastText, XLM-RoBERTa, and SBERT. The experimental results show that combining the adjusted TF-IDF with mapping to the ACM classes using SBERT is a promising approach for identifying expertise. Using abstract data proved better than using full-text data: the title-abstract-EN data achieved a score of 0.49 on both the P@1 and NDCG@1 metrics, whereas the title-abstract-ENID data attained 0.75 on both metrics.
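The pipeline above has two steps: weight a lecturer's publication terms with TF-IDF, then assign the weighted terms to the most similar ACM class by cosine similarity. The abstract does not spell out the paper's "adjusted" TF-IDF formula, so this sketch uses the textbook tf·idf, and the class matching is done over sparse term vectors rather than the SBERT embeddings the paper found best; both substitutions are assumptions for illustration.

```python
from collections import Counter
from math import log, sqrt

def tfidf(docs):
    """Textbook tf·idf over tokenized documents:
    weight(t, d) = (count(t, d) / |d|) * log(N / df(t))."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def map_to_class(doc_vec, class_vecs):
    """Assign the document to the class whose term vector is most
    similar under cosine similarity."""
    return max(class_vecs, key=lambda c: cosine(doc_vec, class_vecs[c]))
```

With dense SBERT embeddings instead of sparse term vectors, `cosine` and `map_to_class` keep the same shape; only the representation of documents and class labels changes.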