Adni Navastara, Dini
Unknown Affiliation

Published: 2 Documents

Transfer Learning Using LoRA+ on Llama 3.2 for Indonesian-Language Conversation (original title: Transfer Learning Menggunakan LoRA+ pada Llama 3.2 untuk Percakapan Bahasa Indonesia)
Kautsar, Faiz; Wicaksono, Farhan; Hafidz, Abdan; Purwitasari, Diana; Suciati, Nanik; Adni Navastara, Dini; Gurat Adillion, Ilham
Techno.Com Vol. 24 No. 2 (2025): May 2025
Publisher : LPPM Universitas Dian Nuswantoro

DOI: 10.62411/tc.v24i2.12508

Abstract

This study explores the application of Low-Rank Adaptation+ (LoRA+), a Parameter-Efficient Finetuning (PEFT) method, to transfer learning with Llama 3.2 1B, a large language model. As language models grow in size, conventional finetuning for transfer learning becomes increasingly infeasible without large-scale compute. One way to address this is to finetune only some of the model's components, which requires relatively little compute compared with conventional finetuning; methods that follow this principle are known as PEFT. The study tests the effectiveness of one PEFT method, LoRA+, for transferring a large language model to a new domain, the Indonesian language, using BLEU, ROUGE, and Weighted F1 as metrics. The results show that LoRA+ delivers competitive performance and outperforms the baseline in Indonesian-language ability, with a 112% improvement in BLEU score and a 21.7% improvement in ROUGE-L score, and with relatively low standard deviations of 3.72 and 0.00075. Although the Weighted F1 score decreased by 13% because of domain shift, the model shows good cross-lingual transfer ability.
Keywords: Finetuning, Large Language Model, Parameter-Efficient Finetuning, Low-Rank Adaptation, Transfer Learning
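To make the PEFT idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual LoRA formulation, in which a frozen weight matrix W is augmented by a trainable low-rank product B·A, and the LoRA+ convention of giving B a larger learning rate than A. The dimensions, rank, base learning rate, and ratio below are hypothetical values chosen for illustration; none of them come from the paper.

```python
# Sketch of LoRA / LoRA+ bookkeeping in plain Python (no external libraries).
# All numeric values used with these helpers are illustrative assumptions.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x).

    W is the frozen d_out x d_in weight, A is r x d_in, B is d_out x r.
    Only A and B would be trained in a LoRA setup.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out, d_in, rank):
    """Return (full finetuning parameters, LoRA parameters) for one matrix."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


def lora_plus_learning_rates(base_lr, ratio):
    """LoRA+ idea: B is updated with a learning rate `ratio` times that of A."""
    return {"A": base_lr, "B": base_lr * ratio}


if __name__ == "__main__":
    full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=8)
    print(f"trainable fraction for one matrix: {lora / full:.4%}")
    print(lora_plus_learning_rates(base_lr=1e-4, ratio=16))
```

With B initialised to zeros, as is conventional for LoRA, `lora_forward` returns exactly the frozen model's output, so training starts from the pretrained behaviour.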

Multi-label Aspect Dangerous Speech Classification Using Keyword-Driven Ensemble Classifier on Imbalanced Data
Findawati, Yulian; Budi Raharjo, Agus; Adni Navastara, Dini; Yonathan, Vincent; Yatestha, Anak Agung; Purwitasari, Diana
JOIV : International Journal on Informatics Visualization Vol 9, No 4 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.4.3129

Abstract

This study aims to detect aspects of dangerous speech, which can incite violence and increase prejudice against specific communities, on social media, particularly Twitter. The research dataset consists of tweets containing dangerous speech related to the Indonesian government from 2019 to 2022. Researchers manually labeled the data according to seven aspects of dangerous speech, including social and historical context, dehumanization, accusation in a mirror, threats against women/children, questioning in-group loyalty, and threats against groups. Because several of these aspects can appear in a single text, the study uses multi-label classification. The main challenges are data imbalance, ambiguity, and the informal language common in tweets. The study introduces the Keyword-Driven Ensemble Classifier (KDEC), a new ensemble model that combines SVC, Logistic Regression, IndoBERTweet, and a keyword list specific to each label. Researchers designed KDEC around the best-performing machine learning and deep learning methods tested in the study. The research team tested the model on small and large datasets, in both seven-label and four-label classification trials. The results show that KDEC, with label reduction and keyword support, effectively addresses data imbalance, resolves label overlap, and achieves 92% accuracy for seven-label classification and 88% for four-label classification. The findings are relevant to hate speech analysis across platforms and languages, particularly for understanding context and the messages conveyed. The study also offers insights into managing harmful content in online government-related discussions. The method can identify dangerous speech at a larger scale and supports data-driven decision-making on social media content regulation.