Comparison of Latent Semantic Analysis (LSA) and Doc2Vec Algorithms of Thesis Similarity Detection
Rita Wahyuni Arifin; Mardi Yudhi Putra; Dwi Ismiyana Putri
PIKSEL : Penelitian Ilmu Komputer Sistem Embedded and Logic Vol. 12 No. 2 (2024): September 2024
Publisher : LPPM Universitas Islam 45 Bekasi

DOI: 10.33558/piksel.v12i2.9954

Abstract

This study aims to develop a system for detecting similarities in thesis titles and content to prevent plagiarism and support student originality. The high level of similarity in final projects is a significant concern in academic environments. Two text vectorization methods, Latent Semantic Analysis (LSA) and Doc2Vec, were compared to measure document similarity. Results showed that LSA achieved a very high cosine similarity (99.94%) due to dimensionality reduction that preserved semantic correlations. In contrast, Doc2Vec produced lower similarity scores, with 7.17% for PV-DM and 39.07% for PV-DBOW, indicating richer text representations. This study adopted the CRISP-DM model, which includes Business Understanding, Data Understanding, Data Preparation, Modelling, and Evaluation. The model is expected to strengthen academic integrity and encourage valuable scientific contributions.
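
The cosine similarity score reported in the abstract can be illustrated with a minimal sketch. It assumes plain bag-of-words count vectors as a stand-in for the LSA and Doc2Vec vectors used in the study, which would replace `bag_of_words` in practice. The function names and the two example titles are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token.

    Assumption: this simple count vector stands in for an LSA or
    Doc2Vec embedding, which the study used instead.
    """
    return Counter(text.lower().split())


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    shared = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[t] * vec_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical thesis titles, for illustration only.
title_a = "thesis similarity detection using latent semantic analysis"
title_b = "thesis similarity detection using doc2vec"
score = cosine_similarity(bag_of_words(title_a), bag_of_words(title_b))
print(f"{score:.2%}")
```

A score near 100% means the two vectors point in almost the same direction. How much of that reflects true overlap depends on the vectorization method, which is why the abstract reports such different values for LSA and the two Doc2Vec variants.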

Enhancing Transformer Performance through Contextual Labeling: A Case Study on Student Mental Health Prediction
Mardi Yudhi Putra; Dwi Ismiyana Putri; Rika Apriani; Renaldi Triharsono; Dewi Mufadilah
PIKSEL : Penelitian Ilmu Komputer Sistem Embedded and Logic Vol. 14 No. 1 (2026): March 2026
Publisher : LPPM Universitas Islam 45 Bekasi

DOI: 10.33558/piksel.v14i1.11800

Abstract

Early identification of stress and depression among university students is essential to support timely psychological intervention, yet traditional counseling methods often rely on manual, self-initiated reporting that may overlook students experiencing emotional distress. This study aimed to develop a text-based mental-health detection framework using transformer models supported by contextual labeling to analyze student-generated social-media content. The research was conducted through three stages: problem exploration with the Student Affairs Division, data collection from questionnaires and 993 social-media text entries, and comprehensive data preprocessing involving cleaning, normalization, deduplication, and lexicon-based weak labeling. The cleaned dataset was used to fine-tune two transformer architectures (RoBERTa for sequence classification and T5 for text-to-text classification) and to construct a majority-vote ensemble. Model performance was evaluated using accuracy, precision, recall, F1-score, and confusion matrices. The results showed that the T5 model achieved the most balanced performance across all categories, particularly in distinguishing neutral and stress expressions, while RoBERTa and the ensemble exhibited strong prediction bias toward a single class. The findings demonstrated that contextual preprocessing combined with transformer-based modeling effectively supported automated detection of student emotional states. This study concluded that transformer models, especially T5 with contextual labeling, offered a promising foundation for developing early-warning systems that can be integrated into university counseling services and further enhanced through expanded datasets, expert-validated annotations, and explainable-AI components.
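
The majority-vote ensemble mentioned in the abstract can be sketched as follows. The abstract does not say how many voters took part or how ties were resolved, so the tie-breaking rule here (the earliest prediction among the tied labels wins) is an assumption, and the function name and label strings are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models for one text.

    Assumption: ties go to the tied label that appears first in
    `predictions`, so the order of models acts as a priority.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical per-model outputs for a single social-media post.
votes = ["stress", "neutral", "stress"]
print(majority_vote(votes))
```

With only two voters, every disagreement is a tie, so the ensemble simply follows whichever model is listed first. If one voter is biased toward a single class, a rule like this can pass that bias on to the ensemble.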