Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

This author has published in the following journal:

IAES International Journal of Artificial Intelligence (IJ-AI)

Bhuyar, Vrushali

Unknown Affiliation

Author-ID : 8697842

Computer Science & IT Engineering

Published: 1 document

Articles

Enhancing plagiarism detection using data pre-processing and machine learning approach
Authors: Bhuyar, Vrushali; N. Deshmukh, Sachin
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 3: June 2025
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i3.pp1940-1950

Modern technology and the internet have made academic information more accessible, but this has led to rising global concern about plagiarism. Researchers are actively exploring machine learning as a promising solution for detection. This study underscores the importance of robust data preprocessing for optimal machine learning algorithm performance. Using a dataset of 67 research papers, the Big Five personality factors (OCEAN), and plagiarism rates, the study employed machine learning to detect plagiarism. The algorithms were trained on an 80% training subset and then evaluated on the remaining 20% in the testing phase to assess their generalization capabilities. For the random forest regressor, bagging regressor, gradient boosting regressor, XGB regressor, and AdaBoost regressor, the corresponding root mean squared error (RMSE) values are 9.48, 10.66, 11.79, 12.53, and 12.79, respectively. This research contributes novel insights to existing literature by introducing a plagiarism detection model that integrates outlier detection, normalization, missing value imputation, and feature selection. The unique aspect lies in the effective combination of feature selection and missing value imputation, surpassing previous benchmarks and optimizing precision and efficiency. The approach is metaphorically likened to assembling puzzle pieces, highlighting the distinctive methodology employed in enhancing the performance of the plagiarism detection model using data preprocessing.
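The evaluation protocol in the abstract (preprocess, hold out 20% of the papers, report RMSE) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the data are synthetic, mean imputation and min-max scaling stand in for the unspecified imputation and normalization steps, and a predict-the-training-mean baseline stands in for the regressors named in the abstract. All function names are hypothetical.

```python
import math
import random


def impute_mean(values):
    """Replace None entries with the mean of the observed entries (assumed strategy)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the [0, 1] range (assumed strategy)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic stand-in data: 67 "papers", one trait score with gaps, one plagiarism rate.
rng = random.Random(42)
n_papers = 67
raw_feature = [None if rng.random() < 0.1 else rng.uniform(1, 5) for _ in range(n_papers)]
target = [rng.uniform(0, 40) for _ in range(n_papers)]

# Preprocess the feature, then split 80/20.
feature = min_max_normalize(impute_mean(raw_feature))
rows = list(zip(feature, target))
train, test = train_test_split(rows)

# Baseline "model": always predict the mean plagiarism rate seen in training.
baseline = sum(y for _, y in train) / len(train)
score = rmse([y for _, y in test], [baseline] * len(test))
print(f"train={len(train)} test={len(test)} baseline RMSE={score:.2f}")
```

With 67 rows, a 20% hold-out leaves 54 papers for training and 13 for testing, so each reported RMSE rests on a small test set. Any of the regressors listed in the abstract could replace the baseline here, with the split and metric unchanged.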

Co-Authors: N. Deshmukh, Sachin
