Prediction of Cyberbullying in Social Media on Twitter Using Logistic Regression
Authors : Prayudani, Santi; Adha, Lilis Tiara; Ariyani, Tika; Lubis, Arif Ridho
Journal of Applied Informatics and Computing Vol. 9 No. 4 (2025): August 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i4.9842

Abstract

As cases of cyberbullying on social media increase, efficient detection measures are needed. This research applies machine learning to social media text, using logistic regression to identify potentially harmful comments. The main research question is how well the model distinguishes comments that contain features of cyberbullying from those that do not. The data set comprised comments from different social media sites and was preprocessed before analysis. Exploratory Data Analysis was used to identify relationships between textual features and bullying behavior. After training and testing, the model was evaluated using accuracy, precision, recall, and F1 score. The results show that logistic regression can identify cyberbullying with a fairly satisfactory level of accuracy. The study therefore underscores the value of machine learning algorithms in reducing harmful behavior in cyberspace.
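The abstract above evaluates a binary classifier with standard classification metrics. As a minimal sketch of how accuracy, precision, recall, and F1 are typically computed from predictions, the snippet below uses a hypothetical helper `binary_metrics` and made-up labels (1 = cyberbullying, 0 = not); neither comes from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative labels only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when bullying comments are a minority class, because accuracy alone can look high for a model that rarely flags anything.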
A COMPARATIVE STUDY OF PIPELINE-VALIDATED MACHINE LEARNING CLASSIFIERS FOR PERMISSION-BASED ANDROID MALWARE DETECTION
Authors : Lubis, Arif Ridho; Wulandari, Dewi; Adha, Lilis Tiara; Ariyani, Tika; Lase, Yuyun; Lubis, Fahdi Saidi
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 20 No 2 (2026): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol20iss2pp1675-1692

Abstract

The growing prevalence of Android malware distributed through third-party APK sideloading poses a significant security threat to users and developers. This study aims to evaluate the effectiveness of three machine learning algorithms—Logistic Regression (LR), Random Forests (RF), and Gradient Boosting Machine (GBM)—for static Android malware detection based on permission features. The experiment employs the publicly available Android Malware Prediction Dataset (Kaggle, accessed 2025), containing 4,464 application samples with 328 binary permission attributes. A leakage-free CRISP-DM workflow was implemented, integrating data cleaning, automated feature selection via SelectKBest (Mutual Information), and hyperparameter optimisation using GridSearchCV with stratified 5-fold cross-validation. Results on the unseen hold-out test set show that GBM achieved the best performance, with 96.05% accuracy and 0.9924 ROC-AUC, outperforming LR and RF. In addition, GBM exhibited superior probability calibration (Brier Score = 0.0344) and interpretability, as confirmed through SHAP analysis. The ablation study further validated that optimal model performance saturates at 30–40 selected features. This research contributes a reproducible and pipeline-validated comparative framework for static Android malware detection, addressing prior studies’ limitations regarding feature selection bias and data leakage. Nevertheless, the study is limited by its reliance on static permission features and the absence of dynamic behavioural data, which may restrict generalisation to evolving malware families.
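The leakage-free workflow described in this abstract can be sketched roughly as follows with scikit-learn. This is an illustration under stated assumptions, not the authors' code: the data is a small synthetic binary matrix standing in for the permission features, and the parameter grid, sample sizes, and labeling rule are hypothetical. The point it shows is that feature selection sits inside the `Pipeline`, so it is refitted on each training fold and never sees validation or hold-out data.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import accuracy_score, brier_score_loss, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a binary permission matrix (NOT the Kaggle dataset).
rng = np.random.default_rng(0)
n_samples, n_features = 400, 40
X = rng.integers(0, 2, size=(n_samples, n_features))
# Hypothetical rule: the label depends on the first three "permissions" plus noise.
signal = X[:, 0] + X[:, 1] + X[:, 2]
y = ((signal + rng.normal(0, 0.5, n_samples)) >= 2).astype(int)

# Hold out a test set before any feature selection or tuning takes place.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Mutual information scorer for discrete (0/1) features.
score_func = partial(mutual_info_classif, discrete_features=True, random_state=0)

pipe = Pipeline([
    ("select", SelectKBest(score_func=score_func)),
    ("clf", GradientBoostingClassifier(random_state=0)),
])

# Illustrative grid only; the paper's actual search space is not given here.
param_grid = {
    "select__k": [5, 10],
    "clf__n_estimators": [50],
    "clf__max_depth": [2, 3],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="roc_auc")
search.fit(X_train, y_train)

# Evaluate the refitted best pipeline on the untouched hold-out set.
proba = search.predict_proba(X_test)[:, 1]
test_accuracy = accuracy_score(y_test, search.predict(X_test))
test_auc = roc_auc_score(y_test, proba)
test_brier = brier_score_loss(y_test, proba)
print(search.best_params_)
print(f"accuracy={test_accuracy:.3f} roc_auc={test_auc:.3f} brier={test_brier:.3f}")
```

Running `SelectKBest` on the full data set before splitting would let test-set information influence which features are kept, which is the kind of selection bias the abstract says the study was designed to avoid.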