Articles

Found 2 Documents

A Systematic Literature Review on Machine Learning Algorithms for the Detection of Social Media Fake News in Africa
Chukwuere, Joshua Ebere; Montshiwa, Tlhalitshi Volition
Journal of Information System and Informatics Vol 7 No 2 (2025): June
Publisher : Universitas Bina Darma

DOI: 10.51519/journalisi.v7i2.1103

Abstract

Fake news existed long before social media emerged. Social media platforms enable the creation, processing, and sharing of many kinds of content and information on the Internet. Because information and content shared across these platforms are hard for users to authenticate, users who follow fake information or fake content can be harmed, as can society and the wider world. Fake news is an increasingly worrisome issue, especially in Africa, where it is difficult to identify and to stop its distribution. Given the continent's linguistic and cultural diversity, it is difficult for humans to recognise fake news on social media platforms, so high-level technological strategies, such as machine learning (ML), are needed to determine whether content is false. This study therefore sought to identify effective ML classifiers for detecting fake news on social media platforms through a systematic literature review conducted according to the PRISMA standard. The study identified 14 effective ML classifiers for managing fake news on social media platforms, including Random Forest, Naive Bayes, and others. Four research questions guided the study, focusing on the effectiveness of the classifiers, their applicability to detecting different forms of false news, the characteristics of the datasets (size and features), and the metrics used to assess classifier performance. A conceptual framework, the Information Behavioral Driven Social Cognitive Model (IBDSCM), was proposed to support fake news detection on social media platforms. Overall, this study contributes to the understanding of ML algorithms for detecting false news in Africa and provides a conceptual base for future studies.
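
To illustrate the kind of pipeline this review surveys, the following is a minimal Python (scikit-learn) sketch that feeds a TF-IDF text representation into two of the classifiers named in the abstract, Naive Bayes and Random Forest. The corpus file name, the "text" and "label" columns, and all parameter settings are illustrative assumptions, not details taken from the paper.

# Minimal sketch: TF-IDF features + two of the classifiers named in the
# abstract. The CSV path and column names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("social_media_posts.csv")  # hypothetical labelled corpus
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42
)

for name, clf in [("Naive Bayes", MultinomialNB()),
                  ("Random Forest", RandomForestClassifier(n_estimators=200, random_state=42))]:
    model = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    model.fit(X_train, y_train)
    print(name)
    print(classification_report(y_test, model.predict(X_test)))

Such a comparison loop is the simplest way to contrast several of the 14 classifiers the review identifies under identical features and splits.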
Impact of Sample Size on the Robustness of Machine Learning Algorithms for Detecting Loan Defaults Using Imbalanced Data
Kobone, Boitumelo Tryphina; Montshiwa, Tlhalitshi Volition
Journal of Applied Data Sciences Vol 6, No 3: September 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i3.713

Abstract

This study assessed the impact of sample size on the robustness of five machine learning classifiers: Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), Decision Trees (DT), and K-Nearest Neighbour (K-NN). Although data-balancing techniques exist to address class imbalance, they have limitations, which are discussed in this paper. The current study continues the application of these five ML classifiers to credit default detection, but it contributes by examining whether increasing the sample size can improve their performance when they are trained on a different imbalanced loan default dataset that has not been the focus of previous studies, given that most ML algorithms are known to perform well when trained on large datasets. The study used a secondary imbalanced loan default dataset from Kaggle.com in which 85% of participants made loan payments and 15% defaulted. Stratified random sampling, with the dependent variable as the stratum, was used to draw sample sizes starting at 2% of the total observations, followed by 5%, then 10%, and so on up to 90% of the dataset. The study found no consistent change in the classification metrics as the sample size changed, but RF and DT achieved 100% performance regardless of sample size and are therefore recommended as the most robust to data imbalance in loan default detection. The average classification metrics for NB and K-NN ranged from 72% to 92%, while SVM produced the lowest averages, between 69% and 75%. NB, K-NN, and SVM yielded sensitivity rates of 0% to 53% for the loan-payment class, indicating poor prediction of loan payments, but rates of 84% to 86% for the default class, indicating good loan default classification. Future studies should consider other sampling methods, as well as deep and hybrid learning methods compared against RF and DT.
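
The experiment described can be sketched as follows in Python (scikit-learn): draw stratified subsamples at increasing fractions of an imbalanced loan dataset, train the five classifiers on each subsample, and record per-class recall. The CSV file name, the "default" label column, the assumption that all features are numeric, and the chosen fractions and splits are illustrative assumptions, not the authors' exact protocol.

# Sketch: stratified subsamples of increasing size from an imbalanced loan
# dataset, with per-class recall recorded for five classifiers.
# File name, "default" column, and all-numeric features are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import recall_score

df = pd.read_csv("loan_default.csv")  # hypothetical Kaggle export
X, y = df.drop(columns=["default"]), df["default"]

classifiers = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "RF": RandomForestClassifier(random_state=42),
    "NB": GaussianNB(),
    "DT": DecisionTreeClassifier(random_state=42),
    "K-NN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}

for frac in [0.02, 0.05, 0.10, 0.30, 0.50, 0.70, 0.90]:
    # Stratified subsample: the 85/15 class ratio is preserved at every size.
    X_sub, _, y_sub, _ = train_test_split(X, y, train_size=frac,
                                          stratify=y, random_state=42)
    X_tr, X_te, y_tr, y_te = train_test_split(X_sub, y_sub, test_size=0.3,
                                              stratify=y_sub, random_state=42)
    for name, clf in classifiers.items():
        clf.fit(X_tr, y_tr)
        pred = clf.predict(X_te)
        # Recall per class: how well payments (0) and defaults (1) are caught.
        print(frac, name,
              recall_score(y_te, pred, pos_label=0),
              recall_score(y_te, pred, pos_label=1))

Reporting recall separately for each class, rather than accuracy alone, is what makes the effect of the 85/15 imbalance visible as the sample size grows.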