Articles


Unveiling Criminal Activity: a Social Media Mining Approach to Crime Prediction
Armoogum, Sheeba; Dewi, Deshinta Arrova; Armoogum, Vinaye; Melanie, Nicolas; Kurniawan, Tri Basuki
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.350

Abstract

Social media platforms have become breeding grounds for abusive comments, necessitating the use of machine learning to detect harmful content. This study aims to predict abusive comments within a Mauritian context, focusing specifically on comments written in Mauritian Kreol, a language with limited natural language processing tools. The objective was to build and evaluate four machine learning models—Decision Tree, Random Forest, Naïve Bayes, and Support Vector Machine (SVM)—to accurately classify comments as abusive or non-abusive. The models were trained and tested using k-fold cross-validation, and the Decision Tree model outperformed others with 100% precision and recall, while Random Forest followed with 99% accuracy. Naïve Bayes and SVM, although achieving 100% precision, had lower recall rates of 35% and 16%, respectively, due to imbalanced data in the training set. Pre-processing steps, including stop-word removal and a custom Kreol spell checker, were key in enhancing model performance. The study provides a novel contribution by applying machine learning in a Mauritian context, demonstrating the potential of AI in detecting abusive language in underrepresented languages. Despite limitations such as the absence of a Kreol lemmatization tool and incomplete coverage of Kreol spelling variations, the models show promise for wider application in social media crime detection. Future research could explore expanding this approach to other languages and domains of social media crimes.
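The combination of 100% precision with low recall reported above is typical of a classifier that flags very few comments on imbalanced data. The sketch below illustrates the arithmetic only; the confusion-matrix counts are hypothetical and were chosen to reproduce a 16% recall, they are not taken from the study.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts (assumption, not from the paper): the model flags
# only 16 of 100 abusive comments, but never flags a non-abusive one.
precision, recall = precision_recall(tp=16, fp=0, fn=84)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=1.00 recall=0.16
```

With no false positives, precision stays at 1.0 regardless of how many abusive comments are missed, which is why recall is the more informative figure here.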
Clustering the Unlabeled Data Using a Modified Cat Swarm Optimization
Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Zakaria, Mohd Zaki; Armoogum, Sheeba
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.349

Abstract

This paper presents a modified version of the Cat Swarm Optimization (CSO) algorithm aimed at addressing the limitations of traditional clustering methods in handling complex, high-dimensional datasets. The primary objective of this research is to improve clustering accuracy and stability by eliminating the mixture ratio (MR), setting the counts of dimensions to change (CDC) to 100%, and incorporating a new search equation in the tracing mode of the CSO algorithm. To evaluate the performance of the modified algorithm, five classic datasets from the UCI Machine Learning Repository, namely Iris, Cancer, Glass, Wine, and Contraceptive Method Choice (CMC), were used. The proposed algorithm was compared against K-Means and the original CSO. Performance metrics such as intra-cluster distance, standard deviation, and F-measure were used to assess the quality of clustering. The results demonstrated that the modified CSO consistently outperformed the competing algorithms. For example, on the Iris dataset, the modified CSO achieved a best intra-cluster distance of 96.78 and an F-measure of 0.786, compared to 97.12 and 0.781 for K-Means. Similarly, for the Wine dataset, the modified CSO reached a best intra-cluster distance of 16399, improving on K-Means, which recorded 16768. In conclusion, the modifications introduced to the CSO algorithm significantly enhance its clustering performance across diverse datasets, producing tighter and more accurate clusters with improved stability. These findings suggest that the modified CSO is a robust and effective tool for data clustering tasks, particularly in high-dimensional spaces. Future work will focus on dynamic parameter tuning and testing the scalability of the algorithm on larger and more complex datasets.
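For readers unfamiliar with the intra-cluster distance metric, the following is a minimal sketch, assuming the common definition: the sum of Euclidean distances from each point to the centroid of its assigned cluster (lower is tighter). The abstract does not spell out the exact formula, so this definition, the function name, and the toy data are all assumptions for illustration.

```python
import math


def intra_cluster_distance(points, labels, centroids):
    """Sum of Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        total += math.dist(point, centroids[label])
    return total


# Toy data (hypothetical): two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 1, 1]
centroids = [(0.0, 1.0), (10.0, 2.0)]

score = intra_cluster_distance(points, labels, centroids)
print(score)  # 6.0
```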
Breast Cancer Prediction Using Metrics-Based Classification
Armoogum, Sheeba; Dewi, Deshinta Arrova; Kezhilen, Motean; Trinawarman, Dedi
Journal of Applied Data Sciences Vol 5, No 3: SEPTEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i3.351

Abstract

Breast cancer remains the most prevalent form of cancer among women, with rising mortality rates worldwide. Early detection and accurate classification are crucial for improving patient outcomes, but manual detection methods are often time-consuming, complex, and prone to inaccuracies. This study aims to develop a machine learning (ML)-based desktop application to automate the detection and classification of breast cancer, thereby improving the efficiency and accuracy of diagnosis. Various ML algorithms, including Random Forest, Decision Tree, Support Vector Machine, Logistic Regression, Gaussian Naive Bayes, and K-nearest Neighbors, were employed to build classification models. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset was used, and pre-processing techniques such as data cleaning, over-sampling, and feature selection were applied to optimize model performance. Experimental results demonstrate that the Random Forest classifier outperformed the other models, achieving an accuracy of 95.54%, precision of 96.72%, recall (sensitivity) of 95.16%, specificity of 96%, and an F1-score of 95.93%. These results highlight the potential of ML techniques in enhancing breast cancer diagnosis by offering a more reliable and efficient classification process. Future work could focus on improving feature selection techniques and applying the model to more diverse datasets for broader applicability.
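As a quick consistency check on the figures above, the F1-score can be recomputed from the reported precision and recall using the standard harmonic-mean formula. This is a small sketch, not the authors' code; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the Random Forest classifier.
reported_precision = 0.9672
reported_recall = 0.9516

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.4f}")  # 0.9593
```

The result agrees with the reported F1-score of 95.93%.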
A neural machine translation system for Kreol Repiblik Moris and English
Pudaruth, Sameerchand; Armoogum, Sheeba; Kumar Betchoo, Nirmal; Sukhoo, Aneerav; Gooria, Vandanah; Peerally, Abdallah; Zafar Khodabocus, Mohammad
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 4: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i4.pp4976-4987

Abstract

Although Google Translate is a widely used machine translation service that supports 133 languages, it does not incorporate support for the Kreol Repiblik Moris (KRM) language. Addressing this limitation, the current research focuses on enhancing the accuracy and fluency of machine translation between KRM and English through natural language processing and deep neural machine translation techniques. In this study, a machine translation system using a transformer model trained with a dataset of 50,000 parallel corpora has been developed. The model was evaluated using manual translations and the bilingual evaluation understudy (BLEU) score. A score of 31.46 for translating from KRM to English and 28.15 for translating from English to KRM was achieved. To our knowledge, these are the highest BLEU scores for translation between these two languages. This is due to utilising the largest dataset and extensive atomic words from the KRM dictionary. This successful interdisciplinary funded project led to the setting up of a free online translation service and a smartphone app for Mauritian citizens and tourists.
Breast Cancer Prediction Using Transfer Learning-Based Classification Model
Armoogum, Sheeba; Motean, Kezhilen; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Kijsomporn, Jureerat
Emerging Science Journal Vol 8, No 6 (2024): December
Publisher : Ital Publication

DOI: 10.28991/ESJ-2024-08-06-014

Abstract

Breast cancer is currently the most prevalent type of cancer in women, with a growing number of fatalities worldwide. Different imaging methods like mammography, computed tomography, Magnetic Resonance Imaging, ultrasound, and biopsies assist in detecting breast cancer. Recent developments in deep learning have revolutionized breast cancer pathology by facilitating accurate image categorization. This study introduces a novel approach to enhance detection and classification using the Convolutional Neural Network Deep Learning method and Transfer Learning to create a high-speed, accurate image classification model. The model is trained on pre-processed data subjected to thorough analysis and augmentation to ensure the quality of inputs. The experimental results from the Breast Ultrasound Image dataset indicate that our model, with a 0.1 test size ratio, outperforms its counterparts. It achieved an accuracy of 90.12%, with a loss of 0.2641, validation accuracy of 90.15%, and validation loss of 0.31, evidencing its superior classification capability. This research introduces an innovative approach to the automated diagnosis of breast cancer. By combining CNN, Transfer Learning, and data augmentation, we have developed a desktop application that expedites the classification process and significantly improves accuracy. This advancement represents a key development in machine learning applications for breast cancer prognostics and diagnostics.
A Comprehensive Review of Cyber Hygiene Practices in the Workplace for Enhanced Digital Security
Armoogum, Sheeba; Armoogum, Vinaye; Chandra, Anurag; Dewi, Deshinta Arrova; Kurniawan, Tri Basuki; Bappoo, Soodeshna; Mohd Salikon, Mohd Zaki; Alanda, Alde
JOIV : International Journal on Informatics Visualization Vol 9, No 1 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.1.3787

Abstract

In today's digital age, cybercrime is increasing at an alarming rate, and it has become more critical than ever for organizations to prioritize adopting best practices in cyber hygiene to safeguard their personnel and resources from cyberattacks. Just as personal hygiene keeps one clean and healthy, cyber hygiene combines behaviors that enhance data privacy. This paper aims to explore the common cyber-attacks currently faced by organizations and how the different practices associated with good cyber hygiene can be used to mitigate those attacks. This paper also emphasizes the need for organizations to adopt good cyber hygiene techniques and, therefore, provides the top 10 effective cyber hygiene measures for organizations seeking to enhance their cybersecurity posture. To better evaluate the cyber hygiene techniques, a systematic literature approach was used, assessing the different models of cyber hygiene, distinguishing between good and bad cyber hygiene techniques, and identifying the cyber-attacks associated with bad cyber hygiene that can eventually affect any organization. Based on the case study and surveys done by the researchers, it has been deduced that good cyber hygiene techniques bring positive behavior among employees, thus contributing to a more secure organization. More importantly, it is the responsibility of both the organization and the employees to practice good cyber hygiene techniques. If organizations fail to enforce good cyber hygiene techniques, for example through a lack of security awareness programs, employees may have the misconception that it is not their responsibility to contribute to their own security and that of the organization, which consequently opens doors to various cyber-attacks. There have not been many research papers on cyber hygiene, particularly when it comes to its application in the workplace, which is a fundamental aspect of our everyday life. This paper focuses on the cyber hygiene techniques that any organization, small or large, should consider. It also highlights the existing challenges associated with the implementation of good cyber hygiene techniques and offers potential solutions to address them.