Royida A. Ibrahem Alhayali
University of Diyala

Published : 4 Documents Claim Missing Document
Claim Missing Document
Check
Articles

Found 4 Documents
Search

An adaptive clustering and classification algorithm for Twitter data streaming in Apache Spark Raed A. Hasan; Royida A. Ibrahem Alhayali; Nashwan Dheyaa Zaki; Ahmed Hussien Ali
TELKOMNIKA (Telecommunication Computing Electronics and Control) Vol 17, No 6: December 2019
Publisher : Universitas Ahmad Dahlan

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.12928/telkomnika.v17i6.11711

Abstract

On-going big data from social networks sites alike Twitter or Facebook has been an entrancing hotspot for investigation by researchers in current decades as a result of various aspects including up-to-date-ness, accessibility and popularity; however anyway there may be a trade off in accuracy. Moreover, clustering of twitter data has caught the attention of researchers. As such, an algorithm which can cluster data within a lesser computational time, especially for data streaming is needed. The presented adaptive clustering and classification algorithm is used for data streaming in Apache spark to overcome the existing problems is processed in two phases. In the first phase, the input pre-processed twitter data is viably clustered utilizing an Improved Fuzzy C-means clustering and the proposed clustering is additionally improved by an Adaptive Particle swarm optimization (PSO) algorithm. Further the clustered data streaming is assessed utilizing spark engine. In the second phase, the input pre-processed Higgs data is classified utilizing the modified support vector machine (MSVM) classifier with grid search optimization. At long last the optimized information is assessed in spark engine and the assessed esteem is utilized to discover an accomplished confusion matrix. The proposed work is utilizing Twitter dataset and Higgs dataset for the data streaming in Apache Spark. The computational examinations exhibit the superiority ofpresented approach comparing with the existing methods in terms of precision, recall, F-score, convergence, ROC curve and accuracy.
An effective classification approach for big data with parallel generalized Hebbian algorithm Ahmed Hussein Ali; Royida A. Ibrahem Alhayali; Mostafa Abdulghafoor Mohammed; Tole Sutikno
Bulletin of Electrical Engineering and Informatics Vol 10, No 6: December 2021
Publisher : Institute of Advanced Engineering and Science

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.11591/eei.v10i6.3135

Abstract

Advancements in information technology is contributing to the excessive rate of big data generation recently. Big data refers to datasets that are huge in volume and consumes much time and space to process and transmit using the available resources. Big data also covers data with unstructured and structured formats. Many agencies are currently subscribing to research on big data analytics owing to the failure of the existing data processing techniques to handle the rate at which big data is generated. This paper presents an efficient classification and reduction technique for big data based on parallel generalized Hebbian algorithm (GHA) which is one of the commonly used principal component analysis (PCA) neural network (NN) learning algorithms. The new method proposed in this study was compared to the existing methods to demonstrate its capabilities in reducing the dimensionality of big data. The proposed method in this paper is implemented using Spark Radoop platform.
Optimized machine learning algorithm for intrusion detection Royida A. Ibrahem Alhayali; Mohammad Aljanabi; Ahmed Hussein Ali; Mostafa Abdulghfoor Mohammed; Tole Sutikno
Indonesian Journal of Electrical Engineering and Computer Science Vol 24, No 1: October 2021
Publisher : Institute of Advanced Engineering and Science

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.11591/ijeecs.v24.i1.pp590-599

Abstract

Intrusion detection is mainly achieved by using optimization algorithms. The need for optimization algorithms for intrusion detection is necessitated by the increasing number of features in audit data, as well as the performance failure of the human-based smart intrusion detection system (IDS) in terms of their prolonged training time and classification accuracy. This article presents an improved intrusion detection technique for binary classification. The proposal is a combination of different optimizers, including Rao optimization algorithm, extreme learning machine (ELM), support vector machine (SVM), and logistic regression (LR) (for feature selection & weighting), as well as a hybrid Rao-SVM algorithm with supervised machine learning (ML) techniques for feature subset selection (FSS). The process of selecting the least number of features without sacrificing the FSS accuracy was considered a multi-objective optimization problem. The algorithm-specific, parameter-less concept of the proposed Rao-SVM was also explored in this study. The KDDCup 99 and CICIDS 2017 were used as the intrusion dataset for the experiments, where significant improvements were noted with the new Rao-SVM compared to the other algorithms. Rao-SVM presented better results than many existing works by reaching 100% accuracy for KDDCup 99 dataset and 97% for CICIDS dataset.
Efficient method for breast cancer classification based on ensemble hoffeding tree and naïve Bayes Royida A. Ibrahem Alhayali; Munef Abdullah Ahmed; Yasmin Makki Mohialden; Ahmed H. Ali
Indonesian Journal of Electrical Engineering and Computer Science Vol 18, No 2: May 2020
Publisher : Institute of Advanced Engineering and Science

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.11591/ijeecs.v18.i2.pp1074-1080

Abstract

The most dangerous type of cancer suffered by women above 35 years of age is breast cancer. Breast Cancer datasets are normally characterized by missing data, high dimensionality, non-normal distribution, class imbalance, noisy, and inconsistency. Classification is a machine learning (ML) process which has a significant role in the prediction of outcomes, and one of the outstanding supervised classification methods in data mining is Naives Bayess Classification (NBC). Naïve Bayes Classifications is good at predicting outcomes and often outperforms other classifications techniques. Ones of the reasons behind this strong performance of NBC is the assumptions of conditional Independences among the initial parameters and the predictors. However, this assumption is not always true and can cause loss of accuracy. Hoeffding trees assume the suitability of using a small sample to select the optimal splitting attribute. This study proposes a new method for improving accuracy of classification of breast cancer datasets. The method proposes the use of Hoeffding trees for normal classification and naïve Bayes for reducing data dimensionality.