Mohammad Alaqtash
The British University in Dubai

A Modified Overlapping Partitioning Clustering Algorithm for Categorical Data Clustering
Mohammad Alaqtash; Moayad A. Fadhil; Ali F. Al-Azzawi
Bulletin of Electrical Engineering and Informatics Vol 7, No 1: March 2018
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v7i1.896

Abstract

Clustering enables the grouping of unlabeled data by partitioning it into clusters with similar patterns. Over the past decades, many clustering algorithms have been developed for various clustering problems. The overlapping partitioning clustering (OPC) algorithm, however, can only handle numerical data, and novel clustering algorithms have been studied extensively to overcome this limitation. By increasing the number of objects belonging to one cluster and the distance between cluster centers, this study aimed to cluster textual data without losing the algorithm's main functions. The study used the 20 Newsgroups dataset, which consists of approximately 20,000 textual documents. By introducing some modifications to the traditional algorithm, clusters with an acceptable level of homogeneity and completeness were generated. The modifications were made to the pre-processing phase and the data representation, along with a number of methods that influence the primary function of the algorithm. The results were then evaluated and compared with those of the k-means algorithm on the training and test datasets. They indicated that the modified algorithm could successfully handle categorical data and produce satisfactory clusters.