Mustafa Man
Universiti Malaysia Terengganu

Published : 10 Documents

Found 3 Documents
Journal : International Journal of Electrical and Computer Engineering

Analysis study on R-Eclat algorithm in infrequent itemsets mining
Mustafa Man; Julaily Aida Jusoh; Syarilla Iryani Ahmad Saany; Wan Aezwani Wan Abu Bakar; Mohd Hafizuddin Ibrahim
International Journal of Electrical and Computer Engineering (IJECE) Vol 9, No 6: December 2019
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v9i6.pp5446-5453

Abstract

There is rising interest in developing techniques for data mining. One important subfield of data mining is itemset mining, which consists of discovering interesting and useful patterns in transaction databases. In a big data environment, the problem of mining infrequent itemsets becomes more complicated when dealing with a huge dataset. Infrequent itemset mining may provide valuable information in the knowledge mining process. The basic algorithms currently in wide use for infrequent itemset mining are derived from Apriori and FP-Growth. Eclat-based approaches to infrequent itemset mining have not yet been extensively explored. This paper addresses the discovery of infrequent itemsets from a transactional database based on the Eclat algorithm. To address this issue, the minimum support measure is defined as a weighted frequency of occurrence of an itemset in the analysed data. Preliminary experimental results illustrate that the Eclat-based algorithm is more efficient in mining dense data than sparse data.
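The idea of Eclat-style infrequent itemset mining can be illustrated with a minimal sketch. It is not the R-Eclat algorithm from the paper: it assumes plain support counts instead of the weighted frequency mentioned in the abstract, and the names `vertical_format`, `infrequent_itemsets`, `max_support` and `max_size` are hypothetical.

```python
from itertools import combinations


def vertical_format(transactions):
    """Map each item to its tidset: the ids of the transactions containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def infrequent_itemsets(transactions, max_support, max_size=2):
    """Return itemsets that occur at least once but at most max_support times.

    Assumption: support is a plain count, not the weighted measure of the paper.
    """
    tidsets = vertical_format(transactions)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(tidsets), size):
            # Eclat step: support of an itemset = size of the tidset intersection.
            tids = set.intersection(*(tidsets[item] for item in combo))
            if 1 <= len(tids) <= max_support:
                result[combo] = len(tids)
    return result


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "d"},
]
rare = infrequent_itemsets(transactions, max_support=1)
print(rare)
```

The sketch enumerates candidates exhaustively, which is only practical for toy data; a real miner would prune the search space.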

Postdiffset Algorithm in Rare Pattern: An Implementation via Benchmark Case Study
Mustafa Man; Wan Aezwani Wan Abu Bakar; Masita Masila Abd Jalil; Julalily Aida Jusoh
International Journal of Electrical and Computer Engineering (IJECE) Vol 8, No 6: December 2018
Publisher : Institute of Advanced Engineering and Science

Abstract

Frequent and infrequent itemset mining are trending techniques in data mining. The Association Rule (AR) patterns generated help decision makers or business policy makers to predict the next intended items across a wide variety of applications. While frequent itemsets deal with items that are most often purchased or used, infrequent items are those that occur infrequently, also called rare items. AR mining remains one of the most prominent areas in data mining; it aims to extract interesting correlations, patterns, associations or causal structures among sets of items in transaction databases or other data repositories. The design of the database structure in association rule mining algorithms is based upon horizontal or vertical data formats. These two data formats have been widely discussed, with a few example algorithms shown for each format. Approaches based on the horizontal format suffer from huge candidate generation and multiple database scans, which result in higher memory consumption. To overcome this issue, solutions based on vertical approaches have been proposed. One of the established algorithms in the vertical data format is Eclat. ECLAT, or the Equivalence Class Transformation algorithm, is one example solution that uses the vertical database format. Because of its "fast intersection", in this paper we analyze the fundamental Eclat and Eclat variants such as diffset and sortdiffset. In response to the vertical data format and as a continuation of the Eclat extensions, we propose the Postdiffset algorithm as a new member of the Eclat variants, which uses the tidset format in the first looping and diffset in the later looping. In this paper, we present the performance of the Postdiffset algorithm prior to its implementation in the mining of infrequent or rare itemsets. The Postdiffset algorithm outperforms diffset and sortdiffset by 23% and 84%, respectively, on the mushroom dataset and by 94% and 99%, respectively, on the retail dataset.
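The difference between the tidset and diffset formats can be shown in a short sketch. It follows the standard diffset definition used by Eclat variants (the diffset of XY relative to X holds the transactions that contain X but not Y) and is not the authors' Postdiffset implementation; the toy tidsets and function names are assumptions.

```python
def support_by_tidset(*tidsets):
    """Support from tidsets: size of the intersection of all tidsets."""
    return len(set.intersection(*tidsets))


def diffset_from_tidsets(t_x, t_y):
    """First level: d(XY) = t(X) - t(Y), the transactions with X but without Y."""
    return t_x - t_y


def diffset_from_diffsets(d_px, d_py):
    """Later levels: d(PXY) = d(PY) - d(PX) for two itemsets sharing prefix P."""
    return d_py - d_px


# Hypothetical tidsets of three single items.
t_a = {0, 1, 2, 3, 4}
t_b = {0, 1, 2, 5}
t_c = {0, 2, 3}

# First loop works on tidsets and derives the diffsets of the 2-itemsets.
d_ab = diffset_from_tidsets(t_a, t_b)   # {3, 4}
d_ac = diffset_from_tidsets(t_a, t_c)   # {1, 4}
sup_ab = len(t_a) - len(d_ab)           # support(ab) = support(a) - |d(ab)|

# Later loop works on diffsets only: abc is built from ab and ac.
d_abc = diffset_from_diffsets(d_ab, d_ac)
sup_abc = sup_ab - len(d_abc)           # support(abc) = support(ab) - |d(abc)|

print(sup_ab, sup_abc)
```

Both formats give the same support values; diffsets tend to be smaller than tidsets on dense data, which is the usual motivation for switching to them after the first level.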

A performance of comparative study for semi-structured web data extraction model
Ily Amalina Ahmad Sabri; Mustafa Man
International Journal of Electrical and Computer Engineering (IJECE) Vol 9, No 6: December 2019
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v9i6.pp5463-5470

Abstract

The extraction of information from multiple web sources is an essential yet complicated step for data analysis in multiple domains. In this paper, we present a data extraction model based on visual segmentation, the DOM tree and a JSON approach, known as Wrapper Extraction of Image using DOM and JSON (WEIDJ), for extracting semi-structured data from biodiversity websites. A large amount of image information from multiple web sources is extracted using three different approaches: Document Object Model (DOM), Wrapper image using Hybrid DOM and JSON (WHDJ) and Wrapper Extraction of Image using DOM and JSON (WEIDJ). Experiments were conducted on several biodiversity websites. The experimental results show that the WEIDJ approach gives promising results with respect to time analysis values. The WEIDJ wrapper successfully extracted more than 100 images from multi-source biodiversity web data spanning over 15 different websites.
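As a rough illustration of extracting image information from page markup and emitting it as JSON, the following minimal sketch uses only the Python standard library. It is not the WEIDJ wrapper: it skips visual segmentation, and the class `ImageCollector`, the function `extract_images` and the sample markup are hypothetical.

```python
import json
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src and alt attributes of every <img> element."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            self.images.append(
                {"src": attributes.get("src"), "alt": attributes.get("alt")}
            )


def extract_images(html):
    """Parse the markup and return the image records as a JSON string."""
    collector = ImageCollector()
    collector.feed(html)
    return json.dumps(collector.images)


# Hypothetical page fragment standing in for a biodiversity website.
sample = '<div><img src="a.jpg" alt="Fish"><p>text</p><img src="b.png"></div>'
records = json.loads(extract_images(sample))
print(records)
```

Missing attributes are stored as `null` in the JSON output, so downstream code can tell an absent description from an empty one.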