Toddler Nutritional Status Classification Using C4.5 and Particle Swarm Optimization
Nazir, Alwis; Akhyar, Amany; Yusra, Yusra; Budianita, Elvia
Scientific Journal of Informatics Vol 9, No 1 (2022): May 2022
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v9i1.33158

Abstract

Purpose: This research aims to build the most optimal classification model in the form of a decision tree. Optimal here means the combination of parameters that produces the highest accuracy compared with other parameter combinations. The best model is then used to predict the nutritional status class of new data. Methods/Study design/approach: The dataset comes from Nutritional Status Monitoring in 2017 in Riau Province, Indonesia. The Knowledge Discovery in Database (KDD) stages were applied to the dataset to build several classification models in the form of decision trees. The decision tree with the highest accuracy is then selected to predict the class of new data. Predictions for new (unclassified) data are made in a web-based system. Result/Findings: Particle Swarm Optimization (PSO) is used to find the optimal parameters. Before PSO is applied, 213 parameters in the dataset can be used for classification, but using that many parameters is time-consuming. After PSO is applied, the optimal combination consists of 4 parameters, which produces the most optimal decision tree: gender, age (in months), height, and the height measurement method (standing or lying down). The most optimal decision tree has an accuracy of 94.49%. A web-based system was built from this decision tree to predict the class of new (unclassified) data. Novelty/Originality/Value: PSO helps select the most optimal parameters, that is, those that produce the highest classification accuracy. The selected combination of parameters has also been confirmed by a nutritionist. The prediction system has been declared feasible for use by nutritionists through the User Acceptance Test (UAT).
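Implementasi Algoritma FP-Growth untuk Menemukan Pola Keterkaitan Antara Matakuliah Pemrograman dan Matakuliah Matematika [Implementation of the FP-Growth Algorithm to Find Association Patterns Between Programming Courses and Mathematics Courses]
Putri. P, Zurneli Kurnia; Iskandar, Iwan; Nazir, Alwis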
Jurnal CoreIT: Jurnal Hasil Penelitian Ilmu Komputer dan Teknologi Informasi Vol 7, No 2 (2021): Desember 2021
Publisher : Fakultas Sains dan Teknologi, Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI: 10.24014/coreit.v7i2.15351

Abstract

Programming skill is one of the learning focuses of the Informatics Engineering study program, which requires students to understand and earn good grades in all programming-related courses. Mathematics courses are considered to be related to the programming field. This study examines the relationship between programming courses and mathematics courses using one of the association algorithms in data mining, the FP-Growth algorithm. FP-Growth was chosen because it finds patterns faster than the Apriori algorithm. The final stage of KDD produces 1227 data records, which are then processed with the FP-Growth algorithm. Tests with a minimum support of 0.5 and a minimum confidence of 0.7 show that the application built in this study and the SPMF application produce the same number of patterns, 52250. The highest support value of 51%, the highest confidence value of 98%, and the highest lift ratio of 1.1941 among the itemset patterns indicate that students who pass programming courses also tend to pass mathematics courses, and vice versa.
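The support, confidence, and lift ratio reported above can be illustrated with a minimal sketch. It is not the authors' application or SPMF; the function name `rule_metrics` and the toy transactions are hypothetical and only show how the three measures are conventionally computed for a rule A -> C.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    # Count transactions containing the antecedent, the consequent, and both.
    count_a = sum(1 for t in transactions if a <= set(t))
    count_c = sum(1 for t in transactions if c <= set(t))
    count_ac = sum(1 for t in transactions if (a | c) <= set(t))
    support = count_ac / n
    confidence = count_ac / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the baseline frequency of the consequent.
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical toy data: each set lists the course groups one student passed.
transactions = [
    {"programming", "math"},
    {"programming", "math"},
    {"programming"},
    {"math"},
]
support, confidence, lift = rule_metrics(transactions, {"programming"}, {"math"})
```

A lift above 1 means the antecedent and consequent occur together more often than they would if they were independent; a lift below 1, as in this toy data, means the opposite.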
Data Warehouse Design For Sales Transactions on CV. Sumber Tirta Anugerah
Syaputra, Muhammad Dwiky; Nazir, Alwis; Gusti, Siska Kurnia; Sanjaya, Suwanto; Syafria, Fadhilah
Jurnal CoreIT: Jurnal Hasil Penelitian Ilmu Komputer dan Teknologi Informasi Vol 8, No 2 (2022): December 2022
Publisher : Fakultas Sains dan Teknologi, Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI: 10.24014/coreit.v8i2.19800

Abstract

Data warehouses are widely implemented in retail companies, but CV. Sumber Tirta Anugerah, a paint retailer, has not yet implemented one. Over time, its sales transaction data has become increasingly difficult to process because it is still stored in Microsoft Excel. This is a serious obstacle to using historical data to support decision-making, and the large volume of sales data also makes it difficult to store. To address these problems, a data warehouse design is needed for the sales transaction data. The design uses Kimball's nine-step method and a star schema. The ETL (extract, transform, and load) process is performed with Pentaho software, and Tableau software is used to visualize the processed data as graphs and dashboard reports. The result of this research is a data warehouse design based on the nine-step method and a star schema, with a transformation response time of 4048 ms.
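Analisis Pola Asosiasi Data Transaksi Penjualan Minuman Menggunakan Algoritma FP-Growth dan Eclat [Association Pattern Analysis of Beverage Sales Transaction Data Using the FP-Growth and Eclat Algorithms]
Najmi, Risna Lailatun; Irsyad, Muhammad; Insani, Fitri; Nazir, Alwis; Pizaini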
Building of Informatics, Technology and Science (BITS) Vol 5 No 1 (2023): June 2023
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v5i1.3592

Abstract

Transactions between companies and consumers take place every day, so transaction data keeps accumulating. This data can be processed into more useful information using technology. Data mining is a technology that can turn a collection of transaction data into information that companies can use in decision-making. The association rule method is used to examine the relationships between items in transaction data. To analyze the transaction data, the researchers used the FP-Growth and Eclat algorithms. The study has three association stages, distinguished by the minimum confidence value. In the first stage, with a minimum confidence of 0.4, the FP-Growth algorithm produces 41 association rules, while the Eclat algorithm produces 32. In the second stage, with a minimum confidence of 0.5, the FP-Growth algorithm produces 40 association rules, while the Eclat algorithm produces 32. In the third stage, with a minimum confidence of 0.6, the FP-Growth algorithm produces 32 association rules, while the Eclat algorithm produces 30. The results show that the Eclat algorithm is more efficient at determining association rules than the FP-Growth algorithm.
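For readers unfamiliar with Eclat, the sketch below shows its core idea under simplifying assumptions: items are stored in a vertical layout (item -> set of transaction IDs), and the support of an itemset is obtained by intersecting those ID sets. It only goes up to 2-itemsets, and the function name `eclat_pairs` and the beverage data are hypothetical, not taken from the study.

```python
from itertools import combinations


def eclat_pairs(transactions, min_support):
    """Frequent 1- and 2-itemsets found by intersecting transaction-ID sets."""
    n = len(transactions)
    # Build the vertical layout: item -> set of IDs of transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    # Keep single items that meet the minimum support.
    frequent = {
        (item,): len(tids) / n
        for item, tids in tidsets.items()
        if len(tids) / n >= min_support
    }
    singles = sorted(key[0] for key in frequent)
    # The support of a pair is the size of the intersection of its two ID sets.
    for a, b in combinations(singles, 2):
        support = len(tidsets[a] & tidsets[b]) / n
        if support >= min_support:
            frequent[(a, b)] = support
    return frequent


# Hypothetical beverage transactions.
transactions = [
    {"tea", "sugar"},
    {"tea", "sugar", "milk"},
    {"coffee", "sugar"},
    {"tea"},
]
result = eclat_pairs(transactions, min_support=0.5)
```

A full Eclat implementation extends this recursively to larger itemsets; FP-Growth reaches the same frequent itemsets through a prefix tree instead of ID-set intersections.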
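Sistem Klasifikasi Penyakit Jantung Menggunakan Teknik Pendekatan SMOTE Pada Algoritma Modified K-Nearest Neighbor [Heart Disease Classification System Using the SMOTE Approach with the Modified K-Nearest Neighbor Algorithm]
Novitasari, Fitria; Haerani, Elin; Nazir, Alwis; Jasril, Jasril; Insani, Fitri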
Building of Informatics, Technology and Science (BITS) Vol 5 No 1 (2023): June 2023
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v5i1.3610

Abstract

The heart is a vital organ that pumps oxygenated blood and nutrients throughout the body. Heart disease refers to damage to the heart that can take various forms and can be caused by infections or congenital abnormalities. The World Health Organization (WHO) reports nearly 17.9 million deaths each year due to heart disease. In Indonesia, the prevalence of heart disease is around 1.5%, meaning that in 2018 approximately 15 out of 1,000 people, or nearly 2,784,060 individuals, were affected, according to the 2018 Basic Health Research (Riskesdas) data. Many people have limited knowledge about heart health and are therefore unaware of their own heart condition, partly because they do not understand the importance of heart-related medical checkups. Modified K-Nearest Neighbors (MKNN) is one of the data mining methods applied to classify the risk of heart disease. The research used data from the UCI dataset repository, consisting of 918 records with 12 attributes. Because the dataset is imbalanced, the Synthetic Minority Over-sampling Technique (SMOTE) was used to generate new synthetic samples for the minority class. The web-based heart disease classification system is intended to help the public assess their risk of heart disease as early as possible so that they can take preventive action sooner. With a 90:10 split, the MKNN algorithm achieves an accuracy of 80.37%, while MKNN combined with SMOTE achieves 84.00%, so applying SMOTE improved classification accuracy.
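The way SMOTE generates a synthetic minority sample can be sketched as follows. This is a simplified illustration using only the standard library, not the authors' implementation: it picks a minority point, chooses one of its k nearest minority neighbours, and interpolates between the two. The function name `smote_sample` and the toy points are hypothetical.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and one of its k nearest neighbours."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    # The synthetic point lies on the line segment from base towards neighbour.
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical minority-class points with two numeric features.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
synthetic = smote_sample(minority, k=2, rng=random.Random(0))
```

Repeating this until the minority class matches the majority class in size yields a balanced training set; oversampling is normally applied to the training split only, so that the test data stay untouched.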
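Klasifikasi Sentimen Terhadap Pengangkatan Kaesang Sebagai Ketua Umum Partai PSI Menggunakan Metode Support Vector Machine [Sentiment Classification of Kaesang's Appointment as General Chairman of PSI Using the Support Vector Machine Method]
Safrizal, Safrizal; Agustian, Surya; Nazir, Alwis; Yusra, Yusra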
Building of Informatics, Technology and Science (BITS) Vol 6 No 1 (2024): June 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i1.5340

Abstract

The appointment of Kaesang Pangarep as the Chairman of the Indonesian Solidarity Party (PSI) has sparked various responses on social media, particularly on Twitter. This research aims to classify public sentiment regarding the appointment using the Support Vector Machine (SVM) algorithm with FastText feature representation. The classification starts from a small training dataset. Text preprocessing includes cleaning, case folding, tokenizing, normalization, stopword removal, and stemming. FastText word embedding converts words into vectors, and the SVM model is tuned with Grid Search to obtain the optimal parameters. Using external datasets to expand the initially limited training dataset enhances data representation and improves the model's performance in sentiment classification. The Covid dataset was expanded by adding 100, 200, and 300 tweets for each negative, positive, and neutral label. In the experiments, the best accuracy on the test data was obtained in experiment ID C2, with an F1-Score of 53.59% and an accuracy of 62.73%. In experiment ID C3 with the same dataset, the F1-Score was 50.46% and the accuracy was 60.46%. Finally, in experiment ID C7 with the same dataset, the F1-Score was 47.19% and the accuracy was 53.09%.
Klasifikasi Status Stunting Balita Dengan Metode Support Vector Machine Berbasis Web [Web-Based Classification of Toddler Stunting Status Using the Support Vector Machine Method]
Adzhima, Fauzan; Budianita, Elvia; Nazir, Alwis; Syafria, Fadhilah
Jurnal Inovtek Polbeng Seri Informatika Vol 8, No 2 (2023)
Publisher : P3M Politeknik Negeri Bengkalis

DOI: 10.35314/isi.v8i2.3641

Abstract

Parents must pay close attention to their children during the toddler years, because at that age they are vulnerable to various growth and development disorders, one of which is stunting. Stunting is a growth and development disorder caused by malnutrition and marked by a height that does not meet the normal growth criteria for a child's age. To prevent stunting, health workers or posyandu (integrated health post) cadres take children's anthropometric measurements at the posyandu. The measurement data are processed manually, so there is a high likelihood of processing mistakes due to human error. By learning patterns in the measurement data, data mining can address problems in processing it. SVM is a data mining method commonly used for classification problems; its advantages are that it works with little memory and can separate data that are not linearly separable. Age, gender, Early Initiation of Breastfeeding (Inisiasi Menyusui Dini, IMD), weight, and height are the attributes used for classification with the SVM algorithm. In the tests conducted on 1172 data records, the best average model performance was obtained with the parameter γ = 0.01 and an accuracy of 98.99%, so the model can be used to predict new measurement data accurately and stunting prevention measures can be taken promptly.
Analisa sentimen terhadap kenaikan bbm di twitter (x) menggunakan naive bayes classifier [Sentiment Analysis of the Fuel Price Increase on Twitter (X) Using the Naive Bayes Classifier]
Muhammad Abdillah; Fikry, Muhammad; Yusra; Nazir, Alwis; Insani, Fitri
Computer Science and Information Technology Vol 5 No 1 (2024): Jurnal Computer Science and Information Technology (CoSciTech)
Publisher : Universitas Muhammadiyah Riau

DOI: 10.37859/coscitech.v5i1.6954

Abstract

In early September 2022, news of a rise in fuel prices came as a shock. The government decided to increase fuel prices because of the surge in world oil prices. PT Pertamina (Persero) officially raised the price of fuel oil (BBM) on September 3, 2022, at 2:30 PM WIB (Western Indonesia Time). The decision sparked public opinion, and many people expressed positive and negative responses on the social media platform Twitter. The data used consisted of 3,000 tweets with the keyword "FUEL PRICE INCREASE," collected from November 1, 2022, to December 1, 2022. This research used the Naive Bayes Classifier method, conducted with three comparisons using thresholds ranging from 0.001 to 0.007. The experiment covered three types of test data: opinion data, mixed data (opinion and non-opinion), and balanced data. For opinion data, the highest accuracy was 80% with a 90:10 ratio; for mixed data, the accuracy was 67.7% with a 70:30 ratio; and for balanced data, the accuracy was 63.6% with a 90:10 ratio.
Application of Data Mining for Ceramic Sales Data Association Using Apriori Algorithm
Habibi, M. Ilham; Nazir, Alwis; Haerani, Elin; Budianita, Elvia
Knowbase : International Journal of Knowledge in Database Vol. 4 No. 2 (2024): December 2024
Publisher : Universitas Islam Negeri Sjech M. Djamil Djambek Bukittinggi

DOI: 10.30983/knowbase.v5i2.8757

Abstract

This research aims to provide an understanding of consumer purchasing patterns at CV. Sukses Bersama by applying data mining with the association rules method and the Apriori algorithm to identify how the purchase of one item influences the purchase of others in the company's ceramic sales dataset. This information is expected to serve as a foundation for improving sales strategies, optimizing customer satisfaction, and expanding the company's market share. The Apriori algorithm is a popular algorithm for identifying association rules in data mining. It was chosen for its ability to identify association rules efficiently and its good scalability in handling large datasets. The research begins with the collection of ceramic sales data, followed by preprocessing to clean and prepare the data. The Apriori algorithm is then applied to discover association rules, which are measured by two metrics, support and confidence, and the results are subsequently evaluated. The research was conducted using Google Colaboratory, a cloud-based platform provided by Google for running Python code. The results show that the Apriori algorithm can reveal significant association structures between different ceramic brands in the sales data of CV. Sukses Bersama. The rule with the maximum support and confidence values, 67% support and 84% confidence, is "if you buy the DIAMD brand, you will buy the TOTAL brand".
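Penerapan Algoritma FP-Growth dan K-Means Clustering dalam Analisis Pola Asosiasi Berdasarkan Segmentasi Pelanggan [Application of the FP-Growth Algorithm and K-Means Clustering in Association Pattern Analysis Based on Customer Segmentation]
Hasibuan, Aldiansyah Pramudia; Insani, Fitri; Nazir, Alwis; Afrianty, Iis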
Journal of Information System Research (JOSH) Vol 6 No 3 (2025): April 2025
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josh.v6i3.7112

Abstract

The pharmaceutical industry has experienced rapid growth, urging companies to leverage sales data effectively to enhance data-driven marketing strategies. However, utilizing sales data remains a challenge for XYZ company, a pharmaceutical distributor. This study aims to analyze customer purchasing patterns by applying the FP-Growth algorithm for association analysis, combined with customer segmentation using the K-Means algorithm based on RFM (Recency, Frequency, Monetary) analysis. The segmentation process resulted in four customer clusters: active and loyal customers (Cluster 1), passive customers (Cluster 2), less active customers (Cluster 3), and new customers (Cluster 4). FP-Growth analysis for each cluster revealed that Cluster 1 generated 10 significant association rules with a minimum support of 0.01 and confidence of 0.7, while Clusters 2, 3, and 4 produced 2, 3, and 4 association rules, respectively, with adjusted parameters. All rules showed a lift value > 1, indicating positive relationships between products. The findings of this study provide strategic insights for companies in designing data-driven marketing approaches, such as more targeted product offerings for loyal customers or retention strategies for passive customers, thereby optimizing sales and increasing profitability in each customer segment.
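The RFM features that feed the K-Means segmentation can be sketched as below. This is an illustrative example under assumed inputs, not the study's pipeline: the order tuples, the reference date, and the function name `rfm_table` are hypothetical.

```python
from datetime import date


def rfm_table(orders, today):
    """Map each customer to (recency in days, frequency, monetary total)."""
    table = {}
    for customer, order_date, amount in orders:
        # Track the latest order date, the order count, and the total spend.
        last, freq, total = table.get(customer, (order_date, 0, 0.0))
        table[customer] = (max(last, order_date), freq + 1, total + amount)
    return {
        customer: ((today - last).days, freq, total)
        for customer, (last, freq, total) in table.items()
    }


# Hypothetical orders: (customer ID, order date, order amount).
orders = [
    ("A", date(2024, 1, 5), 100.0),
    ("A", date(2024, 3, 1), 50.0),
    ("B", date(2023, 11, 20), 30.0),
]
rfm = rfm_table(orders, today=date(2024, 3, 31))
```

In a typical workflow, the three values are scaled to comparable ranges before clustering, since K-Means relies on Euclidean distance and the monetary column would otherwise dominate.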