Perbandingan Metode K-Means dan K-Medoids Untuk Clustering Jenis Kriminalitas
Azizah, Nurul; Fauzi, Ahmad; Rohana, Tatang; Faisal, Sutan
Building of Informatics, Technology and Science (BITS) Vol 6 No 2 (2024): September 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i2.5723

Abstract

Crime in Indonesia covers acts that violate the law, social norms, and religion, and it causes economic and psychological losses as well as social tension. Crimes such as theft, violence, fraud, and drug offenses are on the rise and are often driven by factors such as poverty and environmental conditions that support criminal behavior. A more effective approach is therefore needed to understand and address this complex, far-reaching problem, especially in Karawang Regency. This research uses data mining techniques, specifically cluster analysis, to group types of crime. The aim is to identify existing crime patterns and understand the factors that influence their spread, so that the authorities can develop more targeted crime prevention and handling strategies and minimize the negative impact of crime in the area. The research also adds to knowledge about the most effective methods for analyzing crime data, which can be applied in other areas with similar problems. The results show that the K-Means algorithm is more effective than K-Medoids in handling data variability, with a Silhouette Coefficient of 0.482 and a Davies-Bouldin Index of 0.915. It is hoped that implementing this algorithm will make it easier to identify and handle crimes in the area.
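The abstract above scores the clusterings with a Silhouette Coefficient. As a rough illustration of what that number measures, the sketch below computes a mean silhouette in plain Python for a given cluster assignment. The function name, the Euclidean distance, and the toy points are assumptions made for illustration; they are not the study's data or code.

```python
import math


def silhouette_coefficient(points, labels):
    """Mean silhouette over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of two points each.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_coefficient(toy_points, [0, 0, 1, 1])
bad = silhouette_coefficient(toy_points, [0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated clusters, and values near 0 or below indicate overlapping or misassigned points, so `good` comes out far higher than `bad` here.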

Perbandingan Algoritma Apriori dan Algoritma FP-Growth dalam Menentukan Pola Penjualan Pupuk
Rachmawati, Dhea; Cahyana, Yana; Awal, Elsa Elvira; Faisal, Sutan
Jurnal RESISTOR (Rekayasa Sistem Komputer) Vol. 7 No. 1 (2024): Jurnal RESISTOR Edisi April 2024
Publisher : Prahasta Publisher

DOI: 10.31598/jurnalresistor.v7i1.1527

Abstract

Information systems are essential today: knowing the data lets a business set its strategy. Fertilizer demand, for example, differs from region to region, so a fertilizer distributor needs to know which products sell best and worst in each region. Data mining, the extraction of new information from a collection of data, can reveal customer purchasing patterns and so help raise product sales. This study compares the Apriori and FP-Growth algorithms on 2022 fertilizer sales data from PT. Pupuk Kujang. Both algorithms produced a highest Support of 59% and a highest Confidence of 100%, but Apriori generated 136 association rules while FP-Growth generated 156. FP-Growth can therefore be said to perform better than Apriori at generating association rules. The study also applies association rules to Cross-Selling and Up-Selling: with these associations, the business can run effective cross-selling strategies, offering customers relevant additional or upgraded products, and so increase fertilizer sales revenue at PT. Pupuk Kujang. Keywords: Business, Association Rule, Apriori Algorithm, FP-Growth Algorithm.
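Both Apriori and FP-Growth rank rules by Support and Confidence, the two measures quoted in the abstract above. The sketch below shows how those two measures are defined over a list of transactions. The product names and transactions are hypothetical and were made up for illustration; they are not the PT. Pupuk Kujang sales data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base if base else 0.0


# Hypothetical baskets of fertilizer products.
baskets = [
    {"urea", "npk"},
    {"urea", "npk", "za"},
    {"urea"},
    {"npk", "za"},
    {"urea", "npk"},
]

sup_urea_npk = support(baskets, {"urea", "npk"})      # 3 of 5 baskets
conf_za_npk = confidence(baskets, {"za"}, {"npk"})    # every "za" basket has "npk"
conf_urea_npk = confidence(baskets, {"urea"}, {"npk"})
```

A rule such as `za -> npk` with confidence 1.0 is the kind of pattern a cross-selling strategy would act on, provided its support is high enough to matter.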

COMPARISON OF DIABETES DISEASE CLASSIFICATION MODELS USING LOGISTIC REGRESSION AND RANDOM FOREST ALGORITHMS
Nabila, Putri; Siregar, Amril Mutoi; Faisal, Sutan; Pratama, Adi Rizky
Faktor Exacta Vol 17, No 3 (2024)
Publisher : LPPM

DOI: 10.30998/faktorexacta.v17i3.24388

Abstract

Diabetes is a lifelong chronic disease that disrupts blood sugar regulation. It is a life-threatening condition that, if left untreated, can lead to death and other health problems. Several medical tests, including the glycated hemoglobin (A1C) test, the blood sugar test, the oral glucose tolerance test, and the fasting blood sugar test, can be used to detect diabetes. According to statistics, high glucose levels are one of the problems associated with diabetes. This study aims to categorize patients into diabetic and non-diabetic groups using the diagnostic measurements included in the dataset. The researchers used 1,500 patient records with 9 attributes and 2 classes. The study applied machine learning techniques, namely Logistic Regression and Random Forest, and evaluated them with a confusion matrix and Receiver Operating Characteristic (ROC) analysis. The Random Forest method achieved 97% accuracy, 97% precision, 100% recall, and a 98% F1-score, a good result that can still be improved. Based on the accuracy findings, Random Forest is more effective than Logistic Regression.
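The reported metrics are all derived from the confusion matrix. The sketch below shows the standard formulas and checks that the reported precision and recall are consistent with the reported F1-score. The confusion-matrix counts are hypothetical; the abstract does not give the actual counts.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from(precision, recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical counts chosen only to give 97% precision and 100% recall.
p, r, f1 = precision_recall_f1(tp=97, fp=3, fn=0)

# Consistency check on the figures quoted in the abstract.
reported_f1 = f1_from(0.97, 1.00)  # about 0.985, which rounds to 98%
```

A recall of 100% means no diabetic patient in the test data was missed, while the 97% precision means a small share of the positive predictions were false alarms.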

Klasifikasi Penyakit Serangan Jantung Menggunakan Metode Machine Learning K-Nearest Neighbors (KNN) dan Support Vector Machine (SVM)
Arif, Siti Novianti Nuraini; Siregar, Amril Mutoi; Faisal, Sutan; Juwita, Ayu Ratna
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 8, No 3 (2024): Juli 2024
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v8i3.7844

Abstract

Cardiovascular disease (CVD) is a general term for disorders of the heart, coronary arteries, and blood vessels. These diseases are most commonly caused by blocked blood vessels, due either to fat buildup or to internal bleeding. According to the WHO, cardiovascular diseases account for 32% of all deaths each year, about 17.9 million people. The many factors that cause CVD make it hard for doctors to tell whether a patient is at low or high risk of a heart attack, so a machine learning model is needed for early recognition of heart attack symptoms. Previous studies used supervised learning models such as KNN and SVM without feature selection, on datasets obtained from Kaggle. In this study, PCA was applied to reduce the data to a smaller number of variables. Evaluated with a confusion matrix and ROC curve, the KNN model's accuracy improved from the 83.6% of the previous model to 90.16%, and the SVM model's accuracy rose from 85.7% to 86.88%. Using PCA for feature selection thus improved accuracy in the study. The KNN model, with the higher accuracy of 90.16%, is the better choice for classifying individuals as normal or diagnosed with a heart attack.
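For readers unfamiliar with KNN, the sketch below shows the basic idea: a query is assigned the majority label among its k nearest training points. The two-feature toy data, the labels, and the choice of k = 3 are assumptions for illustration only; the study's features, PCA step, and tuning are not reproduced here.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled feature vectors with two classes.
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["normal"] * 3 + ["heart attack"] * 3

pred_low = knn_predict(train_X, train_y, (2, 2))
pred_high = knn_predict(train_X, train_y, (7, 7))
```

Because KNN relies on distances, reducing the number of dimensions beforehand (as the study does with PCA) can change which neighbours count as nearest, which is one plausible reason accuracy moves after that step.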

OPTIMAL STUDY OF REAL-ESTATE PRICE PREDICTION MODELS USING MACHINE LEARNING
Maulana, Ikhsan; Siregar, Amril Mutoi; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 4 (2024): JUTIF Volume 5, Number 4, August 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.4.2565

Abstract

Everyone wants a place to live, ideally one close to work and shopping centers, with easy transportation, a low crime rate, and so on. Pricing must therefore take external factors into account, not just the house itself, which makes setting a price difficult for some people. The aim of this research is to predict real-estate prices with these factors taken into account. The predictions are useful for sellers who have difficulty setting prices and for prospective buyers who are unsure how to plan financially for a house in their desired neighborhood. The dataset was obtained from Kaggle and consists of 506 samples with 14 attributes. Several machine learning algorithms were used to predict real-estate prices: Extra Trees (ET), Support Vector Regression (SVR), Random Forest (RF), eXtreme Gradient Boosting (XGB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), and CatBoost. Principal Component Analysis (PCA) was applied as the feature selection technique after preprocessing and before model building. The most accurate model is CatBoost with GridSearchCV; because it was cross-validated, there is little chance of overfitting on new data. The SVR model with a poly kernel, 10 principal components (PCs), and GridSearchCV achieved an R2 score of 0.87, a high value close to that of CatBoost with GridSearchCV.
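The R2 score quoted above is the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it in plain Python. The price values are hypothetical and are not taken from the study's dataset.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical true and predicted prices (arbitrary units).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.5]
score = r2_score(actual, predicted)
```

A score of 1 means the predictions match the targets exactly, and a score of 0 means the model does no better than always predicting the mean price.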

IMPLEMENTATION OF DIABETES PREDICTION MODEL USING RANDOM FOREST ALGORITHM, K-NEAREST NEIGHBOR, AND LOGISTIC REGRESSION
Pratama, Rio; Siregar, Amril Mutoi; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Teknik Informatika (Jutif) Vol. 5 No. 4 (2024): JUTIF Volume 5, Number 4, August 2024
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2024.5.4.2593

Abstract

Diabetes is a serious metabolic disease that can cause various health complications. With more than 537 million people worldwide living with diabetes in 2021, early detection is crucial to preventing further complications. This research aims to predict the risk of diabetes using three machine learning algorithms, Random Forest (RF), K-Nearest Neighbor (KNN), and Logistic Regression (LR), with the diabetes dataset from UCI. Previous research has explored a variety of algorithms and techniques, with varying accuracy. This research uses a dataset from Kaggle consisting of 768 records with 8 parameters, which is processed with pre-processing and data normalization techniques. The models were evaluated with accuracy, the confusion matrix, and ROC-AUC. Logistic Regression performed best, with 77% accuracy and an AUC of 0.83, compared to KNN (75% accuracy, AUC 0.81) and Random Forest (74% accuracy, AUC 0.81). These findings emphasize the importance of appropriate algorithm selection and good data pre-processing in diabetes risk prediction. The study concludes that Logistic Regression is the most effective method for predicting diabetes risk on the dataset used.
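The AUC values above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes AUC directly from that pairwise definition. The labels and scores are hypothetical; a library routine would normally be used on real data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for six patients (1 = diabetic).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(toy_labels, toy_scores)  # 8 of the 9 pairs are ordered correctly
```

Unlike accuracy, AUC does not depend on a single decision threshold, which is why two models with similar accuracy can still differ in AUC.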

Peramalan Tren Musiman Jumlah Mahasiswa Baru Dengan Triple Exponential Smoothing Multiplicative
Wicaksana, Yusuf Eka; Faisal, Sutan; Munzi, Gugy Guztaman
Syntax : Jurnal Informatika Vol. 13 No. 02 (2024): Oktober 2024
Publisher : Universitas Singaperbangsa Karawang

Abstract

Knowing the number of new students helps with planning university resources and with optimizing marketing and recruitment strategies. Previous research could not detect seasonal patterns in the data, so this study uses multiplicative triple exponential smoothing to optimize forecasting of trend and seasonality. With smoothing parameters α = 0.1, β = 0.4, and γ = 0.8, the method yields an RMSE of 3.32, so it can serve as a reference for forecasting the number of new students at Universitas Buana Perjuangan Karawang.
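Multiplicative triple exponential smoothing (Holt-Winters) keeps three running estimates: a level, a trend, and one seasonal factor per position in the season. The sketch below is a minimal version using the textbook update equations with the α, β, and γ values from the abstract. The initialization scheme, the season length, and the toy series are assumptions for illustration; the study's enrollment data and setup are not reproduced.

```python
import math


def holt_winters_multiplicative(series, season_len, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with multiplicative seasonality."""
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization (an assumption): level = mean of the first season,
    # trend = average per-step change between the first two seasons.
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m
    trend = (sum(second) - sum(first)) / (m * m)
    seasonals = [y / level for y in first]

    for t in range(m, len(series)):
        y, s_old, prev_level = series[t], seasonals[t % m], level
        level = alpha * (y / s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (y / level) + (1 - gamma) * s_old

    n = len(series)
    return [(level + h * trend) * seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical series: a repeating 4-step seasonal pattern with no trend.
toy_series = [10, 20, 30, 40] * 3
forecast = holt_winters_multiplicative(toy_series, 4, alpha=0.1, beta=0.4, gamma=0.8, horizon=4)
```

On this trend-free toy series the forecast reproduces the seasonal pattern. On real data, RMSE computed against held-out observations is the kind of figure the abstract reports.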

Implementasi Algoritma Convolutional Neural Network dan YOLOV8 Untuk Klasifikasi Ras Kucing
Adinata, Abdul Rohim; Rohana, Tatang; Baihaqi, Kiki Ahmad; Faisal, Sutan
Building of Informatics, Technology and Science (BITS) Vol 6 No 3 (2024): December 2024
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v6i3.5913

Abstract

The domestic cat, scientific name Felis catus, is a very popular pet kept in many parts of the world. There are many cat breeds, each with its own characteristics, such as style, body shape, fur, and color. Because there are so many breeds, each with unique traits, ordinary people often find it hard to tell them apart. Technology is therefore needed to identify and differentiate cat breeds. By comparing the Convolutional Neural Network (CNN) and YOLOV8 methods, this research aims to develop a cat breed classification model. It uses data from six cat breeds: Bengal, Bombay, Himalayan, Local, Persian, and Sphynx. There are 1,200 images in all, 200 per breed. Before training, the data is preprocessed: images are resized and rescaled for the CNN method and labeled for YOLOV8. The dataset is split into 80% training data and 20% validation data. Both models are trained with the same parameters: a learning rate of 0.001, a batch size of 15, and 100 epochs. In tests with the confusion matrix, the YOLOV8 model performs best, with 99% accuracy, 96.1% precision, 98.4% recall, and a 97.2% F1-score.
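An 80/20 split of 1,200 images with 200 per breed gives 960 training and 240 validation images. The sketch below shows one way to make such a split per breed so that each class keeps the same proportion. Whether the authors stratified their split is not stated, and the file names here are hypothetical placeholders.

```python
import random
from collections import defaultdict


def stratified_split(items, labels, val_fraction=0.2, seed=0):
    """Split items into train/validation lists, class by class."""
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, val = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val += [(x, label) for x in group[:n_val]]
        train += [(x, label) for x in group[n_val:]]
    return train, val


breeds = ["Bengal", "Bombay", "Himalayan", "Local", "Persian", "Sphynx"]
# Hypothetical file names, 200 per breed as described in the abstract.
images = [f"{b}_{i:03d}.jpg" for b in breeds for i in range(200)]
image_labels = [b for b in breeds for _ in range(200)]

train_set, val_set = stratified_split(images, image_labels)
```

Splitting per class keeps 40 validation images for every breed, so a validation metric is not dominated by whichever breed happened to be sampled most.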

Perbandingan Algoritma K-Means dan K-Medoids untuk Clustering Pada Transaksi Penjualan Minimarket
Alganiu, Ajeng Shalwa; Juwita, Ayu Ratna; Rahmat, Rahmat; Faisal, Sutan
Journal of Computer System and Informatics (JoSYC) Vol 6 No 1 (2024): November 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/josyc.v6i1.5873

Abstract

When shopping, buyers often have difficulty finding daily necessities. One cause is that products in minimarkets are still arranged randomly rather than according to consumer shopping patterns. In addition, buyers often want to buy daily necessities as packages, but such packages are usually not yet available in minimarkets. Identifying relationship patterns in minimarket transaction data can help solve these product arrangement and packaging problems. Clustering groups objects that have many similarities with one another; its methods include K-Means and K-Medoids. The purpose of this study is to group the minimarket's goods data as a guide for more orderly product planning. The data is grouped into three categories: slow, medium, and fast. The two algorithms produce different Davies-Bouldin Index values: K-Medoids obtains the lower value of 0.50387, while K-Means obtains 0.50391, so the K-Medoids clustering is of better quality than the K-Means clustering. With these groupings, minimarkets can balance their stock to prevent excess or shortage of inventory.
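The Davies-Bouldin Index used above averages, over all clusters, the worst-case ratio of within-cluster scatter to the distance between cluster centers, so lower values mean tighter and better-separated clusters. The sketch below computes it in plain Python using cluster means as centers, which is a common convention and an assumption here. The toy points are hypothetical.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin Index with cluster means as centers (lower is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    # Center of each cluster: coordinate-wise mean of its points.
    centers = {lab: tuple(sum(c) / len(pts) for c in zip(*pts))
               for lab, pts in clusters.items()}
    # Scatter of each cluster: mean distance of its points to the center.
    scatter = {lab: sum(math.dist(p, centers[lab]) for p in pts) / len(pts)
               for lab, pts in clusters.items()}

    total = 0.0
    for i in clusters:
        total += max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                     for j in clusters if j != i)
    return total / len(clusters)


# Hypothetical toy data: two clusters with scatter 1, centers 10 apart.
toy = [(0, 0), (0, 2), (10, 0), (10, 2)]
dbi = davies_bouldin(toy, [0, 0, 1, 1])
```

The difference reported in the abstract (0.50387 versus 0.50391) is in the fifth decimal place, so the two clusterings are of nearly identical quality by this measure.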

Perbandingan Kinerja Algoritma Decision Tree dan Random Forest untuk Klasifikasi Nutrisi pada Makanan Cepat Saji
Yaman, Nuurul Izzati; Juwita, Ayu Ratna; Lestari, Santi Arum Puspita; Faisal, Sutan
Jurnal Algoritma Vol 21 No 2 (2024): Jurnal Algoritma
Publisher : Institut Teknologi Garut

DOI: 10.33364/algoritma/v.21-2.1649

Abstract

Fast food has become an important part of the busy modern lifestyle and is popular because it makes eating easy and convenient. Young people today are very fond of instant food. However, excessive consumption of instant food can trigger various health problems, including obsessive eating patterns. This creates a need for more accurate analysis methods to classify fast-food nutrition data; the goal of classification is to obtain a decision tree model that can be used to anticipate and observe how the variables in the data relate to one another. In comparing the performance of the Decision Tree and Random Forest algorithms on fast-food nutrition data, all variables were found to be correlated. The implementation results show that both models have remarkable capability. On the same dataset, Random Forest outperformed Decision Tree with an accuracy of 66.67%, while Decision Tree reached only 55.56%, indicating that Random Forest gives more accurate predictions for the test-data classes. In addition, the ensemble nature of Random Forest, in which several decision trees are combined, gives it an advantage in handling data complexity and improving model generalization. These results show that ensemble learning can improve prediction performance and reliability when building classification models, especially for complex datasets.
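The advantage attributed to Random Forest above comes from combining the votes of several trees. The sketch below illustrates only that voting step: three imperfect predictors that err on different samples can be more accurate together than any one of them alone. The labels and per-tree predictions are hypothetical, and no trees are actually trained here.

```python
from collections import Counter


def majority_vote(predictions_per_tree):
    """Combine per-tree prediction lists into one label per sample."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_tree)]


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical test labels and the outputs of three imagined trees,
# each wrong on a different sample.
truth = ["high", "low", "low", "high", "low"]
tree_a = ["high", "low", "high", "high", "low"]
tree_b = ["low", "low", "low", "high", "low"]
tree_c = ["high", "low", "low", "low", "low"]

ensemble = majority_vote([tree_a, tree_b, tree_c])
single_scores = [accuracy(truth, t) for t in (tree_a, tree_b, tree_c)]
ensemble_score = accuracy(truth, ensemble)
```

The gain depends on the trees making different mistakes. If all trees erred on the same samples, voting would not help, which is why Random Forest trains each tree on a different resample of the data and a random subset of features.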