PERBANDINGAN AKURASI KLASIFIKASI MENGGUNAKAN ALGORITMA QUEST PADA SKENARIO DATA KODIFIKASI DAN NON-KODIFIKASI
Surya Prangga; Rito Goejantoro; Memi Nor Hayati; Siti Mahmuda; Dwi Husnul Mubiin
Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika Vol. 5 No. 1 (2024)
Publisher : LPPM Universitas Bina Bangsa

DOI: 10.46306/lb.v5i1.525

Abstract

Traffic accidents are difficult to predict in terms of when and where they will occur. The number of traffic accident cases in Indonesia is relatively high. According to data from the Central Statistics Agency (Badan Pusat Statistik) for 2020 and 2021, the average number of traffic accidents reaches one hundred thousand cases per year. Samarinda City, the capital of East Kalimantan Province, ranked highest in 2020 compared with several other regencies and cities in the province. Given these facts, traffic accidents need to be addressed to minimize accident-related casualties. One data mining technique used to analyze traffic accident patterns is decision tree-based classification, and one such method is the QUEST (Quick, Unbiased, Efficient, and Statistical Tree) algorithm, which can be used to classify the status of traffic accident victims. Based on the data analysis, the best accuracy in classifying victim status was obtained using the second data scenario with an 80:20 data split, giving an accuracy of 66.10% and an F1-score of 62.96%.
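As a rough illustration of the two metrics reported in this abstract, the sketch below computes accuracy and F1-score from confusion-matrix counts. It assumes a binary outcome, and the counts are hypothetical; they are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts for illustration only (not the paper's data).
acc, f1 = accuracy_and_f1(tp=8, fp=2, fn=2, tn=8)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so it can differ noticeably from accuracy when the classes are imbalanced.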
PENGELOMPOKAN KABUPATEN/KOTA DI PULAU KALIMANTAN PADA TAHUN 2020 DAN 2021 BERDASARKAN INDEKS PEMBANGUNAN MANUSIA MENGGUNAKAN METODE ALGORITMA ST-DBSCAN
Binda Aprilia Suryani; Memi Nor Hayati; Surya Prangga
JURNAL RISET PEMBANGUNAN Vol 6, No 1 (2023)
Publisher : BADAN PENELITIAN DAN PENGEMBANGAN DAERAH PROVINSI KALIMANTAN TIMUR

DOI: 10.36087/jrp.v6i1.139

Abstract

Clustering is a technique for analyzing distinct groupings in data. Spatial Temporal-Density Based Spatial Clustering of Applications with Noise (ST-DBSCAN) is a density-based clustering algorithm that can find groupings based on the spatial, temporal, and non-spatial data of objects. The aim of this study is to determine the Silhouette Coefficient (SC) value and the clustering results for Human Development Index (IPM) data on 56 regencies/cities on Kalimantan Island. The SC calculation involves the parameters Eps1 for spatial data, Eps2 for temporal data, and MinPts. The values of Eps1 and MinPts were simulated by trial and error to find the largest SC value, with Eps1 = 1 to Eps1 = 5, Eps2 = 2, and MinPts = 4 to MinPts = 6. The highest SC value obtained for the grouping of regencies/cities on Kalimantan Island using the ST-DBSCAN algorithm was 0.324, which formed 2 clusters: the first cluster with 42 regencies/cities, and cluster zero, or outliers, with 14 regencies/cities.
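The role of the three parameters can be sketched as a neighborhood query: a point counts as a neighbor only if it is within Eps1 in space and within Eps2 in time, and a point is a core point when its neighborhood reaches MinPts. This is a minimal sketch that follows the abstract's description of the parameters; the function names, the `(x, y, t)` point layout, and the toy coordinates are assumptions for illustration, not the authors' implementation.

```python
from math import hypot


def st_neighbors(points, i, eps1, eps2):
    """Indices of points within eps1 (space) and eps2 (time) of points[i].

    Each point is an (x, y, t) tuple.
    """
    xi, yi, ti = points[i]
    return [
        j
        for j, (x, y, t) in enumerate(points)
        if j != i and hypot(x - xi, y - yi) <= eps1 and abs(t - ti) <= eps2
    ]


def is_core(points, i, eps1, eps2, min_pts):
    # The point itself is counted, as in the common DBSCAN convention.
    return len(st_neighbors(points, i, eps1, eps2)) + 1 >= min_pts


# Toy data: four nearby observations and one isolated point.
toy = [(0, 0, 2020), (1, 0, 2020), (0, 1, 2021), (1, 1, 2021), (10, 10, 2020)]
print(st_neighbors(toy, 0, eps1=1.5, eps2=2))      # neighbors of the first point
print(is_core(toy, 4, eps1=1.5, eps2=2, min_pts=4))  # isolated point
```

Points that are neither core points nor reachable from one end up in the noise group, which corresponds to the "cluster zero" mentioned in the abstract.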
Pengelompokan Kabupaten/Kota Di Pulau Kalimantan Berdasarkan Indikator Indeks Pembangunan Manusia Tahun 2020 Menggunakan Optimasi K-Means Cluster Dengan Principle Component Analysis (PCA)
Anwar, Khoiril; Goejantoro, Rito; Prangga, Surya
EKSPONENSIAL Vol. 13 No. 2 (2022)
Publisher : Program Studi Statistika FMIPA Universitas Mulawarman

DOI: 10.30872/eksponensial.v13i2.1053

Abstract

Data mining is a technique or process for obtaining useful information from a large database. One of its tasks is grouping data. Cluster analysis aims to group objects based on the information found in the data. One cluster analysis method is K-Means, a non-hierarchical method that divides the data set into a number of non-overlapping groups. This study aims to group regencies/cities on the island of Kalimantan based on human development index indicators and to obtain the silhouette coefficient of the optimal clustering produced by the K-Means algorithm applied after principal component analysis. The data are the 2020 human development index data for regencies/cities on the island of Kalimantan, using 8 variables drawn from the human development index indicators. The optimal grouping obtained with K-Means after principal component analysis consists of 4 clusters: cluster 1 has 20 regencies/cities, cluster 2 has 3, cluster 3 has 26, and cluster 4 has 7. The silhouette coefficient for this 4-cluster solution is 0.540, which indicates a medium cluster structure.
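For readers unfamiliar with the validation measure, the silhouette coefficient can be sketched as follows: for each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the point's score is `(b - a) / max(a, b)`. The sketch assumes Euclidean distance and at least two clusters; the toy points are hypothetical.

```python
from math import dist


def silhouette_coefficient(points, labels):
    """Mean silhouette width over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # dist(p, p) is 0, so dividing by len(own) - 1 excludes the point itself
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Two well-separated toy clusters give a value close to 1.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(round(silhouette_coefficient(toy_points, [0, 0, 1, 1]), 3))
```

Values near 1 indicate compact, well-separated clusters, and values near 0 indicate overlapping ones.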
Aplikasi K-Nearest Neighbor Dengan Fungsi Jarak Gower Dalam Klasifikasi Kelulusan Mahasiswa: Studi Kasus : Mahasiswa Program Studi Statistika, Jurusan Matematika, Fakultas Matematika Dan Ilmu Pengetahuan Alam, Universitas Mulawarman
Fadil, Irfan; Goejantoro, Rito; Prangga, Surya
EKSPONENSIAL Vol. 13 No. 1 (2022)
Publisher : Program Studi Statistika FMIPA Universitas Mulawarman

DOI: 10.30872/eksponensial.v13i1.881

Abstract

The 2019 reaccreditation of the Statistics Study Program, Mulawarman University, left its accreditation at grade B. One of the assessment indicators used in reaccreditation is whether students graduate on time. It is therefore necessary to predict the graduation status of Statistics students at Mulawarman University. The prediction method used in this research is K-Nearest Neighbor (K-NN), a classification method that learns from previously classified data. The method is easy to understand and apply, and it is non-parametric, so no particular assumptions are needed. The independent variables were student profile attributes: gender, regional origin, cumulative Grade Point Average (GPA), and single tuition fee. The dependent variable is graduation status, namely graduating on time or not graduating on time. The data cover students of the Statistics Study Program, Mulawarman University, from 2014, 2015, and 2016. With k = 7 and an 80:20 split of training and testing data, the optimal accuracy was 0.909, with a TP rate of 0.500, a TN rate of 1.000, and an AUC of 0.75, which indicates fair classification.
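Because the predictors mix categorical and numeric variables, the title's Gower distance is a natural fit: numeric fields contribute a range-scaled absolute difference and categorical fields contribute 0 or 1. The sketch below is an unweighted version under that assumption; the field names, records, and range value are hypothetical and only illustrate the idea.

```python
from collections import Counter


def gower_distance(a, b, numeric_ranges):
    """Unweighted Gower distance between two records given as dicts.

    numeric_ranges maps each numeric field to its (max - min) range;
    every other field is treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            rng = numeric_ranges[key]
            total += abs(a[key] - b[key]) / rng if rng else 0.0
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


def knn_predict(train, labels, query, numeric_ranges, k):
    """Majority vote among the k training records closest to the query."""
    ranked = sorted(
        range(len(train)),
        key=lambda i: gower_distance(train[i], query, numeric_ranges),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical student records (not the study's data).
train = [
    {"gender": "F", "gpa": 3.8},
    {"gender": "F", "gpa": 3.6},
    {"gender": "M", "gpa": 2.4},
    {"gender": "M", "gpa": 2.6},
]
labels = ["on time", "on time", "late", "late"]
ranges = {"gpa": 1.4}  # max - min of GPA in the toy training set
print(knn_predict(train, labels, {"gender": "F", "gpa": 3.5}, ranges, k=3))
```

Scaling by the range keeps every variable's contribution between 0 and 1, so no single variable dominates the neighbor ranking.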
Optimasi Fuzzy C-Means Menggunakan Particle Swarm Optimization Untuk Pengelompokan Kabupaten/Kota Di Pulau Kalimantan (Studi Kasus : Data Indikator Kesejahteraan Rakyat Tahun 2020)
Febriyanti, Nur Afifah; Goejantoro, Rito; Prangga, Surya
EKSPONENSIAL Vol. 14 No. 1 (2023)
Publisher : Program Studi Statistika FMIPA Universitas Mulawarman

DOI: 10.30872/eksponensial.v14i1.1095

Abstract

Fuzzy C-Means (FCM) is a method that groups data by degree of membership, using the information in the data that describes each object. FCM has a weakness in determining the initial cluster centers; this can be addressed with Particle Swarm Optimization (PSO), which can be applied to search for the optimal cluster centers. The purpose of this research is to determine the optimal number of clusters based on the Partition Coefficient (PC) and Modified Partition Coefficient (MPC) validity indexes, and to obtain the grouping of regencies/cities using the FCMPSO method. With the PC and MPC validity indexes, the FCMPSO method yields an optimal solution of two clusters: the first consists of 33 regencies/cities on Kalimantan Island and the second of 23.
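The two validity indexes named above can be sketched from a fuzzy membership matrix. The sketch uses the commonly cited definitions, PC as the mean of squared memberships and MPC as a rescaling of PC to the range 0 to 1; the membership values are hypothetical, and the paper may use a different but equivalent formulation.

```python
def partition_coefficient(u):
    """PC for a membership matrix u, where u[i][k] is the membership of
    observation i in cluster k and each row sums to 1."""
    n = len(u)
    return sum(m * m for row in u for m in row) / n


def modified_partition_coefficient(u):
    """MPC = 1 - c / (c - 1) * (1 - PC), with c the number of clusters."""
    c = len(u[0])
    return 1 - (c / (c - 1)) * (1 - partition_coefficient(u))


# Hypothetical memberships for three observations and two clusters.
u = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
print(round(partition_coefficient(u), 4))
print(round(modified_partition_coefficient(u), 4))
```

Both indexes grow as memberships become crisper, so the number of clusters with the highest value is usually preferred.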
Klasifikasi Status Pembayaran Kredit Barang Elektronik dan Furniture Menggunakan Support Vector Machine
Casuarina, Indah Putri; Hayati, Memi Nor; Prangga, Surya
EKSPONENSIAL Vol. 13 No. 1 (2022)
Publisher : Program Studi Statistika FMIPA Universitas Mulawarman

DOI: 10.30872/eksponensial.v13i1.887

Abstract

Classification is the process of finding a model or function that can describe and differentiate data into classes. One classification method is the Support Vector Machine (SVM). SVM is a learning system that uses a hypothesis space of linear functions in a high-dimensional feature space, trained with a learning algorithm based on optimization theory and derived from statistical learning theory. The idea of SVM classification is to find the best hyperplane separating two classes of data, using a support vector approach. This study uses training/testing proportions of 50%:50%, 70%:30%, and 90%:10%, and an SVM with a polynomial kernel function with parameters =0.01, r = 0.5, d = 2, and C = 1. The study aims to determine the classification of the credit payment status for electronic goods and furniture and the classification accuracy of the SVM method. The data are 133 debtor records from PT. KB Finansia Multi Finance Bontang in 2020, with current and non-current credit payment status and 7 independent variables: age, number of dependents, length of residence, income, years of service, size of credit payments, and length of the credit term. The SVM classification gave an average accuracy of 72.25%; the best accuracy, 84.62%, was obtained with the 90%:10% split.
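The polynomial kernel mentioned in this abstract can be sketched in its common form, K(x, y) = (gamma * x·y + r)^d. The abstract's first parameter has lost its symbol; treating the value 0.01 as the scale factor `gamma` is an assumption made here for illustration, while r and d are taken from the abstract. The input vectors are hypothetical.

```python
def polynomial_kernel(x, y, gamma=0.01, r=0.5, d=2):
    """Polynomial kernel (gamma * <x, y> + r) ** d.

    gamma=0.01 is an assumed reading of the unnamed parameter in the abstract.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + r) ** d


# Hypothetical feature vectors.
print(round(polynomial_kernel((1, 2), (3, 4)), 4))
```

The penalty parameter C does not appear in the kernel itself; it controls the trade-off between margin width and training errors when the SVM is fitted.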
Perbandingan Algoritma C4.5 Dan Naïve Bayes Untuk Prediksi Ketepatan Waktu Studi Mahasiswa: Studi Kasus: Program Studi Statistika Universitas Mulawarman
Permana, Jordan Nata; Goejantoro, Rito; Prangga, Surya
EKSPONENSIAL Vol. 13 No. 2 (2022)
Publisher : Program Studi Statistika FMIPA Universitas Mulawarman

DOI: 10.30872/eksponensial.v13i2.947

Abstract

Classification is a statistical technique that assigns data to already-labeled classes by building a model from training data. Many methods can be used for classification, including Naïve Bayes and C4.5. The C4.5 algorithm builds a decision tree, while Naïve Bayes classifies based on probability. This study aims to determine the classification results of C4.5 and Naïve Bayes and the classification accuracy of the two methods. The variables used were graduation status, entrance route, gender, regional origin, GPA, and UKT group. The analysis showed an average accuracy of 61.99% for the C4.5 algorithm and 69.97% for Naïve Bayes, so Naïve Bayes is the better of the two methods for classifying student status.
Optimalisasi K-Means Cluster dengan Principal Component Analysis pada Pengelompokan Kabupaten/Kota di Pulau Kalimantan Berdasarkan Indikator Tingkat Pengangguran Terbuka
Rais, Muhammad; Goejantoro, Rito; Prangga, Surya
EKSPONENSIAL Vol. 12 No. 2 (2021)
Publisher : Program Studi Statistika FMIPA Universitas Mulawarman

DOI: 10.30872/eksponensial.v12i2.805

Abstract

Data mining, also called knowledge discovery in databases, is an activity that involves collecting and using historical data to find regularities, patterns, or relationships in large data sets, yielding useful new information. Cluster analysis aims to group data by similarity. This research uses the K-Means method combined with PCA. K-Means groups data into one or more clusters whose members share the same characteristics, while PCA is used to reduce the number of research variables. The method was applied to 2018 unemployment rate indicator data for districts/cities on Kalimantan Island. Cluster validation used the Davies-Bouldin Index (DBI). The analysis produced 2 principal components. The optimal grouping used 2 clusters, with a DBI value of 0.507: cluster 1 consists of 51 districts/cities and cluster 2 of 5. Cluster 1 has the highest percentage of poor population and the highest labor force participation rate compared with cluster 2, while cluster 2 has higher values than cluster 1 for the human development index, population, size of the labor force, number of unemployed, population density, and district/city minimum wage.
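The Davies-Bouldin Index used for validation here can be sketched as follows: each cluster's scatter is the mean distance of its members to the centroid, each pair of clusters gets the ratio of summed scatters to centroid distance, and the index averages each cluster's worst ratio. The sketch assumes Euclidean distance and at least two clusters with distinct centroids; the toy points are hypothetical.

```python
from math import dist


def davies_bouldin_index(points, labels):
    """Davies-Bouldin Index with Euclidean distance (lower is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    # Centroid: coordinate-wise mean of the cluster members
    centroids = {
        lab: tuple(sum(coord) / len(members) for coord in zip(*members))
        for lab, members in clusters.items()
    }
    # Scatter: mean distance of members to their centroid
    scatter = {
        lab: sum(dist(p, centroids[lab]) for p in members) / len(members)
        for lab, members in clusters.items()
    }

    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)


toy_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(round(davies_bouldin_index(toy_points, [0, 0, 1, 1]), 3))
```

Unlike the silhouette coefficient, smaller DBI values indicate better separation, so the candidate number of clusters with the lowest DBI is chosen.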
Pengelompokan Judul Laporan Skripsi Berbasis Text Mining dengan Metode Fuzzy K-Means
Nur Azizah, Noviani; Purnamasari, Ika; Prangga, Surya
METIK JURNAL Vol 8 No 1 (2024): METIK Jurnal
Publisher : LPPM Universitas Mulia

DOI: 10.47002/metik.v8i1.808

Abstract

Text mining is a branch of data mining. It can analyze documents, determine the similarity among documents, and group documents. Document grouping can be carried out with text mining methods combined with fuzzy k-means. Fuzzy k-means can assign each data item as a member of every cluster according to a degree of membership in the interval [0,1], and can give more accurate cluster assignments. The aim of this study is to determine the optimal number of groups and the resulting grouping of thesis report titles of students of the Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Mulawarman, for 2020-2022. The study uses the Davies-Bouldin index to validate the clustering results. Based on the analysis, the optimal grouping is six clusters, with a Davies-Bouldin index value of 3.646. The six clusters comprise 24 thesis report titles in cluster 1, 8 in cluster 2, 17 in cluster 3, 39 in cluster 4, 17 in cluster 5, and 29 in cluster 6.
PERAMALAN PEREDARAN UANG KARTAL DI INDONESIA MENGGUNAKAN MODEL HYBRID SARIMAX-NEURAL NETWORK
Juliarto, Handy Kurniawan; Purnamasari, Ika; Prangga, Surya
Jurnal Gaussian Vol 12, No 4 (2023): Jurnal Gaussian
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/j.gauss.12.4.465-476

Abstract

Economic stability is influenced by technological advances, which drive the digitization of the economy and increase demand for electronic and digital payment systems relative to physical currency. In certain months, such as during year-end holidays, the circulation of physical currency increases. This study aims to forecast total currency in circulation in Indonesia, taking calendar variation effects into account, using a hybrid method that combines SARIMAX and a neural network (NN). SARIMAX was used to capture linear effects related to calendar variation, while the NN was used to capture nonlinear patterns. The hybrid SARIMAX-NN models with 1 to 3 neurons yielded accurate forecasts, with Mean Absolute Percentage Error (MAPE) values below 2%. The highest accuracy was achieved by the hybrid model with 1 neuron, which had a MAPE of 1.38%. The forecasts also showed a consistent monthly increase, particularly during the December holiday season.
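The accuracy measure reported above, MAPE, is the mean of absolute forecast errors expressed as a percentage of the actual values. A minimal sketch follows, assuming no actual value is zero; the numbers are hypothetical and unrelated to the paper's series.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)


# Hypothetical actual and forecast values.
print(round(mape([100, 200, 400], [98, 204, 400]), 2))
```

Because each error is scaled by the actual value, MAPE allows comparison across series of different magnitudes, which is one reason it is popular for reporting forecast accuracy.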