Articles

DIAGNOSIS PENDERITA PENYAKIT KANKER PARU MENGGUNAKAN SUPPORT VECTOR MACHINE DAN NAÏVE BAYES [Diagnosis of Lung Cancer Patients Using Support Vector Machine and Naïve Bayes] Muhammad Iqbal Yunan Helmi; Dian Anggraeni; Alfian Futuhul Hadi
STATISTIKA: Forum Teori dan Aplikasi Statistika Vol 21, No 1 (2021)
Publisher : Program Studi Statistika Unisba

DOI: 10.29313/jstat.v21i1.7566

Abstract

According to the data, lung cancer is the type of cancer that causes the most deaths, reaching 1.7 million deaths per year. The disease is caused by many factors, one of which is genetics. This study diagnoses lung cancer using the Support Vector Machine (SVM) and Naïve Bayes methods. Naïve Bayes is a simple probability-based prediction technique built on a model of independent features, whereas classification with SVM can be described simply as the search for a hyperplane that serves as the best separating function between two different classes in the input space. The study compares SVM and Naïve Bayes to determine which method has the better accuracy. The microarray data used in this study cover 80 individuals, each with 2408 gene expression values. A total of 60 individuals belong to the cancer class and 20 individuals to the normal class. The results show that SVM achieves an accuracy of 90% and Naïve Bayes an accuracy of 75%.
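The independent-feature assumption behind Naïve Bayes can be sketched in a few lines. The following is a minimal Gaussian Naïve Bayes in plain Python on made-up toy values; it is not the authors' implementation or data, and the function names `fit_gnb` and `predict_gnb`, the two hypothetical genes, and the variance floor are assumptions for illustration only.

```python
import math
from collections import defaultdict

def fit_gnb(X, y, var_floor=1e-9):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gnb(model, row):
    """Return the class with the highest log posterior.

    Features are treated as independent given the class, so their
    Gaussian log-likelihoods are simply added.
    """
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy expression levels for two hypothetical genes (illustrative values only).
X_train = [[5.1, 2.0], [4.8, 2.2], [5.3, 1.9], [2.0, 4.1], [2.3, 3.8], [1.9, 4.0]]
y_train = ["cancer", "cancer", "cancer", "normal", "normal", "normal"]
model = fit_gnb(X_train, y_train)

X_test = [[5.0, 2.1], [2.1, 3.9]]
y_test = ["cancer", "normal"]
predictions = [predict_gnb(model, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
```

Accuracy here is the share of test individuals whose predicted class matches the true class, which is the quantity the abstract reports for both methods.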
Principal Component Regression in Statistical Downscaling with Missing Value for Daily Rainfall Forecasting M Dika saputra; Alfian Futuhul Hadi; Abduh Riski; Dian Anggraeni
International Journal of Quantitative Research and Modeling Vol 2, No 3 (2021)
Publisher : Research Collaboration Community (RCC)

DOI: 10.46336/ijqrm.v2i3.151

Abstract

Drought is a serious problem that often arises during the dry season. Hydrometeorologically, drought is caused by reduced rainfall over a certain period, so up-to-date measures are needed to address it. This research aims to predict the potential for drought to recur in Kupang City, Indonesia, by developing a rainfall forecasting model. Incomplete daily local climate data for Kupang City are an obstacle to this analysis. The missing values were therefore imputed using the Kalman Filter with an ARIMA state-space model. The Kalman Filter with the ARIMA(2,1,1) state-space model produced the best imputation, with a Root Mean Square Error (RMSE) of 0.930. Rainfall forecasting was carried out using Statistical Downscaling with a Principal Component Regression (PCR) model that takes into account global atmospheric circulation from the General Circulation Model (GCM). The resulting PCR model performed quite well, with a Mean Absolute Percentage Error (MAPE) of 2.81%. The model is used to predict the daily rainfall of Kupang City from GCM data.
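The two error measures quoted in this abstract can be written down directly. The sketch below uses their standard definitions on made-up numbers; the helper names `rmse` and `mape` and the sample values are assumptions, and the choice to reject zero observations in MAPE is one possible convention, not necessarily the one the authors used.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    The measure is undefined when an observed value is zero,
    so such inputs are rejected here.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero observations")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Illustrative values only, not data from the study.
observed = [10.0, 20.0, 40.0]
forecast = [11.0, 18.0, 40.0]
rmse_value = rmse(observed, forecast)
mape_value = mape(observed, forecast)
```

RMSE is expressed in the units of the data, while MAPE is a relative error, so the two figures in the abstract are not directly comparable with each other.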
PENERAPAN METODE EXTENDED KALMAN FILTER PADA KASUS PERTUMBUHAN PENDUDUK KABUPATEN JEMBER [Application of the Extended Kalman Filter Method to Population Growth in Jember Regency] Rory Ronella Agustin; Kosala Dwidja Purnomo; Alfian Futuhul Hadi
MathVisioN Vol 1 No 02 (2019): September 2019
Publisher : Prodi Matematika FMIPA Unirow Tuban

Abstract

This study estimates the population of Jember Regency using the Extended Kalman Filter (EKF) and determines the appropriate logistic growth model for predicting the future population of Jember. Two logistic growth models are compared: one assuming a linear population function and one assuming a parabolic population function. To assess the efficiency of the Extended Kalman Filter, trials were run using 6, 14, and 28 measurements. The data were taken from the Central Statistics Agency of East Java Province for 1990-2017. The study indicates that the logistic growth model assuming a parabolic population function is more appropriate than the one assuming a linear population function for the population of Jember during 1990-2017. The Extended Kalman Filter increases confidence in the estimates, as indicated by a decreasing average norm of the error covariance. The more data are used, the better the Extended Kalman Filter estimates become and the closer they are to the real data.
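The predict/update cycle of an Extended Kalman Filter on a logistic growth model can be sketched as below. This is a scalar illustration under stated assumptions, not the authors' model: the Euler discretization, the direct measurement of the population, and every numeric value (`r`, `K`, noise variances, initial guess) are hypothetical, and the linear and parabolic variants compared in the study are not reproduced.

```python
def logistic_step(x, r, K, dt=1.0):
    """One Euler step of dP/dt = r * P * (1 - P / K)."""
    return x + r * x * (1.0 - x / K) * dt

def ekf_logistic(measurements, x0, p0, r, K, process_var, meas_var, dt=1.0):
    """Scalar EKF: nonlinear logistic prediction, direct measurement of the state."""
    x, p = x0, p0
    estimates, variances = [], []
    for z in measurements:
        # Predict: linearize the growth model around the current estimate.
        jacobian = 1.0 + r * (1.0 - 2.0 * x / K) * dt
        x = logistic_step(x, r, K, dt)
        p = jacobian * p * jacobian + process_var
        # Update: blend the prediction with the new measurement.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
        variances.append(p)
    return estimates, variances

# Synthetic, noise-free "census" series in millions (hypothetical values).
r, K = 0.05, 3.0
truth = [2.0]
for _ in range(28):
    truth.append(logistic_step(truth[-1], r, K))
measurements = truth[1:]

# Start the filter from a deliberately wrong initial guess.
estimates, variances = ekf_logistic(
    measurements, x0=1.8, p0=0.1, r=r, K=K, process_var=1e-4, meas_var=0.01
)
final_error = abs(estimates[-1] - truth[-1])
```

In this toy run the error variance shrinks as measurements accumulate, which mirrors the abstract's observation that more data give estimates closer to the real values.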
DIVERSIFIKASI USAHA KELOMPOK PENJUAL DURIAN MELALUI OLAHAN LIMBAH BUAH DURIAN [Business Diversification for a Durian Sellers' Group through Processed Durian Fruit Waste] Halimatus Sa'diyah; Alfian Futuhul Hadi; Nasrul Ilminnafik
Martabe : Jurnal Pengabdian Kepada Masyarakat Vol 5, No 2 (2022): Martabe : Jurnal Pengabdian Kepada Masyarakat
Publisher : Universitas Muhammadiyah Tapanuli Selatan

DOI: 10.31604/jpm.v5i2.550-558

Abstract

Pakusari Krajan Village in Jember Regency has a durian sales center consisting of a row of permanent stalls selling fresh durian. Many buyers now eat their durian at the stalls, so durian waste in the form of seeds and rinds piles up and the surroundings become dirty. The durian sellers, who are the partners in this activity, sell only one kind of product, lack the knowledge to grow their business, and have no business management in place. This community service activity aims to resolve these problems by turning durian waste into processed food and beverage products and by applying appropriate-technology machines to assist the processing. The processed durian-waste products diversify what the sellers can offer. Another goal is to give the partners knowledge of business management. The methods used were counseling and training, together with hands-on practice in processing and in using appropriate technology in production. The partners took part enthusiastically. All programs were followed well by the partners and are expected to have a positive effect on them in terms of science, technology, and arts development, the economy, and the environment. The activity encourages product diversification, with the partners selling processed durian-waste products in addition to the fresh durian they usually sell. The environment has also become cleaner and free of durian waste. The partners also better understand the importance of business management, including through simple cash bookkeeping.
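Rancang Bangun Data Warehouse dan R Studio Serta Pemanfaatanya dalam Peramalan Pola Konsumsi Masyarakat di Kabupaten Jember [Design and Development of a Data Warehouse and R Studio and Their Use in Forecasting Community Consumption Patterns in Jember Regency] Lutfi Ali Muharom; Alfian Futuhul Hadi; Dian Anggraeni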
JUSTINDO (Jurnal Sistem dan Teknologi Informasi Indonesia) Vol 1, No 1 (2016): JUSTINDO
Publisher : Universitas Muhammadiyah Jember

DOI: 10.32528/justindo.v1i01.244

Abstract

Data records need to be processed and stored well. A data warehouse is a data processing approach used to support decision-making. The data warehouse process starts with collecting, selecting, and designing the data and then uploading them into the data warehouse. In this research, we use SUSENAS data from 1997 to 2012. We took the daily consumption data (household expenditure) to be processed in the data warehouse. A web-based R Studio implementation makes it easier for users to access R, which can then be reached from any device that has a browser and internet access. Connecting R Studio to the data warehouse simplifies access to and processing of the data. From forecasting consumption patterns (staple food) in Jember, we conclude that the best forecasting method is the AR(1) model. Because of the limited data, the ensemble method did not turn out to be the best method, although it was expected to be.
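An AR(1) model predicts each value from the previous one, x_t = c + phi * x_(t-1) + error. The sketch below fits `c` and `phi` by ordinary least squares and iterates the recursion forward. It is written in Python rather than R for illustration, the helper names `fit_ar1` and `forecast_ar1` are hypothetical, and the series is synthetic, not SUSENAS data.

```python
def fit_ar1(series):
    """Least-squares estimates of (c, phi) in x_t = c + phi * x_(t-1)."""
    previous, current = series[:-1], series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    covariance = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(previous, current))
    variance = sum((a - mean_prev) ** 2 for a in previous)
    phi = covariance / variance
    c = mean_curr - phi * mean_prev
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion forward from the last observed value."""
    forecasts = []
    for _ in range(steps):
        last_value = c + phi * last_value
        forecasts.append(last_value)
    return forecasts

# Synthetic series generated exactly by x_t = 2 + 0.5 * x_(t-1).
series = [10.0, 7.0, 5.5, 4.75, 4.375, 4.1875]
c, phi = fit_ar1(series)
next_values = forecast_ar1(series[-1], c, phi, steps=3)
```

With only a short series, as the abstract notes for this study, a simple model of this kind has few parameters to estimate, which is one plausible reason it can outperform a more elaborate ensemble.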
Keterampilan Statistika dan Data Science: Manfaatnya di Berbagai Bidang pada Era Digital [Statistics and Data Science Skills: Their Benefits in Various Fields in the Digital Era] Alfian Futuhul Hadi; Halimatus Sa'diyah
Abdimas Universal Vol. 4 No. 2 (2022): Oktober
Publisher : Lembaga Penelitian dan Pengabdian kepada Masyarakat Universitas Balikpapan (LPPM UNIBA)

DOI: 10.36277/abdimasuniversal.v4i2.245

Abstract

Data scientist has been a trending career in recent years. Survey results show that demand for data scientists is high, but availability is very low and capabilities are limited. A data scientist needs statistics and programming skills. From a student's point of view, however, statistics is considered the same as mathematics, so it is less desirable because it is seen as mathematical and difficult. Students, as prospective workers, need insight into ways of working and skills in the era of the industrial revolution 4.0. Failure to adapt to this new era will lead to increased unemployment among the working-age population in the future. It is important for the younger generation to understand statistics and data science skills, including the job opportunities that exist, so efforts are needed to disseminate this knowledge. It is hoped that, with this understanding, the younger generation will be interested in choosing a digital-related profession. The target audience was students of SMAN 3 Jember. The method used was a presentation followed by an evaluation using a questionnaire. More than 79% of students rated the material presented as important, interesting, and up to date. Students could also understand most of the material given, and their level of interest in both fields was very high. The activity was generally successful.
THE GGE BIPLOT ON RCIM MODEL FOR ASSESSING THE GENOTYPE-ENVIRONMENT INTERACTION WITH SIMULATING OUTLIERS: ROBUSTNESS IN R-SQUARED PROCRUSTES Alfian Futuhul Hadi; Halimatus Sa'diyah; Dimas Bagus Cahyaningrat Wicaksono
MEDIA STATISTIKA Vol 15, No 2 (2022): Media Statistika
Publisher : Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

DOI: 10.14710/medstat.15.2.209-219

Abstract

Genotype by environment interaction (GEI) analysis has usually been done with the Additive Main Effects and Multiplicative Interaction (AMMI) model with Biplot features, and recently the Row Column Interaction Model (RCIM) has become an alternative. In the Biplot of genotype (G) and genotype by environment (GE) interactions, known as the GGE Biplot, the main effect of environment (E) is removed, while the main effect of G and the GE interaction effect are kept and combined. Continuing our recent research on the robustness of the GGE Biplot in RCIM models, this paper aims to develop the GGE Biplot by RCIM model to analyze GEI with outlying observations. We applied the RCIM model with the Asymmetric Laplace Distribution (ALD) to simulated data with scattered and single-environment outliers to evaluate the robustness of the GGE Biplot. Robustness was evaluated using the R-squared statistic of the Procrustes analysis. The results show that the GGE Biplot of RCIM with the ALD family function is more robust than with the Gaussian family. A noticeable superiority of the GGE Biplot with RCIM ALD appeared as the percentage of single-environment outliers reaches the number of rows of the data matrix.
Application of SHAP on CatBoost classification for identification of variabels characterizing food insecurity occurrences in Aceh Province households MUHAMMAD SUBIANTO; INA YATUL ULYA; EVI RAMADHANI; BAGUS SARTONO; ALFIAN FUTUHUL HADI
Jurnal Natural Volume 23 Number 3, October 2023
Publisher : Universitas Syiah Kuala

DOI: 10.24815/jn.v23i3.33548

Abstract

Classification is the process of building a model that can distinguish between different classes of data. The model aims to predict the class of testing data based on patterns or relationships learned from training data. One of the algorithms used to build classification models is Categorical Boosting (CatBoost). In general, however, the resulting models are difficult to interpret. To facilitate the interpretation of complex classification models, methods such as SHAP (SHapley Additive exPlanations) are needed. SHAP is a method for explaining individual predictions and is based on the game-theoretically optimal Shapley values. In this study, an analysis of important SHAP variables was conducted on the CatBoost classification model to identify variables characterizing occurrences of food insecurity in households. The data were obtained from the Survei Sosial Ekonomi Nasional (Susenas) of March 2021 in Aceh Province, sourced from the Badan Pusat Statistik (BPS), and contain 13,126 observations. Of the four classification models evaluated on the testing data, the best had accuracy, sensitivity, specificity, and AUC values of 0.703, 0.349, 0.798, and 0.637, respectively. The analysis of important SHAP variables showed that the number of household members who smoke, education of the household head, wall type, drinking water source, and decent sanitation contributed significantly to the occurrences of food insecurity in households in Aceh Province in 2021.
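The Shapley values that SHAP builds on can be computed exactly for a very small example by averaging each feature's marginal contribution over all feature orderings. The sketch below does this by brute force; it is not the SHAP library or its tree-specific algorithm for CatBoost, and the feature names and the scoring function `toy_value` are hypothetical stand-ins for a model's output.

```python
from itertools import permutations

def shapley_values(features, value):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {feature: 0.0 for feature in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for feature in order:
            extended = coalition | {feature}
            totals[feature] += value(extended) - value(coalition)
            coalition = extended
    return {feature: total / len(orderings) for feature, total in totals.items()}

def toy_value(coalition):
    """Hypothetical model score when only the features in `coalition` are known."""
    score = 0.0
    if "smoker_in_household" in coalition:
        score += 0.20
    if "low_education_head" in coalition:
        score += 0.10
    if "wall_type" in coalition:
        score += 0.05
    # An interaction term that is shared equally between the two features.
    if {"smoker_in_household", "low_education_head"} <= coalition:
        score += 0.10
    return score

features = ["smoker_in_household", "low_education_head", "wall_type"]
contributions = shapley_values(features, toy_value)
total_contribution = sum(contributions.values())
```

The contributions add up to the difference between the score with all features and the score with none, which is the additivity property that makes the values usable as a per-variable explanation. Enumerating every ordering grows factorially with the number of features, so this approach only works for toy cases.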
Survival Analysis of Sea Turtles Eggs Hatching Success using Cox non Proportional Hazard Regression Forestryani, Veniola; Fatekurohman, Mohamad; Hadi, Alfian Futuhul
Jurnal ILMU DASAR Vol 20 No 1 (2019)
Publisher : Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Jember

DOI: 10.19184/jid.v20i1.6531

Abstract

This research aims to identify both the model and the factors affecting the incubation period and hatching success of sea turtle eggs at Kuta, Legian, and Seminyak Beach, Bali, from January to September 2016. Survival analysis was carried out using Cox Non Proportional Hazard regression, and the resulting model was compared with a log-logistic regression model. Precipitation, location, temperature, humidity, and hours of daylight are the factors that significantly influence the incubation period and hatching success of sea turtle eggs. According to the descriptive analysis, 12 ≤ precipitation < 18, Seminyak Beach, 28.5 ≤ temperature < 29.5, 86 ≤ humidity ≤ 91, and 5.8 ≤ hours of daylight < 8.3 are the factor levels with the highest percentage of hatching success. Based on the hazard value, by contrast, 12 ≤ precipitation < 18, Seminyak Beach, 28.5 ≤ temperature < 29.5, 86 ≤ humidity ≤ 91, and 0.8 ≤ hours of daylight < 3.3 are the factor levels with the highest percentage of hatching success. Although Seminyak Beach has the highest rate of hatching success, it is not significantly different from Legian Beach with respect to the categories of the location factor.
Keywords: hatching success, cox non proportional hazard, log-logistic, survival analysis
Simple House Needs in Jember with Robust Small Area Estimation Murtinasari, Frida; Hadi, Alfian Futuhul; Anggraeni, Dian
Jurnal ILMU DASAR Vol 18 No 1 (2017)
Publisher : Fakultas Matematika dan Ilmu Pengetahuan Alam Universitas Jember

DOI: 10.19184/jid.v18i1.3159

Abstract

Small Area Estimation (SAE) is often used by researchers, especially statisticians, to estimate parameters of a subpopulation that has a small sample size. Empirical Best Linear Unbiased Prediction (EBLUP) is one of the indirect estimation methods in Small Area Estimation. When outliers are present in the data, these methods are not guaranteed to yield precise predictions. Robust regression is one approach used in Small Area Estimation models, and the resulting approach is known as Robust Small Area Estimation. Robust Small Area Estimation is divided into several approaches, namely Maximum Likelihood and M-Estimation. The results show that Robust Small Area Estimation with M-Estimation has the smallest RMSE of the methods compared: 1473.7 with outliers and 1279.6 without outliers. The research also indicates that REBLUP with M-Estimation is more robust to outliers. With EBLUP, the RMSE becomes five times larger when only one outlier is included in the data analysis, whereas the RMSE from the REBLUP method remains relatively stable.
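The effect of a single outlier, and the way M-Estimation dampens it, can be illustrated on the simplest possible problem: estimating a location parameter. The sketch below is a Huber M-estimator fitted by iteratively reweighted averaging; it is not REBLUP or any small area model, and the function name `huber_location`, the tuning constant 1.345, the MAD-based scale, and the data are assumptions chosen for illustration.

```python
import statistics

def huber_location(values, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    mu = statistics.median(values)
    mad = statistics.median([abs(v - mu) for v in values])
    scale = 1.4826 * mad if mad > 0 else 1.0
    threshold = c * scale
    for _ in range(max_iter):
        # Observations far from the current estimate get weights below one.
        weights = [
            1.0 if abs(v - mu) <= threshold else threshold / abs(v - mu)
            for v in values
        ]
        updated = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(updated - mu) < tol:
            return updated
        mu = updated
    return mu

# Seven ordinary observations and one gross outlier (illustrative values).
values = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 100.0]
plain_mean = statistics.mean(values)
robust_estimate = huber_location(values)
```

The ordinary mean is pulled far toward the outlier, while the M-estimate stays with the bulk of the data, which is the same qualitative behavior the abstract reports when comparing EBLUP with REBLUP.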