Articles

Found 7 Documents

Classification of risk of death from heart disease or cigarette influence using the k-nearest neighbors (KNN) method Fadhilah, Muhammad Syafiq; Muzayanah, Rini
Journal of Student Research Exploration Vol. 2 No. 2: July 2024
Publisher : SHM Publisher

DOI: 10.52465/josre.v2i2.359

Abstract

Heart disease is one of the leading causes of death in Indonesia, and alongside coronary heart disease, smoking is a leading contributor to the country's death rate. This study aims to analyze the risk of death using heart disease history and smoking history as the main variables. It classifies the risk of death among heart disease sufferers and smokers using the K-Nearest Neighbors (KNN) algorithm. The results showed that the KNN model had an accuracy of 52.38% in predicting the risk of death of smokers and heart disease patients. Confusion matrix analysis revealed that the model performed well in predicting classes 0 and 2 but had difficulty with class 1. This study shows that KNN can be used to predict the risk of death of smokers and patients with heart disease with a satisfactory success rate.
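
The abstract does not include the authors' code, but the setup can be sketched with scikit-learn. The features, labels, and k value below are illustrative assumptions, not the paper's actual data or configuration:

```python
# Minimal sketch of a KNN death-risk classifier; the feature columns
# (heart disease history, smoking history, age) and the three-class
# label are assumptions standing in for the paper's dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
# Hypothetical records: [has_heart_disease, is_smoker, age].
X = np.hstack([
    rng.integers(0, 2, size=(300, 2)).astype(float),
    rng.integers(30, 80, size=(300, 1)),
])
y = rng.integers(0, 3, size=300)  # risk classes 0, 1, 2, as in the abstract

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)  # k is an assumption; the abstract does not state it
knn.fit(X_train, y_train)
pred = knn.predict(X_test)
print(accuracy_score(y_test, pred))
print(confusion_matrix(y_test, pred))  # per-class errors, as analyzed in the abstract
```

The confusion matrix makes the per-class behavior visible, which is how the abstract diagnoses the model's weakness on class 1.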
Developed an expert system for analysis of covid-19 affected Mishra, Shashank; Yadav, Shivam; Aggarwal, Mukul; Sharma, Yashika; Muzayanah, Rini
Journal of Soft Computing Exploration Vol. 4 No. 1 (2023): March 2023
Publisher : SHM Publisher

DOI: 10.52465/joscex.v4i1.113

Abstract

An expert system solves problems within a specific area using a knowledge base. Prolog is a logic programming language that operates on a knowledge base and can be used effectively to develop an expert system. COVID-19 is a pandemic disease, and an expert system can be developed to diagnose it from its symptoms, which serve as the knowledge base in Prolog. Such an expert system enables a fast diagnosis process for COVID-19, which is important for preventing the spread of the virus. Here we developed an expert system using Prolog for diagnosis purposes. Like humans, these systems can improve over time as they gain more experience. Expert systems combine experience and expertise into a knowledge base that is then used by an inference or rules engine, a set of rules the software applies to particular scenarios. Prolog is well suited to intelligent systems for a few reasons: it can be viewed as a straightforward theorem prover or inference engine that derives conclusions from predefined rules, and with the help of its built-in search and backtracking techniques, simple expert systems can be created.
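
The paper's system is written in Prolog; to keep this page's examples in one language, here is a minimal Python sketch of the same rule-based idea. The symptom list and matching threshold are illustrative assumptions, not medical guidance or the authors' rule set:

```python
# Rule-based diagnosis sketch: a knowledge base of symptoms plus a rule
# that fires when enough reported symptoms match. In Prolog this would be
# facts plus a rule resolved by backtracking; here it is an explicit check.

KNOWLEDGE_BASE = {"fever", "dry_cough", "fatigue", "loss_of_taste", "shortness_of_breath"}

def diagnose(reported_symptoms, threshold=3):
    """Fire the covid-suspected rule when at least `threshold` known symptoms match."""
    matched = KNOWLEDGE_BASE & set(reported_symptoms)
    verdict = "covid-19 suspected" if len(matched) >= threshold else "unlikely"
    return verdict, sorted(matched)

print(diagnose(["fever", "dry_cough", "loss_of_taste"]))
```

In Prolog the same rule would be stated declaratively and resolved by the built-in search; the Python version trades that for an explicit set intersection.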
Optimization of support vector machine using information gain and adaboost to improve accuracy of chronic kidney disease diagnosis Listiana, Eka; Muzayanah, Rini; Muslim, Much Aziz; Sugiharti, Endang
Journal of Soft Computing Exploration Vol. 4 No. 3 (2023): September 2023
Publisher : SHM Publisher

DOI: 10.52465/joscex.v4i3.218

Abstract

Databases today are growing very rapidly, especially in the field of health. If this data is not processed properly it becomes a useless pile, so a data mining process is needed. One data mining method used to predict a decision is classification, and among classification methods the support vector machine algorithm can be used to diagnose chronic kidney disease. The purpose of this study is to determine the accuracy gained by applying information gain and AdaBoost to the support vector machine algorithm in diagnosing chronic kidney disease. Information gain is used to filter out irrelevant attributes, while AdaBoost is used as an ensemble method, commonly known as a classifier combination method. The data used are the chronic kidney disease (CKD) dataset obtained from the UCI machine learning repository. Experiments in MATLAB applying information gain and AdaBoost to the support vector machine algorithm with k-fold cross-validation (default k = 10) show a total accuracy increase of 0.50%, as follows: the support vector machine alone has an accuracy of 99.25%, applying AdaBoost to the support vector machine yields 99.50%, and applying both AdaBoost and information gain yields 99.75%.
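
The experiments were run in MATLAB; a rough scikit-learn analogue of the pipeline, using mutual information as the information-gain-style selector and synthetic stand-in data, might look like this (the feature counts and base-learner settings are assumptions, not the authors' configuration):

```python
# Sketch of the abstract's pipeline: information-gain-style feature selection,
# then AdaBoost over an SVM base learner, scored with 10-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Synthetic stand-in shaped roughly like the UCI CKD dataset (24 attributes).
X, y = make_classification(n_samples=400, n_features=24, n_informative=10, random_state=0)

pipe = make_pipeline(
    SelectKBest(mutual_info_classif, k=10),  # drop low-information attributes
    AdaBoostClassifier(                      # boosting over an SVM base learner
        estimator=SVC(kernel="linear", probability=True),  # `estimator=` needs scikit-learn >= 1.2
        n_estimators=10,
        random_state=0,
    ),
)
scores = cross_val_score(pipe, X, y, cv=10)  # k = 10, as in the abstract
print(f"mean accuracy: {scores.mean():.4f}")
```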
Comparison of gridsearchcv and bayesian hyperparameter optimization in random forest algorithm for diabetes prediction Muzayanah, Rini; Pertiwi, Dwika Ananda Agustina; Ali, Muazam; Muslim, Much Aziz
Journal of Soft Computing Exploration Vol. 5 No. 1 (2024): March 2024
Publisher : SHM Publisher

DOI: 10.52465/joscex.v5i1.308

Abstract

Diabetes Mellitus (DM) is a chronic disease whose complications have a significant impact on patients and the wider community. In its early stages, diabetes mellitus usually does not cause significant symptoms, but if it is detected too late and not handled properly, it can cause serious health problems. Early detection is therefore one of the solutions used. In this research, diabetes detection was carried out using Random Forest with GridSearchCV and Bayesian hyperparameter optimization. The research proceeded through literature study, model development using a Kaggle Notebook, model testing, and results analysis. This study aims to compare GridSearchCV and Bayesian hyperparameter optimization, then analyze the advantages and disadvantages of each when applied to diabetes prediction with the Random Forest algorithm. The research found that GridSearchCV and Bayesian hyperparameter optimization each have their own advantages and disadvantages. GridSearchCV excels in accuracy, 0.74, although it takes longer, 338.416 seconds. Bayesian optimization has an accuracy 0.01 lower than GridSearchCV, at 0.73, but takes less time, 177.085 seconds.
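
A small sketch of the comparison, using scikit-learn's GridSearchCV and scikit-optimize's BayesSearchCV (pip install scikit-optimize) on synthetic data shaped like a diabetes table; the search space, iteration count, and dataset are assumptions, not the paper's setup:

```python
# Compare grid search vs. Bayesian search for tuning a RandomForest,
# timing each run as the abstract does.
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from skopt import BayesSearchCV

X, y = make_classification(n_samples=768, n_features=8, random_state=0)  # Pima-like shape
param_grid = {"n_estimators": [100, 200, 300], "max_depth": [4, 8, 16]}

searches = [
    ("GridSearchCV", GridSearchCV(
        RandomForestClassifier(random_state=0), param_grid, cv=5)),
    ("Bayesian", BayesSearchCV(
        RandomForestClassifier(random_state=0), param_grid, n_iter=9, cv=5, random_state=0)),
]
for name, search in searches:
    t0 = time.perf_counter()
    search.fit(X, y)  # grid search tries every combination; Bayesian samples adaptively
    print(name, search.best_score_, f"{time.perf_counter() - t0:.1f}s")
```

The trade-off the abstract reports falls out of the design: grid search exhausts the space (more time, no combination missed), while Bayesian search spends its budget on promising regions (less time, possibly missing the best point).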
Application of the greedy algorithm to maximize advantages of cutting steel bars in the factory construction Muzayanah, Rini; Tama, Endi Adika
Journal of Student Research Exploration Vol. 1 No. 1: January 2023
Publisher : SHM Publisher

DOI: 10.52465/josre.v1i1.112

Abstract

Indonesia is one of the countries currently pushing hard on development to keep pace with ongoing global modernization. In the infrastructure development process, the developer enters into a contract with the contractor. This study aims to analyze the performance of the greedy algorithm in optimizing steel cutting for maximum profit to construction companies. The methods used include a literature study, program design, and program trials, where the algorithm used is a greedy one. From the results obtained, the greedy algorithm can provide optimal steel cutting solutions because it evaluates the available cutting options step by step and commits to each choice, so the program never needs to recalculate a step it has already taken.
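
One common greedy rule for this kind of cutting problem is to favor the cut with the highest price per unit length; the abstract does not state the authors' exact selection rule, so the criterion, lengths, and prices below are illustrative assumptions:

```python
# Greedy bar-cutting sketch: repeatedly take the piece with the highest
# price per unit length that still fits in the remaining bar.
def greedy_cut(bar_length, pieces):
    """pieces: list of (length, price) options. Returns (total_profit, cuts)."""
    # Rank cut options by price density, best first.
    ranked = sorted(pieces, key=lambda p: p[1] / p[0], reverse=True)
    profit, cuts = 0, []
    for length, price in ranked:
        while bar_length >= length:  # commit to the best piece as long as it fits
            bar_length -= length
            profit += price
            cuts.append(length)
    return profit, cuts

print(greedy_cut(12, [(5, 12), (3, 8), (2, 5)]))  # -> (32, [3, 3, 3, 3])
```

Each choice is final and never revisited, which matches the abstract's point that no step needs to be recalculated.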
Optimizing Customer Segmentation in Online Retail Transactions through the Implementation of the K-Means Clustering Algorithm Awaliyah, Desi Adrianti; Prasetiyo, Budi; Muzayanah, Rini; Lestari, Apri Dwi
Scientific Journal of Informatics Vol. 11 No. 2: May 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v11i2.6137

Abstract

Purpose: The main objective of this research is the optimal use of customer segmentation with the Recency, Frequency, and Monetary (RFM) approach so that companies can better understand the needs of each customer. By carrying out this segmentation, companies can communicate better and provide services tailored to each customer. Methods: The K-means algorithm is used as the main method for customer segmentation, applied to a dataset of online retail customers. The elbow method is used to help determine the best number of clusters for the model. Result: Based on the elbow method, 3 clusters is the most optimal choice for this case, so the K-means model is built with 3 clusters. The clustering produces customer groups with distinct characteristics in each cluster, and the analysis shows that quantity and unit price have a significant influence on online retail customer behavior. Novelty: This research reinforces the trend of using the K-means algorithm for customer segmentation on online retail datasets, which proved popular in journals from 2018 to 2022. It creates 3 new variables the model uses to understand the characteristics of customer transaction behavior, and it emphasizes the importance of exploratory data analysis before clustering and of the elbow method for determining the most appropriate number of clusters, providing a significant contribution to customer segmentation analysis.
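
The RFM-plus-elbow workflow can be sketched as follows; the RFM table here is randomly generated stand-in data, not the paper's online retail dataset:

```python
# Scale RFM features, use the elbow method (inertia vs. k) to pick the
# cluster count, then fit K-means and profile each segment.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
rfm = pd.DataFrame({
    "recency": rng.integers(1, 365, 200),    # days since last purchase
    "frequency": rng.integers(1, 50, 200),   # number of transactions
    "monetary": rng.gamma(2.0, 150.0, 200),  # total spend
})
X = StandardScaler().fit_transform(rfm)

# Elbow method: look for the bend in the inertia curve; the paper found k = 3.
for k in range(1, 7):
    print(k, round(KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_, 1))

rfm["cluster"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(rfm.groupby("cluster").mean())  # characterize each segment
```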
Comparative Study of Imbalanced Data Oversampling Techniques for Peer-to-Peer Landing Loan Prediction Muzayanah, Rini; Lestari, Apri Dwi; Jumanto, Jumanto; Prasetiyo, Budi; Pertiwi, Dwika Ananda Agustina; Muslim, Much Aziz
Scientific Journal of Informatics Vol. 11 No. 1: February 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v11i1.50274

Abstract

Purpose: Data imbalances that often occur in the classification of loan data on peer-to-peer lending platforms can cause algorithm performance to be less than optimal, lowering the resulting accuracy. To overcome this problem, appropriate resampling techniques are needed so that the classification algorithm can work optimally and deliver optimal accuracy. This research aims to find the right resampling technique to overcome the problem of data imbalance in loan data on peer-to-peer lending platforms. Methods: This study uses the XGBoost classification algorithm to evaluate and compare the resampling techniques. The techniques compared are SMOTE, ADASYN, Borderline-SMOTE, and Random Oversampling. Results: The highest training accuracy, 0.99988, was achieved by the XGBoost model combined with the Borderline-SMOTE resampling technique and by the XGBoost model combined with the SMOTE resampling technique. In accuracy testing, the highest score was achieved by the combination of the XGBoost model with the SMOTE resampling technique. Novelty: This research identifies the resampling technique most suitable to combine with the XGBoost classification algorithm to overcome unbalanced loan data on peer-to-peer lending platforms, so that the algorithm can work optimally and produce optimal accuracy.
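
The comparison loop can be sketched with imbalanced-learn and xgboost; the imbalanced dataset below is synthetic and stands in for the lending data:

```python
# Fit each oversampler on the training split only, train XGBoost on the
# balanced data, and score on the untouched test split.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from imblearn.over_sampling import SMOTE, ADASYN, BorderlineSMOTE, RandomOverSampler
from xgboost import XGBClassifier

# 90/10 class split mimics an imbalanced loan-default label.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for sampler in [SMOTE(random_state=0), ADASYN(random_state=0),
                BorderlineSMOTE(random_state=0), RandomOverSampler(random_state=0)]:
    X_bal, y_bal = sampler.fit_resample(X_tr, y_tr)  # oversample training data only
    model = XGBClassifier(eval_metric="logloss").fit(X_bal, y_bal)
    print(type(sampler).__name__, accuracy_score(y_te, model.predict(X_te)))
```

Resampling only the training split keeps the test accuracy honest: synthetic minority samples never leak into the evaluation data.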