Contact Name
-
Contact Email
rjiilkom@mail.unnes.ac.id
Phone
-
Journal Mail Official
rjiilkom@mail.unnes.ac.id
Editorial Address
D5 Building 2nd Floor, Campus Sekaran, Gunungpati, Semarang, Central Java
Location
Kota Semarang,
Jawa Tengah
INDONESIA
Recursive Journal of Informatics
ISSN: 2963-5551     EISSN: 2986-6588     DOI: https://doi.org/10.15294/rji
Core Subject: Science
Recursive Journal of Informatics is a journal that publishes scientific research papers related to Informatics. The scope of research ranges from theory and scientific applications to novel insights into related areas of knowledge.
Articles 26 Documents
Application of C4.5 Algorithm Using Synthetic Minority Oversampling Technique (SMOTE) and Particle Swarm Optimization (PSO) for Diabetes Prediction
Damayanti, Dela Rista; Purwinarko, Aji
Recursive Journal of Informatics Vol 2 No 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i1.64928

Abstract

Abstract. Diabetes is the fourth or fifth leading cause of death in most developed countries and an epidemic in many developing countries. Early detection can serve as a preventive measure by processing existing data through data mining with a classification process. Purpose: Investigate the efficacy of integrating the C4.5 algorithm with the Synthetic Minority Oversampling Technique (SMOTE) and Particle Swarm Optimization (PSO) to improve the accuracy of diabetes prediction models. By employing SMOTE, the study addresses the class imbalance inherent in diabetes datasets, which often contain significantly fewer positive cases (diabetes) than negative cases (non-diabetes). By incorporating PSO, the research seeks to optimize the decision tree construction process within the C4.5 algorithm, enhancing its ability to discern complex patterns and relationships within the data. Methods/Study design/approach: This study applies the C4.5 classification algorithm together with the synthetic minority oversampling technique (SMOTE) and particle swarm optimization (PSO) to overcome problems in the diabetes dataset, namely the Pima Indians Diabetes Database (PIDD). Result/Findings: The accuracy obtained by applying the C4.5 algorithm without preprocessing is 75.97%, while applying SMOTE with the C4.5 algorithm yields 80%. Applying the C4.5 algorithm with both SMOTE and PSO produces the highest accuracy at 82.5%, an increase of 6.53% over the classification results using the C4.5 algorithm alone. Novelty/Originality/Value: This research contributes novelty by proposing a hybrid approach that combines the C4.5 decision tree algorithm with two advanced techniques, SMOTE and PSO, for the prediction of diabetes. While previous studies have explored machine learning algorithms for diabetes prediction, few have examined the synergistic effects of integrating SMOTE and PSO with the C4.5 algorithm specifically.
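The pipeline described above can be illustrated with a minimal sketch (not the authors' code): scikit-learn's DecisionTreeClassifier with the entropy criterion stands in for C4.5, imbalanced-learn's SMOTE rebalances the training split, and a small hand-rolled PSO tunes two tree hyperparameters. The file name diabetes.csv and the Outcome column are assumptions based on the public Pima Indians Diabetes dataset.

```python
# Sketch only: C4.5-style tree + SMOTE + a tiny PSO over two tree hyperparameters.
import numpy as np
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("diabetes.csv")                       # assumed PIDD file
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_tr, y_tr)   # oversample minority class

def fitness(pos):
    """Cross-validated accuracy for (max_depth, min_samples_leaf)."""
    depth = int(round(float(pos[0])))
    leaf = int(round(float(pos[1])))
    clf = DecisionTreeClassifier(criterion="entropy", max_depth=depth,
                                 min_samples_leaf=leaf, random_state=42)
    return cross_val_score(clf, X_bal, y_bal, cv=5).mean()

rng = np.random.default_rng(0)
low, high = np.array([2, 1]), np.array([20, 20])       # search bounds
pos = rng.uniform(low, high, size=(10, 2))             # 10 particles
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(15):                                    # PSO iterations
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, low, high)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

best = DecisionTreeClassifier(criterion="entropy",
                              max_depth=int(round(float(gbest[0]))),
                              min_samples_leaf=int(round(float(gbest[1]))),
                              random_state=42).fit(X_bal, y_bal)
print("test accuracy:", best.score(X_te, y_te))
```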
Sentiment Analysis on Twitter Social Media Regarding Covid-19 Vaccination with Naive Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT)
Saputra, Angga Riski Dwi; Prasetiyo, Budi
Recursive Journal of Informatics Vol 2 No 2 (2024): September 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i2.67502

Abstract

Abstract. The Covid-19 vaccine is an important tool to stop the Covid-19 pandemic; however, public responses to it include both support and opposition. Purpose: These responses are conveyed by the public in many ways, one of which is through social media such as Twitter. Responses regarding the Covid-19 vaccination can be analyzed and categorized into positive, neutral, or negative sentiments. Methods: In this study, sentiment analysis of Covid-19 vaccination posts from Twitter was carried out using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. The data used are 29,447 public tweets in English about the Covid-19 vaccination. Result: Sentiment analysis begins with data preprocessing for normalization and cleaning before classification. Word vectorization was then performed with TF-IDF, and the data were classified using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. From the classification results, an accuracy of 73% was obtained for the Naïve Bayes Classifier (NBC) and 83% for the Bidirectional Encoder Representations from Transformers (BERT) algorithm. Novelty: A direct comparison between a classical model such as NBC and a modern deep learning model such as BERT offers new insights into the advantages and disadvantages of both approaches in processing Twitter data. Additionally, this study proposes temporal sentiment analysis, which allows evaluating changes in public sentiment regarding vaccination over time. Another innovation is the implementation of a hybrid approach to data cleansing that combines traditional methods with the natural language processing capabilities of BERT, which more effectively addresses typical Twitter data issues such as slang and spelling errors. Finally, this research also expands sentiment classification to a multi-label setting, identifying more specific sentiment categories such as trust, fear, or doubt, which provides a deeper understanding of public opinion.
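As a rough illustration of the two routes compared above (not the authors' code), the sketch below classifies a few toy tweets with TF-IDF plus multinomial Naive Bayes and then queries a pretrained BERT sentiment pipeline from Hugging Face; the study itself fine-tuned BERT on the 29,447 labelled tweets, which is not shown here. The toy tweets and labels are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from transformers import pipeline

# Toy stand-ins for the 29,447 preprocessed tweets and their sentiment labels
tweets = ["the vaccine rollout is going well", "grateful to be vaccinated today",
          "side effects made me feel awful", "i do not trust this vaccine"]
labels = ["positive", "positive", "negative", "negative"]

X_tr, X_te, y_tr, y_te = train_test_split(tweets, labels, test_size=0.5,
                                          stratify=labels, random_state=0)

# Route 1: TF-IDF vectorization + Naive Bayes Classifier (NBC)
vec = TfidfVectorizer(lowercase=True)
nbc = MultinomialNB().fit(vec.fit_transform(X_tr), y_tr)
print("NBC accuracy:", accuracy_score(y_te, nbc.predict(vec.transform(X_te))))

# Route 2: a pretrained BERT sentiment pipeline; the paper fine-tuned BERT on
# the labelled tweets instead, which needs a full training loop not shown here.
bert = pipeline("sentiment-analysis")   # downloads a default English model
print(bert(list(X_te)))
```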
Comparison of Naive Bayes Classifier and K-Nearest Neighbor Algorithms with Information Gain and Adaptive Boosting for Sentiment Analysis of Spotify App Reviews
Saputro, Meidika Bagus; Alamsyah, Alamsyah
Recursive Journal of Informatics Vol 2 No 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i1.68551

Abstract

Abstract. Technology is currently developing rapidly, and one consequence is that the volume of data in the world keeps growing. These large volumes of data can be used for many purposes in many fields. Entertainment is one of the fields that attracts many users, and Spotify is an example of an entertainment app distributed through the Google Play Store that provides online music streaming to its users. Because the app is distributed through the Google Play Store, the many user reviews of the app can be classified as positive, negative, or neutral. One way to classify user reviews is sentiment analysis. In this paper, the reviews are classified with the naïve Bayes classifier and k-nearest neighbors, which are compared after adding information gain as feature selection and adaptive boosting as a boosting algorithm to each classification algorithm. The classification result using the naïve Bayes classifier with information gain and adaptive boosting is 87.28%, while k-nearest neighbors with information gain and adaptive boosting achieves an accuracy of 80.35%. Purpose: To measure the accuracy of the naïve Bayes classifier and k-nearest neighbor algorithms with information gain and adaptive boosting, and to describe step by step how the sentiment analysis is carried out with the methods chosen in this study. Methods/Study design/approach: This study applied data preprocessing, lexicon-based labelling with TextBlob, normalization, word vectorization using TF-IDF, and classification with the naïve Bayes classifier and k-nearest neighbors, with information gain as feature selection and adaptive boosting as a boosting algorithm to raise the accuracy of the classification results. Result/Findings: The accuracy of the naïve Bayes classifier with information gain and adaptive boosting is 87.28%, while k-nearest neighbors with information gain and adaptive boosting reaches 80.35%. These results were obtained using a dataset of 60,000 reviews split into 80% training data and 20% testing data. Novelty/Originality/Value: Applying information gain as feature selection and adaptive boosting to the naïve Bayes classifier is shown to increase classification accuracy, but the same does not hold for k-nearest neighbors. Future research can therefore apply other classification algorithms or feature selection methods to obtain better results.
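A minimal sketch of the feature-selection and boosting setup described above, not the authors' code: scikit-learn's mutual_info_classif is used here as a stand-in for information gain, and AdaBoost is wrapped around multinomial Naive Bayes (scikit-learn's AdaBoost cannot boost k-nearest neighbors directly because KNN does not accept sample weights). The toy reviews and labels are placeholders for the 60,000-review dataset.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Placeholder review texts and lexicon-derived labels
reviews = ["love this app", "keeps crashing", "great playlists", "too many ads"]
labels = ["positive", "negative", "positive", "negative"]

model = Pipeline([
    ("tfidf", TfidfVectorizer()),                        # word vectorization
    ("ig", SelectKBest(mutual_info_classif, k=5)),       # keep the top-k features
    # scikit-learn >= 1.2 uses `estimator=`; older releases use `base_estimator=`
    ("ada_nb", AdaBoostClassifier(estimator=MultinomialNB(), n_estimators=50)),
])
model.fit(reviews, labels)
print(model.predict(["the ads ruin everything"]))
```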
Fruit Freshness Detection Using Android-Based Transfer Learning MobileNetV2
Muttaqin, Irfan Fajar; Arifudin, Riza
Recursive Journal of Informatics Vol 2 No 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i1.70845

Abstract

Abstract. Fruit is an important source of nutrition for humans. Freshness is one of the most important factors in selecting fruit that is suitable for consumption and in determining its market price, so machine-based detection of fruit freshness is very useful; this study takes apples, bananas, and oranges as samples. The machine learning algorithm used is MobileNetV2 with transfer learning. MobileNetV2 introduces many new ideas aimed at reducing the number of parameters so that it runs more efficiently on mobile devices while achieving high classification accuracy. Transfer learning is used so that the model does not need to be trained from scratch: several layers of a previously trained MobileNetV2 are reused and then retrained for a different purpose to improve accuracy. The resulting models are then embedded in an application built with Android Studio, and the software is tested through black box testing. Purpose: The purpose of this research is to design a machine learning model to detect fruit freshness and then deploy it in an Android smartphone application. Methods/Study design/approach: The algorithm used in this study is MobileNetV2 with transfer learning. The models that have been created are embedded in the application using Android Studio. Result/Findings: Training with MobileNetV2 transfer learning obtained an accuracy of 99.62% and a loss of 0.34%. Testing the application with the black box method showed that improvements to the application and the machine learning model are still required for it to run optimally. Novelty/Originality/Value: Machine learning models created using MobileNetV2 transfer learning are deployed in an Android application so that they can be used by the public.
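The transfer-learning and deployment steps described above might look roughly like the following sketch (not the authors' code); the dataset directory, image size, and the six fresh/rotten classes are assumptions, and the TensorFlow Lite file is the artifact an Android app would load.

```python
# Sketch: MobileNetV2 transfer learning + export to TensorFlow Lite for Android.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "fruit_dataset/train", image_size=(224, 224), batch_size=32)   # assumed path

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                                   # reuse pretrained features

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),   # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(6, activation="softmax"),      # assumed 6 freshness classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)

# Convert for the Android app (loaded via TensorFlow Lite in Android Studio)
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()
with open("fruit_freshness.tflite", "wb") as f:
    f.write(tflite_model)
```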
Hyperparameter Tuning of Long Short-Term Memory Model for Clickbait Classification in News Headlines
Satriawan, Grace Yudha; Prasetiyo, Budi
Recursive Journal of Informatics Vol 2 No 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i1.71831

Abstract

Abstract. The information available on the internet nowadays is diverse and moves very quickly. Information is becoming easier for the general public to obtain through the numerous online media outlets, including news portals that provide up-to-date information. Many news portals earn revenue from pay-per-click advertising, which encourages article writers to use clickbait techniques to attract visitors. However, the negative effects of clickbait include a decline in journalism quality and the spread of hoaxes. This problem can be mitigated by using text classification to detect clickbait in news titles. One method that can be used for text classification is a neural network. Artificial neural networks use algorithms that can independently adjust input coefficient weights, which makes them highly effective for modeling non-linear statistical data. Neural network algorithms, especially the Long Short-Term Memory (LSTM), have been widely used in various natural language processing tasks with satisfying results, including text classification. To improve the performance of a neural network model, its hyperparameters can be adjusted. Hyperparameters are parameters that cannot be learned from data and must be defined before the training process. In this research, the Long Short-Term Memory (LSTM) model was used for clickbait classification in news titles. Sixteen neural network models were trained with a different hyperparameter configuration for each model, and hyperparameter tuning was carried out using the random search algorithm. The dataset used was the CLICK-ID dataset published by William & Sari, 2020 [1], with a total of 15,000 annotated items. The results show that the developed LSTM model has a validation accuracy of 0.8030, higher than William & Sari's result, and a validation loss of 0.4876. Using this model, the researchers were able to classify clickbait in news titles with fairly good accuracy. Purpose: The study develops and evaluates an LSTM model with hyperparameter tuning for clickbait classification on news headlines. It also compares the performance of a simple LSTM and a bidirectional LSTM for this task. Methods: This study uses the CLICK-ID dataset and applies different text preprocessing techniques. The dataset was then used to build and train 16 LSTM models with different hyperparameters, which were evaluated using validation accuracy and loss. Random search was used for hyperparameter tuning. Result: The best model for clickbait classification on news headlines is a bidirectional LSTM with one layer, 64 units, a 0.2 dropout rate, and a 0.001 learning rate. This model achieves a validation accuracy of 0.8030 and a validation loss of 0.4876. The results also show that hyperparameter tuning using random search can improve the performance of the LSTM models by finding the optimal values for the hyperparameters. Novelty: This study compares and analyzes different text preprocessing methods and different model configurations to find the best model for clickbait classification on news headlines, and uses hyperparameter tuning to find the optimal values for the hyperparameters.
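A minimal sketch of random-search tuning for this kind of model, not the authors' code, using KerasTuner; the vocabulary size, the exact search space, and the commented-out training call on X_train/y_train are assumed placeholders for the preprocessed CLICK-ID headlines.

```python
import keras_tuner as kt
import tensorflow as tf

VOCAB_SIZE = 10000                      # assumed tokenizer vocabulary size

def build_model(hp):
    units = hp.Int("units", min_value=32, max_value=128, step=32)
    dropout = hp.Float("dropout", 0.1, 0.5, step=0.1)
    lr = hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])
    lstm = tf.keras.layers.LSTM(units, dropout=dropout)

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, 64),
        tf.keras.layers.Bidirectional(lstm) if hp.Boolean("bidirectional") else lstm,
        tf.keras.layers.Dense(1, activation="sigmoid"),   # clickbait vs. not clickbait
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy",
                        max_trials=16, project_name="clickbait_lstm")
# X_train, y_train: padded token sequences and 0/1 labels from CLICK-ID
# tuner.search(X_train, y_train, validation_split=0.2, epochs=5)
# best_model = tuner.get_best_models(num_models=1)[0]
```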
Hyperparameter Optimization Using Hyperband in Convolutional Neural Network for Image Classification of Indonesian Snacks
Asyrofiyyah, Nuril; Sugiharti, Endang
Recursive Journal of Informatics Vol 2 No 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i1.72720

Abstract

Abstract. Indonesia is known for its traditional food both domestically and abroad, and several cakes are among its favorite traditional foods. Of the many types of cakes that exist, humans can easily recognize them by sight, but computer vision requires special techniques to identify image objects as types of cakes. Therefore, to recognize images of cakes as Indonesian specialties, a deep learning technique, the Convolutional Neural Network (CNN), can be used. Purpose: This study aims to find out how the Convolutional Neural Network (CNN) works when Hyperband hyperparameter optimization is applied in the classification process, and to determine the accuracy when Hyperband is used to select the optimal hyperparameters for classifying images of Indonesian snacks. Methods/Study design/approach: This study optimizes the hyperparameters of a Convolutional Neural Network (CNN) using Hyperband on an Indonesian cake dataset. The dataset comprises 1845 images of Indonesian snacks in 8 classes, consisting of 1523 training images, 162 validation images, and 160 testing images; that is, the dataset is split into 82% training data, 9% validation data, and 9% testing data. Result/Findings: The best hyperparameter values produced are 480 for the number of neurons in the second dense layer and 0.0001 for the learning rate. The proposed method achieved a training accuracy of 87.53%, a validation accuracy of 66.8%, and a testing accuracy of 79.37%, obtained from model training over 50 epochs. Novelty/Originality/Value: Previous research focused on the application and development of algorithms for classifying Indonesian snacks. Optimizing the hyperparameters of a Convolutional Neural Network (CNN) with Hyperband can therefore be an alternative for selecting the optimal architecture and hyperparameters.
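A minimal sketch of Hyperband tuning for a small CNN of this kind, not the authors' code; the architecture, image size, and dataset paths are assumptions, while the tuned dense-unit count and learning rate follow the hyperparameters mentioned in the abstract.

```python
import keras_tuner as kt
import tensorflow as tf

def build_cnn(hp):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(128, 128, 3)),            # assumed image size
        tf.keras.layers.Rescaling(1.0 / 255),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(hp.Int("dense_units", 64, 512, step=32),
                              activation="relu"),
        tf.keras.layers.Dense(8, activation="softmax"),  # 8 snack classes
    ])
    lr = hp.Choice("learning_rate", [1e-3, 1e-4, 1e-5])
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.Hyperband(build_cnn, objective="val_accuracy",
                     max_epochs=50, project_name="snack_cnn")
# train_ds = tf.keras.utils.image_dataset_from_directory("snacks/train", image_size=(128, 128))
# val_ds = tf.keras.utils.image_dataset_from_directory("snacks/val", image_size=(128, 128))
# tuner.search(train_ds, validation_data=val_ds)
```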
Optimizing Random Forest for Predicting Thoracic Surgery Success in Lung Cancer Using Recursive Feature Elimination and GridSearchCV
Putra, Deonisius Germandy Cahaya; Putra, Anggyi Trisnawan
Recursive Journal of Informatics Vol 2 No 2 (2024): September 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i2.73154

Abstract

Abstract. Lung cancer is one of the deadliest forms of cancer, claiming numerous lives annually. Thoracic surgery is a strategy to manage lung cancer patients; however, it poses high risks, including potential nerve damage and fatal complications leading to mortality. Predicting the success rate of thoracic surgery for lung cancer patients can be accomplished using data mining techniques based on classification principles; medical data mining employs mathematical, statistical, and computational methods. In this study, thoracic surgery success is predicted with the random forest algorithm, using recursive feature elimination for feature selection. The feature selection process yields the top 8 features: 'DGN', 'PRE4', 'PRE5', 'PRE6', 'PRE10', 'PRE14', 'PRE30', and 'AGE'. Hyperparameter tuning using GridSearchCV is then applied to enhance classification accuracy. This method achieves a predictive accuracy of 91.41%. Purpose: The study aims to develop and evaluate a Random Forest model with Recursive Feature Elimination feature selection and GridSearchCV hyperparameter tuning for predicting the thoracic surgery success rate. Methods: This study uses the thoracic surgery dataset and applies various data preprocessing techniques. The dataset is then classified with the Random Forest algorithm, using Recursive Feature Elimination to obtain the best features and GridSearchCV for hyperparameter tuning. Result: The Random Forest algorithm with Recursive Feature Elimination and GridSearchCV hyperparameter tuning reached an accuracy of 91.41%, obtained with the following parameter values: bootstrap set to false, criterion set to gini, n_estimators equal to 100, max_depth set to none, min_samples_split equal to 4, min_samples_leaf equal to 1, max_features set to auto, n_jobs set to -1, and verbose set to 2, with 10-fold cross validation. Novelty: This study compares and analyzes various dataset preprocessing methods and model configurations to find the best model for predicting the success rate of thoracic surgery. It also employs feature selection to choose the best features for the classification process, as well as hyperparameter tuning to achieve optimal accuracy and discover the optimal hyperparameter values.
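A minimal sketch of the RFE-plus-GridSearchCV setup described above, not the authors' code; the CSV file name and the Risk1Yr label column are assumptions based on the public UCI Thoracic Surgery dataset, categorical features are assumed to be already encoded, and the deprecated max_features="auto" option is omitted.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV

df = pd.read_csv("thoracic_surgery.csv")            # assumed preprocessed/encoded data
X, y = df.drop(columns="Risk1Yr"), df["Risk1Yr"]

# Keep the 8 most informative features via recursive feature elimination
rfe = RFE(estimator=RandomForestClassifier(random_state=42), n_features_to_select=8)
X_sel = rfe.fit_transform(X, y)
print("selected features:", list(X.columns[rfe.support_]))

param_grid = {
    "n_estimators": [100, 200],
    "criterion": ["gini", "entropy"],
    "bootstrap": [True, False],
    "min_samples_split": [2, 4],
    "min_samples_leaf": [1, 2],
}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=10, n_jobs=-1)   # 10-fold cross validation
search.fit(X_sel, y)
print(search.best_params_, search.best_score_)
```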
Optimization of the Convolutional Neural Network Method Using Fine-Tuning for Image Classification of Eye Disease
Wulandari, Vivi; Putra, Anggyi Trisnawan
Recursive Journal of Informatics Vol 2 No 1 (2024): March 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i1.73625

Abstract

Abstract. The eye is one of the most important organs of the human body, functioning as the sense of sight. Most people wish they had healthy eyes so they could see the world around them clearly; however, some people experience eye health problems, and eye diseases range from mild to severe. With advances in technology, artificial intelligence, in particular deep learning, can be used to classify eye diseases accurately. Therefore, this study uses the Convolutional Neural Network (CNN) algorithm to classify eye diseases, with the VGG16 architecture as the base model combined with fine-tuning as an optimization to improve accuracy.
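A minimal sketch of VGG16 transfer learning with fine-tuning as described above, not the authors' code: a new classifier head is trained on the frozen base first, then the last convolutional block is unfrozen and retrained at a low learning rate. The class count and the commented-out training calls on train_ds/val_ds are placeholders.

```python
import tensorflow as tf

NUM_CLASSES = 4                       # assumed number of eye-disease classes

base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False                # stage 1: freeze the convolutional base

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)

# Stage 2 (fine-tuning): unfreeze only the last block ("block5") and retrain slowly
base.trainable = True
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```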
Implementation of Raita Algorithm in Manado-Indonesia Translation Application with Text Suggestion Using Levenshtein Distance Algorithm
Sekartaji, Novanka Agnes; Arifudin, Riza
Recursive Journal of Informatics Vol 2 No 2 (2024): September 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i2.73651

Abstract

Abstract. Manado City is a multidimensional and multicultural city with assets considered highly promising for development into tourism attractions. The tourism assets currently being developed by the Manado City government are cultural tourism, as they hold charm and allure for tourists. Hence, a communication tool in the form of a translation application is needed to facilitate communication between visiting tourists and the native community of North Sulawesi, as well as newcomers who intend to reside there, given that the Manado language serves as the primary means of communication within the community. This research combines the Raita algorithm and the Levenshtein distance algorithm, along with the confusion matrix method to calculate the accuracy of translation results produced by the Levenshtein distance algorithm with a text suggestion feature. The research begins by collecting a dataset of Manado vocabulary and its Indonesian translations, sourced from literature studies and respondents from North Sulawesi and checked by a validator to prevent translation errors. The next stage preprocesses the dataset: the entire content is converted to lowercase using case folding, and spaces at the start and end of texts are removed with the trim function. Both algorithms are then implemented, with the Raita algorithm used for translation and the Levenshtein distance algorithm providing text suggestions for typing errors during the translation process. The accuracy derived from the confusion matrix calculations during the translation of 100 vocabulary words, accounting for typing errors, indicates that the Levenshtein distance algorithm can translate vocabulary accurately and correctly even in the presence of typing errors, with a high accuracy rate of 94.17%. Purpose: To determine how the Levenshtein distance and Raita algorithms are implemented in the Manado-Indonesian translation application, as well as the resulting accuracy level. Methods/Study design/approach: In this study, a combination of the Raita and Levenshtein distance algorithms is used in the translation application, along with the confusion matrix method to calculate accuracy. Result/Findings: The accuracy achieved in the translation process using text suggestions from the Levenshtein distance algorithm is 94.17%. Novelty/Originality/Value: This research demonstrates that the combination of the Raita and Levenshtein distance algorithms yields optimal results in the vocabulary translation process and provides accurate outcomes from the use of effective text suggestions, as nearly all the data used was successfully translated by the system even in the presence of typographical errors.
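The two algorithms combined above can be sketched directly; the following is an illustration, not the authors' application code, and the three-entry dictionary is a hypothetical placeholder for the validated Manado-Indonesian vocabulary.

```python
def raita_search(text, pattern):
    """Return the index of `pattern` in `text`, or -1 (Raita variant of Horspool)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return -1
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}   # bad-character table
    last, first, middle = pattern[-1], pattern[0], pattern[m // 2]
    i = 0
    while i <= n - m:
        if (text[i + m - 1] == last and text[i] == first
                and text[i + m // 2] == middle and text[i:i + m] == pattern):
            return i
        i += shift.get(text[i + m - 1], m)                        # Horspool-style shift
    return -1

def levenshtein(a, b):
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

dictionary = {"kita": "saya", "ngana": "kamu", "torang": "kami"}  # Manado -> Indonesian

def translate(word):
    word = word.strip().lower()                                   # trim + case folding
    if any(raita_search(entry, word) == 0 and len(entry) == len(word)
           for entry in dictionary):
        return dictionary[word]
    suggestion = min(dictionary, key=lambda entry: levenshtein(word, entry))
    return f"did you mean '{suggestion}'? -> {dictionary[suggestion]}"

print(translate("ngana"))    # exact match
print(translate("ngan"))     # typo handled by the Levenshtein suggester
```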
Development of Digital Forensic Framework for Anti-Forensic and Profiling Using Open Source Intelligence in Cyber Crime Investigation
Hakim, Muhamad Faishol; Alamsyah, Alamsyah
Recursive Journal of Informatics Vol 2 No 2 (2024): September 2024
Publisher : Universitas Negeri Semarang

DOI: 10.15294/rji.v2i2.73731

Abstract

Abstract. Cybercrime increases every year, and it increasingly exploits mobile devices such as smartphones, so a discipline that studies and handles cybercrime activities is needed. Digital forensics is one such discipline, and one of its branches is mobile forensics, which studies forensic processes on mobile devices. However, cybercriminals also apply various techniques, known as anti-forensics, to thwart the forensic investigation process. Purpose: A process or framework is needed as a reference for handling cybercrime cases in the forensic process; this research modifies the digital forensic investigation process for that purpose. Methods/Study design/approach: The stages of the digital forensic investigation consist of preparation, preservation, acquisition, examination, analysis, reporting, and presentation. Open Source Intelligence (OSINT) and toolset centralization are added at the analysis stage to handle anti-forensics and to add information to the digital evidence obtained in the previous stages. Testing with scenario data yields additional information extracted from the recovered files as well as information related to user names. Result/Findings: The result is a digital forensic phase that focuses on anti-forensic identification in media files and utilizes OSINT to profile crime suspects based on the evidence collected in the digital forensic investigation phase. Novelty/Originality/Value: Three new findings in the form of string data, one of which is a link, and 7 new findings in the form of usernames were identified that were not found using standard digital forensic tools. Relative to the 408 initial data points, the 10 new findings represent an increase of 2.45%.
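As one concrete example of the kind of anti-forensic check on media files mentioned above (not the authors' toolset), the sketch below flags files whose magic bytes do not match their extension, a common sign of evidence disguised by renaming; the signature list and the scanned directory are illustrative assumptions.

```python
from pathlib import Path

# Well-known file signatures for a few media types
SIGNATURES = {
    ".jpg": [b"\xff\xd8\xff"],
    ".jpeg": [b"\xff\xd8\xff"],
    ".png": [b"\x89PNG\r\n\x1a\n"],
    ".gif": [b"GIF87a", b"GIF89a"],
    ".mp4": [b"ftyp"],            # appears at offset 4 in MP4 containers
}

def suspicious_files(evidence_dir):
    """Yield files whose content signature does not match their extension."""
    for path in Path(evidence_dir).rglob("*"):
        expected = SIGNATURES.get(path.suffix.lower())
        if expected is None or not path.is_file():
            continue
        with path.open("rb") as fh:
            header = fh.read(16)
        if not any(sig in header for sig in expected):
            yield path

for f in suspicious_files("evidence/media"):      # assumed acquisition output folder
    print("possible anti-forensic renaming:", f)
```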

Page 1 of 3 | Total Records: 26