Effects of using wordnet and spelling checker on classification methods in sentiment analysis for datasets using Bahasa
Andika, Rizky; Suharjito, Suharjito
Indonesian Journal of Electrical Engineering and Computer Science Vol 25, No 3: March 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v25.i3.pp1662-1671

Abstract

Sentiment analysis is a system for recognizing and extracting opinions in documents. It has two weaknesses. First, preprocessing could not recognize slang words, so important words that should have been recognized went unrecognized. Second, the feature extraction process recognized words only by their surface form and could not recognize similar words. In this paper, we proposed a spelling checker and WordNet to address these weaknesses. We also used k-nearest neighbor (KNN), Naïve Bayes, and decision tree as methods to classify the text. The purpose of this research was, first, to determine the effects of using WordNet and a spelling checker in sentiment analysis and, second, to improve the data processing in existing sentiment analysis. The dataset used in the research was a list of tweets in Bahasa. The results showed that WordNet and the spelling checker decreased the values of false positives, false negatives, and true negatives in the confusion matrix. They increased the accuracy of KNN from 43% to 72%, Naïve Bayes from 41% to 71%, and decision tree from 47% to 75%.
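The two preprocessing steps named in this abstract (mapping slang to standard spellings, then mapping similar words to one representative) can be sketched as below. This is a toy illustration, not the authors' implementation: the `SLANG` and `SYNONYMS` dictionaries and the `normalize` function are hypothetical stand-ins for a real spelling checker and a real WordNet lookup.

```python
# Hypothetical toy lexicons; a real system would use a spelling checker
# and a WordNet resource for Bahasa instead of hand-written dictionaries.
SLANG = {"gak": "tidak", "bgs": "bagus"}   # slang -> standard spelling
SYNONYMS = {"baik": "bagus"}               # similar word -> representative


def normalize(text: str) -> list[str]:
    """Lowercase and tokenize, fix slang, then collapse similar words."""
    tokens = text.lower().split()
    tokens = [SLANG.get(tok, tok) for tok in tokens]      # spelling step
    return [SYNONYMS.get(tok, tok) for tok in tokens]     # synonym step


if __name__ == "__main__":
    print(normalize("Filmnya bgs"))    # slang spelling is repaired
    print(normalize("Filmnya baik"))   # synonym maps to the same token
```

After normalization, both example sentences yield the same tokens, so a bag-of-words classifier such as KNN or Naïve Bayes would see them as identical feature vectors.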
Enhancing Predictive Accuracy for Differentiated Thyroid Cancer (DTC) Recurrence Through Advanced Data Mining Techniques
Sibarani, Imelda Juliana Br.; S, Katherina Meylda Loy; Suharjito, Suharjito
TIN: Terapan Informatika Nusantara Vol 5 No 1 (2024): June 2024
Publisher : Forum Kerjasama Pendidikan Tinggi (FKPT)

DOI: 10.47065/tin.v5i1.5237

Abstract

Thyroid cancer is becoming more common, and its 20% recurrence rate, with almost half of recurrences discovered more than five years after surgery, highlights how difficult it is to distinguish between a true disease relapse and chronic disease brought on by insufficient initial treatment. This ambiguity underlines the complicated dynamics that drive mortality rates in patients with thyroid cancer. The purpose of this study is to refine recurrence predictions in order to control Differentiated Thyroid Cancer recurrence and minimize its risk. The dataset was obtained by monitoring a total of 383 patients with 17 attributes. This study adopted a data mining modelling strategy to evaluate performance, classification accuracy, and cluster distribution, using the Orange data mining software. Exploratory Data Analysis was conducted to pinpoint the most significant contributors. Subsequently, a variety of supervised techniques were applied to assess the precision of both single and ensemble models in classification. For cluster determination, we implemented several unsupervised learning techniques, including k-means, hierarchical, and Louvain clustering. The results show that the ensemble stacking algorithm demonstrated superior performance and classification accuracy, achieving a score of 0.971. The analysis of clustering methods, notably k-means and hierarchical clustering, suggested that the dataset could be segmented into two distinct clusters. The factors most strongly correlated with thyroid cancer recurrence were 'Response', 'Risk', 'Adenopathy', and 'N'. The refinement of the diagnostic model, through the identification of accurate models and key factors, enhances the prediction of Differentiated Thyroid Cancer recurrence.
Flood Prediction based on Weather Parameters in Jakarta using K-Nearest Neighbours Algorithm
Lumbantobing, Hariman; Ratna Avianti, Irma; Harisapto, Kukuh; Suharjito, Suharjito
Eduvest - Journal of Universal Studies Vol. 4 No. 6 (2024): Journal Eduvest - Journal of Universal Studies
Publisher : Green Publisher Indonesia

DOI: 10.59188/eduvest.v4i6.1339

Abstract

Flooding is a difficult and common hazard in Indonesia, particularly in Jakarta during the rainy season. Floods have been the subject of several endeavours, ranging from discovering their causes to reducing their impacts. Floods cause significant damage to infrastructure, the social economy, and human lives. The government continues to create reliable flood risk maps and plans for long-term flood risk management. According to data from Jakarta Flood Monitoring, 12 sub-districts and 26 urban villages were hit by floods each year between 2016 and 2020, with an average flood duration of nearly 2 days. The flood tendency in Jakarta decreased from 2018 to 2019 but increased in 2020. Floods are caused by a variety of factors, including weather, geography, and human actions such as deforestation. Strong flood prediction is required for disaster management; however, this can be difficult owing to changing weather conditions. This study focuses on flood prediction in Jakarta based on weather parameters, using machine learning techniques to provide accurate and real-time predictions. K-Nearest Neighbours (KNN) is the algorithm employed to forecast the areas that will be affected by floods. Across values of k from 2 to 9, the best performance was obtained at k=7, with 92.25% accuracy, 88.89% precision, 92.25% recall, and an F1-measure of 89.52%. The integration of machine learning algorithms that encompass multiple weather variables provides significant utility for comprehensive flood prediction and early warning systems in flood disaster mitigation.
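The KNN classification described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and majority voting, which the abstract does not specify; the feature names (rainfall, humidity), the `TRAIN` samples, and the `knn_predict` function are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical samples: (rainfall_mm, humidity_pct) -> label
TRAIN = [
    ((5.0, 60.0), "no_flood"),
    ((10.0, 65.0), "no_flood"),
    ((12.0, 70.0), "no_flood"),
    ((80.0, 90.0), "flood"),
    ((95.0, 92.0), "flood"),
    ((110.0, 95.0), "flood"),
]

if __name__ == "__main__":
    # Try a few neighbourhood sizes for one heavy-rain query point.
    for k in (1, 3, 5):
        print(k, knn_predict(TRAIN, (100.0, 93.0), k))
```

Sweeping k over a range, as the study does for k=2 to k=9, amounts to repeating this prediction on held-out data for each k and comparing the resulting metrics. Even values of k can produce tied votes; in this sketch a tie goes to whichever tied label belongs to the nearer neighbour.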
Multi-classification Sentiment Analysis using Convolution Neural Network and Long-Short Term Memory with Attention Model
Christianto, Yohanes; Suharjito, Suharjito
InComTech : Jurnal Telekomunikasi dan Komputer Vol 13, No 3 (2023)
Publisher : Department of Electrical Engineering

DOI: 10.22441/incomtech.v13i3.13812

Abstract

Multi-classification sentiment analysis of sentences in Bahasa is a challenging process because of slang and local language mixed with many English words. Current state-of-the-art methods rely on feature extraction using unsupervised treatment. Research to address this problem was conducted using LSTM and CNN, which are capable of learning complex features from the lower level. The objective of this study was to investigate the results of sentiment analysis based on aspect extraction carried out with attention models and several deep learning methods. Research data was collected from Zomato comments in Bahasa about Indonesian restaurants. The data was annotated manually based on four subjects, namely place, taste, location, and service. The results of this study showed that Bi-LSTM with an attention model and CNN without an attention model had the best performance compared to other methods, while CNN without an attention model showed the best accuracy for sentiment analysis using deep learning.
Analysis of the Reduction in Breakdown Frequency of the KOMATSU WA800-3 Caused by the Fuel System by Applying the FMEA Method
Andriyansyah, Andriyansyah; Suharjito, Suharjito; Rimantho, Dino
Innovative: Journal Of Social Science Research Vol. 4 No. 3 (2024): Innovative: Journal Of Social Science Research
Publisher : Universitas Pahlawan Tuanku Tambusai

DOI: 10.31004/innovative.v4i3.10793

Abstract

The wheel loader is one type of heavy equipment owned by PT XYZ, and it is the company's highest-priority unit. The research was carried out on the Komatsu WA800-3 unit, whose most frequent problems occur in the fuel system. The research aimed to identify the factors behind fuel system breakdowns using the Fishbone Diagram method. The magnitude of potential failures and their effects was analyzed using Failure Mode and Effect Analysis (FMEA), and the main factors recommended for improvement were determined using the 5W + 1H method. The results show that the highest RPN value, 360, was found for the hollow radiator guard cover. In addition, the frequency of breakdowns caused by the fuel system decreased by 90%; in other words, unit performance improved. The company is advised to implement the entire improvement plan that has been drawn up and to carry out the adopted improvement activities consistently.
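In FMEA, the Risk Priority Number (RPN) is conventionally the product of three ratings: severity, occurrence, and detection, each commonly scored from 1 to 10. The sketch below shows that calculation and the ranking step. The failure modes and their ratings are hypothetical and are not the values from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int     # impact of the failure, 1 (minor) to 10 (critical)
    occurrence: int   # likelihood of the failure, 1 (rare) to 10 (frequent)
    detection: int    # difficulty of detection, 1 (easy) to 10 (hard)

    def __post_init__(self):
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk Priority Number = severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Order failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical failure modes, for illustration only.
MODES = [
    FailureMode("leaking fuel hose", severity=8, occurrence=3, detection=5),
    FailureMode("clogged fuel filter", severity=7, occurrence=6, detection=4),
    FailureMode("worn injector", severity=6, occurrence=4, detection=3),
]

if __name__ == "__main__":
    for mode in rank_by_rpn(MODES):
        print(f"{mode.name}: RPN {mode.rpn}")
```

The failure mode at the top of the ranking is the one that improvement actions, such as those planned with 5W + 1H, would normally address first.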
Identification and Evaluation of Supply Chain Management Risks for the Corn Commodity Using a Fuzzy Logic Approach
Suharjito, Suharjito; Marimin, Marimin; Machfud, Machfud; Haryanto, Bambang; Sukardi, Sukardi
Jurnal Manajemen dan Organisasi Vol. 1 No. 2 (2010): Jurnal Manajemen dan Organisasi
Publisher : IPB University

DOI: 10.29244/jmo.v1i2.14157

Abstract

The national feed industry requires a continuous supply of corn as raw material in a definite quantity throughout the year, yet national maize production is discontinuous and fluctuating. Supply planning and storage methods are therefore needed to avoid the risk of a maize supply crisis in the form of food shortages or rising feed prices. One method is to apply the concept of supply chain risk management. The high level of dependence and network complexity makes the supply chain of agricultural products more vulnerable to disruption. Supply chain disturbances can arise internally (in the relationship between the organization and its network of suppliers) and externally (between the supplier network and its environment). It is therefore necessary to identify and evaluate supply chain risks in order to avoid continuing problems that can occur at any point in the supply chain network. The objective of this study is to describe a model for identifying and evaluating maize supply chain risk. The model can be used to identify the dominant risk factors and variables at each level of the supply chain so that appropriate actions can be recommended to anticipate them. Risk identification is conducted with a fuzzy AHP approach, and risk evaluation is done using fuzzy logic, with input data in the form of opinions from several maize supply chain experts.
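As a rough illustration of the fuzzy logic evaluation step, the sketch below converts a crisp expert risk score into degrees of membership in linguistic risk levels using triangular membership functions. It does not reproduce the paper's model or its fuzzy AHP procedure; the 0 to 10 scale, the `LEVELS` breakpoints, and the function names are all assumptions.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at a and c, rising to 1 at the peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic risk levels on an assumed 0-10 scale.
LEVELS = {
    "low": (0.0, 2.0, 5.0),
    "medium": (2.0, 5.0, 8.0),
    "high": (5.0, 8.0, 10.0),
}


def fuzzify(score: float) -> dict[str, float]:
    """Degree of membership of a crisp score in each risk level."""
    return {name: triangular(score, *abc) for name, abc in LEVELS.items()}


def fuzzify_expert_scores(scores: list[float]) -> dict[str, float]:
    """Average several expert scores, then fuzzify the mean."""
    return fuzzify(sum(scores) / len(scores))


if __name__ == "__main__":
    print(fuzzify_expert_scores([6.0, 7.0]))
```

A score between two peaks belongs partly to both levels, which is how fuzzy evaluation avoids forcing expert opinion into a single hard category.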
Predictive Maintenance Using Linear Regression
Prayogo, Rudy Hartono; Kurnianto, Benedict Ariel; Nababan, Nidia Pialina; Suharjito, Suharjito
Syntax Literate Jurnal Ilmiah Indonesia
Publisher : Syntax Corporation

DOI: 10.36418/syntax-literate.v9i3.15366

Abstract

Machine damage often occurs in many industries, especially manufacturing, and causes large losses for companies. It is influenced by various factors such as engine temperatures that are too high, engine rotation that is too fast, poor engine torque values, and so on. This research aims to provide predictive analysis results on engine conditions that have the potential to lead to damage. To achieve this goal, the research carries out predictive maintenance analysis using linear regression. Two linear regression models are built: the first involves PCA preprocessing and the second is built without PCA. This research uses the predictive maintenance dataset from the conference (Matzka, 2020). The MSE, RMSE, MAE, and R2 values of the two methods are the same, namely 0.909, 0.953, 0.806, and 0.772 respectively. Based on this research, it is concluded that performing PCA does not significantly affect the results of the regression analysis. This outcome can be attributed to the artificial nature of the dataset, which renders it ideal. Moreover, at the retained PCA value of 98%, the number of components is close to the number of attributes in the original dataset.
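The four metrics reported in this abstract have standard definitions, sketched below in plain Python. The function name and the sample values are illustrative only. One relationship can be checked directly against the reported figures: RMSE is the square root of MSE, and the square root of 0.909 is approximately 0.953.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Made-up values, used only to exercise the formulas.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
    # Consistency check on the figures quoted in the abstract.
    print(round(math.sqrt(0.909), 3))
```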