Articles

Found 39 Documents

Building Synonym Set for Indonesian WordNet using Commutative Method and Hierarchical Clustering
Valentino Rossi Fierdaus; Moch Arif Bijaksana; Widi Astuti
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 4, No 3 (2020): Juli 2020
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v4i3.2254

Abstract

WordNet is a collection of synonym sets (synsets), each consisting of words that share the same meaning. The development of Indonesian WordNet aims to build an application that can store and display the relations between words. A synonym set is a set of one or more words with a similar meaning or synonym relation, here drawn from the Indonesian Thesaurus. Previous studies built synsets with several approaches, one of which was clustering to produce synsets and WSD (Word Sense Disambiguation). This research aims to discover the semantic similarities between words in the Indonesian Thesaurus automatically and to assess the performance of the Agglomerative Hierarchical Clustering method for building Indonesian synsets. Performance is evaluated with the F-measure against a gold standard.
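The abstract does not say how the F-measure is computed, so the following is only a minimal sketch of one common reading: precision and recall measured as the overlap between a predicted synset and a gold-standard synset. The function name `f_measure` and the example words are illustrative assumptions, not taken from the paper.

```python
def f_measure(predicted, gold):
    """Harmonic mean of precision and recall for one predicted synset.

    Assumption: both synsets are plain sets of words, and a word counts
    as correct when it appears in both sets.
    """
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)  # share of predicted words that are correct
    recall = overlap / len(gold)          # share of gold words that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three predicted words, all present in a four-word gold synset.
predicted_synset = {"besar", "agung", "raya"}
gold_synset = {"besar", "raya", "akbar", "agung"}
score = f_measure(predicted_synset, gold_synset)
print(round(score, 4))
```

Here precision is 1.0 and recall is 0.75, so the score reflects the one gold word the predicted synset missed.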
Principal Component Analysis Sebagai Ekstraksi Fitur Data Microarray Untuk Deteksi Kanker Berbasis Linear Discriminant Analysis
Widi Astuti; Adiwijaya Adiwijaya
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 3, No 2 (2019): April 2019
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v3i2.1161

Abstract

Cancer is one of the leading causes of death globally. Early detection of cancer allows better treatment for patients. One method to detect cancer is microarray data classification. However, microarray data has high dimensionality, which complicates the classification process. Linear Discriminant Analysis is a classification technique that is easy to implement and has good accuracy, but it has difficulty handling high-dimensional data. Therefore, Principal Component Analysis, a feature extraction technique, is used to optimize Linear Discriminant Analysis performance. The study found that using Principal Component Analysis increases accuracy by up to 29.04% and F1 score by 64.28% for colon cancer data.
Deteksi Kanker Berdasarkan Data Microarray Menggunakan Metode Naïve Bayes dan Hybrid Feature Selection
Bintang Peryoga; Adiwijaya Adiwijaya; Widi Astuti
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 4, No 3 (2020): Juli 2020
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v4i3.2096

Abstract

Cancer is a deadly disease that was responsible for 9.6 million deaths in 2018 according to WHO data, so early detection is needed so that it can be treated immediately and cancer deaths can be reduced. Microarray technology can monitor and analyze the expression of cancer genes, but microarray data has high dimensionality and small sample sizes, so dimension reduction is needed for an optimal classification process. Dimension reduction lowers the number of features used in classification by selecting the influential ones. The hybrid method is a dimension reduction approach that combines a Filter method with a Wrapper method to gain the advantages of both. In this study, the researchers combined Naïve Bayes with Hybrid Feature Selection (Information Gain - Genetic Algorithm) on microarray data for Lung Cancer, Ovarian Cancer, Breast Cancer, Colon Tumors, and Prostate Tumors. These data were obtained from the Kent-Ridge Biomedical Dataset. The results showed that, of the 5 datasets used, 4 achieved an accuracy between 87% and 100%, while the prostate tumor data achieved the lowest accuracy, 61.14%. Feature selection and classification on the 5 cancer datasets used fewer than 63 features to reach these accuracies.
Analisis Perbandingan Klasifikasi Support Vector Machine (SVM) dan K-Nearest Neighbors (KNN) untuk Deteksi Kanker dengan Data Microarray
Shidqi Aqil Naufal; Adiwijaya Adiwijaya; Widi Astuti
JURIKOM (Jurnal Riset Komputer) Vol 7, No 1 (2020): Februari 2020
Publisher : STMIK Budi Darma

DOI: 10.30865/jurikom.v7i1.2014

Abstract

Cancer is a disease that causes deaths in many countries. According to WHO, cancer caused 9.6 million deaths worldwide in 2018. Globally, about 1 in 6 deaths is due to cancer. Therefore, a technology is needed that can detect cancer with high accuracy so that it can be found early. Microarray techniques can be used to examine certain human tissues and classify them as cancerous or not. However, microarray data has very high dimensionality. To overcome this problem, this study uses a dimension reduction technique, Partial Least Squares (PLS), and compares Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) as classification methods. The system reached 98.54% on leukemia data with PLS-KNN, 100% on lung data with KNN, 66.52% on breast data with PLS-KNN, and 85.60% on colon data with PLS-SVM. KNN gave the best result on three of the four datasets evaluated.
Twitter Sentiment Analysis on Online Transportation in Indonesia Using Ensemble Stacking
Yahya Setiawan; Jondri Jondri; Widi Astuti
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 6, No 3 (2022): Juli 2022
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v6i3.4359

Abstract

Online transportation is a transportation innovation that has emerged along with the development of online-based applications that provide many features and conveniences. As it has developed, many users have written their responses to these applications on social media such as Twitter. Many opinions and responses are conveyed directly by users of online transportation to the providers' official accounts. These responses are very numerous and can be used for sentiment analysis of online transportation. However, the analysis cannot be done manually. Therefore, a system is needed that can analyze user responses on Twitter automatically. In this study, a sentiment analysis system was built for online transportation in Indonesia using the ensemble stacking algorithm, which simplifies the sentiment analysis and increases its accuracy. Ensemble stacking is an advanced machine learning method that can improve on the performance of its base classifiers. The system uses three base classifiers, namely SVM with an RBF kernel, SVM with a linear kernel, and logistic regression. The best accuracy on the Gojek dataset is 88%, and the best F1 score is 87%. Applied to Twitter sentiment analysis of online transportation in this study, ensemble stacking obtained better accuracy than the base classifiers used.
Sentiment and Discussion Topic Analysis on Social Media Group using Support Vector Machine
Salsabila Putri Adityani; Donni Richasdy; Widi Astuti
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 6, No 3 (2022): Juli 2022
Publisher : STMIK Budi Darma

DOI: 10.30865/mib.v6i3.4233

Abstract

Social media is growing ever more rapidly in this modern era, and people interact with each other very actively online. People who share a common interest, or who simply like being part of a community, often gather in online groups, especially on Facebook. Alumni of Telkom University are no exception: they actively discuss and share information in the Telkom University Alumni Forum Facebook group (FAST). Using status posts from that group, sentiment and discussion-topic analysis can be performed; the sentiment analysis determines whether the polarity is positive, neutral, or negative. In addition, topic modeling extracts the topics that are often discussed in the group. In this research, sentiment analysis was performed using the Support Vector Machine (SVM) method. The classification process used TF-IDF for word weighting and a confusion matrix for performance measurement. Several testing scenarios were carried out to find the best accuracy. Across tests on preprocessing techniques and added n-gram feature extraction, the highest accuracy obtained is 80.56%. The result indicates that the best performance comes from combining preprocessing without stopword removal with unigram feature extraction. Moreover, the topics discussed, based on the topic modeling results, were related to telecommunication and Telkom, Indonesia, alumni, and FAST.
Partner Sentiment Analysis for Telkom University on Twitter Social Media Using Decision Tree (CART) Algorithm
Sean Akbar Ryanto; Donni Richasdy; Widi Astuti
JURNAL MEDIA INFORMATIKA BUDIDARMA Vol 6, No 4 (2022): Oktober 2022
Publisher : Universitas Budi Darma

DOI: 10.30865/mib.v6i4.4533

Abstract

Sentiment analysis is the analysis of opinion and meaning expressed in writing. It is very useful for surfacing opinions from any individual or group in order to improve branding. Branding is the process of promoting and improving the name of a brand to attract consumers to try the services of an organization, including one operating in the academic field such as Telkom University. However, this requires cooperation with other associations as partners so that the branding can be effective. One form of cooperation is for partners to share opinions about Telkom University on Twitter, so that consumers become more familiar with it; Twitter is a large social media platform used by many people because opinions can be shared freely. Therefore, this study aims to analyze the sentiment expressed by partners toward Telkom University on Twitter, which is a main factor in promoting the university to consumers. All tweets about Telkom University posted by partners are collected, weighted with TF-IDF, and classified with the Decision Tree CART algorithm into positive, negative, and neutral sentiment categories. The best results obtained by the CART Decision Tree model are an accuracy of 86.73%, precision of 87.06%, recall of 87.55%, and F1-score of 86.52%.
Analysis of Community Sentiment on Twitter towards COVID-19 Vaccine Booster Using Ensemble Stacking Methods
Syifa Khairunnisa Salsabila; Jondri Jondri; Widi Astuti
Building of Informatics, Technology and Science (BITS) Vol 4 No 2 (2022): September 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i2.1902

Abstract

The COVID-19 outbreak in Indonesia has not ended, and the government has made various efforts to reduce it, such as the Large-Scale Social Restriction (PSBB) policy and the obligation for the entire community to be vaccinated against COVID-19. The government has introduced a new policy: booster vaccination for people who have already received the first and second COVID-19 vaccine doses. In response to this policy, many people have shared opinions on social media, including Twitter. The positive and negative opinions of Twitter users can serve as a source of data. For this reason, the researchers conducted a sentiment analysis of the booster vaccine using the Ensemble Stacking method. The dataset, 6,500 items collected from Twitter, is grouped into positive and negative sentiment classes. The best result in this study, using ensemble stacking with oversampling, is an accuracy of 80%.
Sentiment Analysis on Twitter Against IndiHome Providers Using Chi-Square and Ensemble Bagging Methods
Anisa Nur Aini; Jondri Jondri; Widi Astuti
Building of Informatics, Technology and Science (BITS) Vol 4 No 2 (2022): September 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i2.1967

Abstract

During the COVID-19 pandemic, internet usage increased rapidly. The internet is now used for online teaching and learning and for working from home. One internet service provider is IndiHome, a company with a very large number of users. This large user base leads to frequent problems, which is one reason IndiHome users post a wide variety of opinions and responses. Sentiment analysis is used to identify the opinion someone expresses about a particular object or problem. This study conducted a sentiment analysis using Chi-square and the Ensemble Bagging method with three base classifiers, namely K-Nearest Neighbor (K-NN), Support Vector Machine (SVM), and Naive Bayes (NB). The label predictions from each base classifier are combined using a hard majority vote. Tweet data collection was carried out in March 2022, and 6,962 tweets were collected. This study ran two test scenarios. In scenario 1, without oversampling, Ensemble Bagging had the highest accuracy at 83.32%, rising to 83.93% with hyperparameter tuning. In scenario 2, with oversampling, Ensemble Bagging again had the highest accuracy at 84.51%, and 84.56% with hyperparameter tuning.
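The hard majority vote mentioned in this abstract can be illustrated in a few lines. This is only a sketch of the voting step, not the paper's bagging pipeline: the function name `hard_vote` and the label lists are hypothetical, and the three lists stand in for the outputs of already-trained K-NN, SVM, and NB classifiers.

```python
from collections import Counter


def hard_vote(predictions):
    """Combine per-classifier label lists by majority vote.

    predictions: one list of labels per classifier, all the same length.
    Ties go to the label seen first, i.e. the earliest classifier's label.
    """
    return [
        Counter(labels).most_common(1)[0][0]
        for labels in zip(*predictions)  # group the votes for each tweet
    ]


# Hypothetical predictions for four tweets from three base classifiers.
knn = ["neg", "pos", "neg", "pos"]
svm = ["neg", "neg", "neg", "pos"]
nb = ["pos", "neg", "neg", "pos"]

combined = hard_vote([knn, svm, nb])
print(combined)
```

With three voters and two labels a tie cannot occur; with three sentiment classes it can, which is why the tie-breaking rule is spelled out in the docstring.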
Sentiment Analysis Against IndiHome and First Media Internet Providers Using Ensemble Stacking Method
Arya Rafif Muhammad Fikri; Jondri Jondri; Widi Astuti
Building of Informatics, Technology and Science (BITS) Vol 4 No 2 (2022): September 2022
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v4i2.1969

Abstract

Customer satisfaction is one of the factors that can be used to measure the success of a company's service. From the 2000s until now, internet service providers have continued to grow throughout the world, including in Indonesia. IndiHome and First Media are companies that provide internet services that make it easy for the public to communicate and obtain information. With so many people using IndiHome and First Media internet services, problems often arise that prompt various responses from users. Users usually direct these responses to IndiHome or First Media customer care on Twitter. The dataset for this study was obtained from Twitter using the Twitter API and the Tweepy library. It contains 6,962 tweets for the IndiHome dataset and 8,089 tweets for the First Media dataset. This study conducts sentiment analysis using Ensemble Stacking with three base classifiers and a meta classifier. The base classifiers are Naïve Bayes, K-Nearest Neighbor, and Decision Tree, while the meta classifier is Logistic Regression. Term frequency-inverse document frequency (TF-IDF) is used to weight each word in a document. Two test scenarios are used: testing without oversampling and testing with oversampling on the dataset. The results show that Ensemble Stacking with TF-IDF feature extraction produces the highest accuracy, 88.27% on the IndiHome dataset and 92.56% on the First Media dataset, with oversampling applied to both datasets.
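As a rough illustration of the TF-IDF weighting named in this abstract, the sketch below uses one textbook variant: term frequency as a share of the document's tokens, multiplied by the natural log of (number of documents / number of documents containing the term). The paper does not state which variant it used, and libraries often apply smoothed formulas, so the numbers here are illustrative only. The function name `tf_idf` and the sample tweets are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical pre-tokenized tweets.
tweets = [
    ["internet", "lambat", "sekali"],
    ["internet", "cepat"],
    ["layanan", "internet", "bagus"],
]
weights = tf_idf(tweets)
print(weights[0])
```

A word that appears in every tweet, such as "internet" here, gets a weight of zero under this variant, while words unique to one tweet are weighted highest.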