Contact Name
Adiwijaya
Contact Email
adiwijaya@telkomuniversity.ac.id
Phone
+6282217633999
Journal Mail Official
jdsa@telkomuniversity.ac.id
Editorial Address
Telkom University, Jl. Telekomunikasi, Terusan Buah Batu, Bandung 40257, Indonesia
Location
Kota Bandung, Jawa Barat, Indonesia
Journal of Data Science and Its Applications
Published by Universitas Telkom
ISSN: - | EISSN: 2614-7408 | DOI: https://doi.org/10.34818/jdsa
Core Subject: Science
JDSA welcomes all topics relevant to data science, computational linguistics, and information sciences. The listed topics of interest are: Big Data Analytics; Computational Linguistics; Data Clustering and Classifications; Data Mining and Data Analytics; Data Visualization; Information Science; Tools and Applications in Data Science.
Articles 30 Documents
Movie Recommendation Using Conversational Mechanism and Knowledge Based Filtering
Marendra Septianta; Z. K. Abdurahman Baizal; Kemas Muslim Lhaksmana
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.49

Abstract

Conversational recommender systems help users search for information in a domain through a conversational mechanism. These systems help users obtain recommendations by asking about their needs and selecting the items that best suit their preferences. The recommendations are generated by eliciting the user's experience, e.g. favourite movies, actors, and directors, and then presenting items that match their interests. There are many methods for obtaining recommendations that match the user's preferences. In this paper, we use an ontology, which represents knowledge, together with knowledge-based filtering to determine the user's needs and produce recommendations that fit the user's preferences. Our system has been implemented for the movie domain. We evaluate its performance by studying users' perceptions.
Cancer Detection based on Microarray Data Classification Using Principal Component Analysis and Functional Link Neural Network
Iyon Priyono; Adiwijaya Adiwijaya; Annisa Aditsania
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.52

Abstract

Cancer is a deadly disease caused by the uncontrolled, abnormal growth of tissue cells in the body. According to Globocan data, the number of cancer sufferers rose from previous years to 18.1 million people in 2018, with 9.6 million deaths. In recent years, cancer prediction using DNA microarray data has helped medical experts analyze whether a person has cancer. DNA microarray data contain very large and complex gene expression profiles, so a dimensionality reduction method is needed. The reduced data are then used to classify whether a sample is cancerous. In this paper, Principal Component Analysis (PCA) is used for feature extraction to reduce dimensionality, and a Functional Link Neural Network (FLNN) is used as the classifier. Based on the simulation, the average accuracy using FLNN with PCA is about 76.08%. Keywords: cancer detection, microarray data, Functional Link Neural Network, Principal Component Analysis.
Aspect Based Sentiment Analysis on Beauty Product Review Using Random Forest
Anggitha Yohana Clara; Adiwijaya Adiwijaya; Mahendra Dwifebri Purbolaksono
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.58

Abstract

Cosmetics and beauty products (including skincare) are products used for body or face care and to accentuate the body's allure. A product can evoke diverse sentiments among consumers, both positive and negative. Many consumers of beauty products share reviews to help other consumers find the right products to buy and to give feedback to the brand. However, the large number of reviews is not matched by the identification of opinions about specific product aspects. Hence, a study was conducted to analyze reviews of beauty products such as toner, serum, sun protection, and exfoliator. The analysis is aspect-based, determining the sentiment towards each aspect of a beauty product from the reviews. The results are intended for skincare users and beauty product brands, to help them infer consumers' opinions. The proposed solution uses Random Forest with hyperparameter tuning as the classification method and TF-IDF and n-grams as feature extraction methods. The multi-aspect sentiment analysis in this study obtained a highest accuracy of 90.48%, precision of 87.27%, recall of 70.13%, and F1-score of 71.77%.
Classification of Personality based on Beauty Product Reviews Using the TF-IDF and Naïve Bayes (Case Study : Female Daily)
Novia Russelia Wassi; Adiwijaya Adiwijaya; Mahendra Dwifebri Purbolaksono
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.61

Abstract

A person's personality is an important parameter for determining character and serves as a basis for assessment in various settings. Nowadays, personality can be identified not only through psychological tests but also in other ways. One way is through reviews posted in electronic media. In this study, personality was classified into three of the "Big Five" personality groups, namely Openness, Conscientiousness, and Extraversion, using the Naïve Bayes method with TF-IDF for feature extraction. The classification achieved 81% accuracy with a preprocessing scenario using stemming and stopword removal, unigram TF-IDF, and the BernoulliNB classifier type.
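As a rough illustration of the unigram TF-IDF weighting mentioned in this abstract, the sketch below computes term weights in plain Python. The weighting variant (term frequency normalized by document length, natural-log IDF without smoothing), the function name `tf_idf`, and the toy reviews are all assumptions made for illustration; the abstract does not state which variant the authors used.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dictionary per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Assumed variant: (count / document length) * ln(N / df).
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, made-up reviews (already tokenized into unigrams).
docs = [
    ["serum", "good", "skin"],
    ["toner", "good", "price"],
    ["serum", "sticky", "skin"],
]
weights = tf_idf(docs)
```

Under this variant, a term that appears in fewer documents (here `toner`) receives a higher weight than one shared across documents (here `serum`), and a term present in every document would receive a weight of zero.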
Comparative Analysis of Support Vector Machine-Recursive Feature Elimination and Chi-Square on Microarray Classification for Cancer Detection with Naïve Bayes
Talitha Kayla Amory; Adiwijaya Adiwijaya; Widi Astuti
Journal of Data Science and Its Applications Vol 3 No 2 (2020): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2020.3.62

Abstract

Cancer is a deadly disease known worldwide. According to the World Health Organization (WHO), cancer is the second leading cause of death globally and was responsible for an estimated 9.6 million deaths in 2018. One well-known technique for cancer detection is the DNA microarray. DNA microarray technology lets researchers analyze thousands of gene expression profiles at the same time to determine whether a person has cancer. However, one problem with DNA microarray data is the large number of features, which calls for feature selection. To address this problem, this study uses Support Vector Machine-Recursive Feature Elimination (SVM-RFE) and Chi-Square for feature selection and Naïve Bayes for classification. Accuracy with and without feature selection is compared. The accuracy of the two feature selection methods is also compared to find which one works better in combination with Naïve Bayes. To give an overall picture of the performance comparison, this study also considers precision, recall, and F1-score. The best accuracy results obtained were 100% on lung cancer data with SVM-RFE and Chi-Square, 99.6% on ovarian cancer with SVM-RFE, 93.7% on breast cancer with SVM-RFE, and 90% on colon cancer with SVM-RFE.
Multi Label Topic Classification for Hadith Bukhari in Indonesian Translation using Random Forest
Adhitia Wiraguna; Said Al Faraby; Adiwijaya Adiwijaya
Journal of Data Science and Its Applications Vol 4 No 1 (2021): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2021.4.70

Abstract

Hadith must be studied and practiced by Muslims. There are many kinds of teachings that people can take from studying the hadith. To assist Muslims in studying the hadith, a multi-label classification system is needed to categorize the Sahih Bukhari hadith in Indonesian translation into three topics, namely prohibition, advice, and information. Various classification methods can be used to build a text classification system; this study uses Random Forest (RF). The simplicity of the RF algorithm and its ability to handle high-dimensional data make it a suitable method for text classification. However, RF's capability for multi-label classification is not widely known. This study uses the problem transformation approach, namely Binary Relevance (BR) and Label Powerset (LP), to adapt RF for building a multi-label classification system. The results show that the best Hamming loss, 0.0663, was obtained by a system that used BR without stemming. These results indicate that BR is better than LP at adapting the RF algorithm to multi-label classification of hadith data. This is because BR produces as many classification models as there are labels in the hadith data, whereas the LP transformation makes the data imbalanced.
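To make the terms in this abstract concrete, the following minimal sketch shows how Hamming loss is computed and how Binary Relevance and Label Powerset reshape multi-label targets. The label order, the function names, and the toy label matrices are hypothetical and are not taken from the paper; no classifier is trained here.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed label order, based on the three topics named in the abstract.
LABELS = ("prohibition", "advice", "information")


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def binary_relevance_targets(y: Sequence[Sequence[int]]) -> Dict[str, List[int]]:
    """Binary Relevance: one independent binary target vector per label."""
    return {label: [row[i] for row in y] for i, label in enumerate(LABELS)}


def label_powerset_targets(y: Sequence[Sequence[int]]) -> List[Tuple[int, ...]]:
    """Label Powerset: each distinct label combination becomes one class."""
    return [tuple(row) for row in y]


# Toy data, made up for illustration only.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]

loss = hamming_loss(y_true, y_pred)    # 2 wrong slots out of 12
br = binary_relevance_targets(y_true)  # three binary problems
lp = label_powerset_targets(y_true)    # one multi-class problem
```

In this toy case Label Powerset yields four classes with a single example each, which hints at how the transformation can leave some classes with very few examples.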
Sentiment Analysis of Beauty Product Reviews Using the K-Nearest Neighbor (KNN) and TF-IDF Methods with Chi-Square Feature Selection
Yusrifa Deta Kirana; Said Al Faraby
Journal of Data Science and Its Applications Vol 4 No 1 (2021): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2021.4.71

Abstract

The rise of beauty products in recent times can make consumers, especially women, hesitate when choosing a beauty product. Beauty product reviews have become a very valuable source of information for consumers making purchase decisions and for producers improving their products and marketing strategies. In the sentiment analysis process, beauty product reviews are classified one by one as negative or positive. Therefore, in this study, sentiment analysis was applied to beauty product review data using the K-Nearest Neighbor (KNN) method, and the best value of k for this case was sought. The dataset is pre-processed with case folding, noise removal, tokenization, stemming, stopword removal, and slang word handling. Feature extraction uses Term Frequency-Inverse Document Frequency (TF-IDF) to calculate the weight of each word in a document, and feature selection uses Chi-Square to select the features needed to increase accuracy. In this study, the best accuracy was 71%, obtained using KNN with k = 50 and a feature selection model with 76 features.
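The Chi-Square feature selection step described in this abstract can be sketched as follows. This uses the standard 2x2 contingency form of the statistic for a term and a class; whether the authors used this exact form is an assumption, and the function names `chi_square` and `select_top_k` and the toy reviews are hypothetical.

```python
from typing import List, Set, Tuple

Doc = Tuple[Set[str], str]  # (set of tokens, sentiment label)


def chi_square(docs: List[Doc], term: str, target: str) -> float:
    """Chi-square score of `term` for class `target` from a 2x2 table."""
    a = b = c = d = 0
    for tokens, label in docs:
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, document in target class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in target class
        else:
            d += 1  # term absent, document in another class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_top_k(docs: List[Doc], target: str, k: int) -> List[str]:
    """Keep the k terms with the highest chi-square score."""
    vocab = sorted({term for tokens, _ in docs for term in tokens})
    ranked = sorted(vocab, key=lambda t: chi_square(docs, t, target), reverse=True)
    return ranked[:k]


# Toy, made-up reviews.
docs = [
    ({"good", "love"}, "pos"),
    ({"good", "smooth"}, "pos"),
    ({"bad", "sticky"}, "neg"),
    ({"bad", "good"}, "neg"),
]
top = select_top_k(docs, "pos", 1)
```

Here `bad` scores highest because its presence perfectly separates the two classes in the toy data, while `good` appears in both classes and scores lower.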
Analysis Sentiment Aspect Level on Beauty Product Reviews Using Chi-Square and Naïve Bayes
Felia Novitasari; Mahendra Dwifebri Purbolaksono
Journal of Data Science and Its Applications Vol 4 No 1 (2021): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2021.4.72

Abstract

The many platforms equipped with review features make it easy for people to express their opinions. Product reviews are opinions from consumers about products they have purchased. These reviews can benefit both producers and consumers. Reviews from consumers can contain ratings that cover aspects of the product, and the number of reviews can run into hundreds or even thousands. Such a large number of reviews makes sentiment analysis difficult. Therefore, a model is needed that can analyze sentiment based on product aspects. Sentiment analysis was performed using the Naïve Bayes algorithm, with TF-IDF for feature extraction and Chi-Square for feature selection. Applying stopword removal or stemming in the preprocessing stage and using n-grams in feature extraction can affect the resulting performance. In addition, applying feature selection to the model plays an important role because it can improve classification performance. The best results obtained were an accuracy of 80.18%, recall of 72.49%, precision of 77.25%, and F1-score of 74.73%.
Analysis of Indonesian People's Sentiments About the Side Effects of the COVID-19 Vaccine on Twitter
Fajar Fatur Rachman; Setia Pramana
Journal of Data Science and Its Applications Vol 4 No 1 (2021): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2021.4.73

Abstract

The large number of people who refuse vaccination is one of the biggest challenges for the Indonesian government in dealing with the COVID-19 pandemic. Widespread misinformation and hoaxes about COVID-19 vaccines have decreased public trust. This paper examines public opinion on the side effects of three COVID-19 vaccines distributed in Indonesia: Sinovac, AstraZeneca, and Moderna. Sentiment analysis is performed, and public conversations on Twitter about the side effects of the three vaccines are grouped using LDA. The results are expected to serve as a reference for the government or related parties in validating the issues circulating in society regarding the side effects of COVID-19 vaccines. The analysis found that, for the Sinovac vaccine, people tend to report fairly mild side effects, dominated by the words sleepy, achy, and hungry. For the AstraZeneca and Moderna vaccines, people tend to report quite severe side effects, such as fever, pain, and dizziness. The analysis also found that AstraZeneca received the most negative opinions from the public.
Implementation of Enhance Confix Stripping Stemmer Algorithm for Multiclass Dataset Classification in News Text using K-Nearest Neighbor
Alvianda Ricky Lukman; Widi Astuti
Journal of Data Science and Its Applications Vol 4 No 1 (2021): Journal of Data Science and Its Applications
Publisher : Telkom University

DOI: 10.34818/jdsa.2021.4.76

Abstract

The need for news information has increased since the shift from physical media to online media. News is grouped into categories to make it easier for readers to find the news they want. Grouping to determine the category of news information is known as text classification. The number of words in news text creates a diversity of word forms, which can be reduced by stemming, that is, changing an affixed word into its root word. This study compares classification with and without stemming and finds the best value of K and the optimal distance metric for K-Nearest Neighbor. The best accuracy, 0.9671, is obtained when the stemming algorithm is not applied, with K=9 and cosine distance as the distance metric. This is higher than the classification that applies the stemming algorithm, which reaches an accuracy of 0.9660 with K=7 and cosine distance.
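The combination of K-Nearest Neighbor with cosine distance described in this abstract can be sketched in a few lines of plain Python. The sparse-vector representation, the function names, the toy news snippets, and the choice of k = 3 are assumptions for illustration; the study itself reports K=9 as its best setting.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # sparse term -> weight mapping


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 minus cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train: List[Tuple[Vector, str]], query: Vector, k: int) -> str:
    """Majority vote among the k training vectors closest to the query."""
    neighbours = sorted(train, key=lambda item: cosine_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy, made-up term-count vectors for two news categories.
train = [
    ({"goal": 2, "match": 1}, "sport"),
    ({"team": 1, "goal": 1}, "sport"),
    ({"election": 2, "vote": 1}, "politics"),
    ({"vote": 2, "party": 1}, "politics"),
    ({"match": 1, "team": 2}, "sport"),
]

sport_guess = knn_predict(train, {"goal": 1, "team": 1}, k=3)
politics_guess = knn_predict(train, {"vote": 1, "party": 1}, k=3)
```

Cosine distance ignores vector length, so a long and a short article with the same word proportions are treated as close neighbours.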