Contact Name
Aji Prasetya Wibawa
Contact Email
keds.journal@um.ac.id
Phone
+62818539333
Journal Mail Official
keds.journal@um.ac.id
Editorial Address
Semarang St. No 5, Malang, Indonesia
Location
Kota Malang,
Jawa Timur
Indonesia
Knowledge Engineering and Data Science
ISSN : -     EISSN : 2597-4637     DOI : https://doi.org/10.17977
Knowledge Engineering and Data Science (2597-4637), KEDS, brings together researchers, industry practitioners, and potential users, to promote collaborations, exchange ideas and practices, discuss new opportunities, and investigate analytics frameworks on data-driven and knowledge base systems.
Articles 98 Documents
A Comparison of Machine Learning Models to Prioritise Emails using Emotion Analysis for Customer Service Excellence Mohammad Yasser Chuttur; Yashinee Parianen
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p41-52

Abstract

There has been little research on machine learning for email prioritization for customer service excellence. To fill this gap, we propose and assess the efficacy of various machine learning techniques for classifying emails into three degrees of priority: high, low, and neutral, based on the emotions inherent in the email content. It is predicted that after emails are classified into those three categories, recipients will be able to respond to emails more efficiently and provide better customer service. We use the NRC Emotion Lexicon to construct a labeled email dataset of 517,401 messages for our proposal. Following that, we train and test four prominent machine learning models, MNB, SVM, LogR, and RF, and an Ensemble of MNB, LSVC, and RF classifiers, on the labeled dataset. Our main findings suggest that machine learning may be used to classify emails based on their emotional content. However, some models outperform others. During the testing phase, we also discovered that the LogR and LSVC models performed the best, with an accuracy of 72%, while the MNB classifier performed the poorest. Furthermore, classification performance differed depending on whether the dataset was balanced or imbalanced. We conclude that machine learning models that employ emotions for email classification are a promising avenue that should be explored further.
The Effect of Resampling on Classifier Performance: an Empirical Study Utomo Pujianto; Muhammad Iqbal Akbar; Niendhitta Tamia Lassela; Deni Sutaji
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p87-100

Abstract

Class imbalance in a dataset is a common classification problem and can degrade classifier performance. Resampling is one solution to this problem. This study used 100 datasets from three websites: UCI Machine Learning, Kaggle, and OpenML. Each dataset goes through three processing stages: resampling, classification, and significance testing, using a paired t-test, between the performance evaluation values of each combination of classifier and resampling technique. The resampling techniques used are Random Undersampling, Random Oversampling, and SMOTE. The classifiers used are the Naïve Bayes Classifier, Decision Tree, and Neural Network. The resulting accuracy, precision, recall, and f-measure values are tested using paired t-tests to determine whether classifier performance differs significantly between datasets that were not resampled and those that were. The paired t-test is also used to find the combinations of classifier and resampling technique that give significant results. This study obtained two results. First, resampling imbalanced datasets can substantially change classifier performance compared with datasets to which no resampling is applied. Second, the Neural Network without resampling gives significant results in terms of accuracy, while the Neural Network combined with SMOTE gives significant results in terms of precision, recall, and f-measure.
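As an illustration of one of the resampling techniques named in the abstract, the following is a minimal sketch of Random Oversampling using only the Python standard library. The function name `random_oversample`, the seed parameter, and the toy data are assumptions made for this example; they are not taken from the study, which does not describe its implementation.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement from the class's own samples to fill the gap.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Toy imbalanced data: four samples of class 0, one of class 1.
xs, ys = random_oversample(["a", "b", "c", "d", "e"], [0, 0, 0, 0, 1])
print(Counter(ys))
```

Random Undersampling works the other way around (discarding majority-class samples), and SMOTE synthesizes new minority samples by interpolation rather than duplicating existing ones.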
Adaptive Neuro-Fuzzy Inference System for Waste Prediction Haviluddin Haviluddin; Herman Santoso Pakpahan; Novianti Puspitasari; Gubtha Mahendra Putra; Rima Yustika Hasnida; Rayner Alfred
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p122-128

Abstract

Landfill waste that keeps accumulating without proper handling has negative impacts, such as a decline in public health. Therefore, accurate predictions of landfill volume are needed as a reference for government agencies and the community in making future policies. This study aims to analyze the accuracy of the Adaptive Neuro-Fuzzy Inference System (ANFIS) method. Prediction accuracy is measured by the Mean Absolute Percentage Error (MAPE). The final results of this study were taken from the best MAPE test results. The best ANFIS prediction achieved a MAPE of 3.36% with a data ratio of 6:1 in the North Samarinda District. The study results show that the ANFIS algorithm can be used as an alternative forecasting method.
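For readers unfamiliar with the metric, MAPE is the mean of the absolute relative errors, expressed as a percentage. A minimal sketch follows; the function name `mape` and the sample values are illustrative assumptions and are not data from the study.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical waste volumes (illustrative numbers only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(round(mape(actual, predicted), 2))
```

A lower MAPE means the predictions are closer to the observed values relative to their magnitude.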
Fish Image Classification Using Adaptive Learning Rate In Transfer Learning Method Rizka Suhana; Wayan Firdaus Mahmudy; Agung Setia Budi
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p67-77

Abstract

The diversity of fish species in coastal ecosystems, which include mangrove forests, seagrass beds, and coral reefs, is one of the benchmarks for determining the health of those ecosystems. This diversity must be maintained and preserved, so conservation efforts need to be carried out in these waters. Experts at the Indonesian Fisheries and Marine Research and Development Agency often classify fish images manually, which takes a long time; current technology can speed up this work. One reliable technique for image classification is the Convolutional Neural Network (CNN). Transfer learning, which adopts part of a CNN (here, a modified convolution layer), emerged to allow faster learning and to solve new problems faster and better. Observing the needs of experts in marine conservation, the researchers decided to address this problem using a modified transfer learning approach. The transfer learning used is an architectural model from the pre-trained MobileNet V2, which is known for its light computation and can be applied to mobile devices and other embedded tools. The dataset contains 49,281 images of various sizes covering 18 types of fish; during pre-processing, the images are resized to 224x224 pixels. Testing with the modified transfer learning architecture obtained an accuracy of 99.54%, indicating that the model is quite reliable in classifying fish images.
An Accurate Real-Time Method for Face Mask Detection using CNN and SVM Shili Hechmi
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p129-136

Abstract

Infectious respiratory diseases, including COVID-19, pose a significant challenge to humanity and a potential threat to life due to their severity and rapid spread. Using a surgical mask is among the most significant safety precautions that can help keep this sort of pandemic from spreading, and manual monitoring of large crowds in public places for face masks is problematic. In this research, we suggest a real-time approach for face mask detection. First, we use a multi-scale deep neural network to extract features. As a result, the attributes are better suited for training the detection system. We employ SVM post-processing in the classification stage to make the face mask detection method more robust. According to the experimental findings, our strategy considerably decreased the percentage of false positives and undetected cases.
Human Facial Expressions Identification using Convolutional Neural Network with VGG16 Architecture Luther Alexander Latumakulita; Sandy Laurentius Lumintang; Deiby Tineke Salakia; Steven R. Sentinuwo; Alwin Melkie Sambul; Noorul Islam
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p78-86

Abstract

The human facial expression identification system is essential in developing human interaction and technology. The development of Artificial Intelligence for monitoring human emotions can be helpful in the workplace. Commonly, there are six basic human expressions that the system can identify: anger, disgust, fear, happiness, sadness, and surprise. This study aims to create a facial expression identification system based on basic human expressions using the Convolutional Neural Network (CNN) with a 16-layer VGG architecture. A total of 2,137 facial expression images were selected from the FER2013, JAFFE, and MUG datasets. With image augmentation, 100 training epochs, a learning rate of 0.0001, and 5-fold cross-validation, the system achieves an average accuracy of 84%. Results show that the model is suitable for identifying the basic facial expressions of humans.
Performance of Ensemble Classification for Agricultural and Biological Science Journals with Scopus Index Nastiti Susetyo Fanany Putri; Aji Prasetya Wibawa; Harits Ar Rosyid; Agung Bella Putra Utama; Wako Uriu
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p137-142

Abstract

The ensemble method is considered an advanced method in both prediction and classification. The application of this method is estimated to have a more optimal output than the previous classification method. This article aims to determine the ensemble's performance to classify journal quartiles. The subject of agriculture was chosen because Indonesia is an agricultural country, and the interest of researchers in this field shows a positive response. The data is downloaded through the Scimago Journal and Country Rank with the accumulation in 2020. Labels have four classes: Q1, Q2, Q3, and Q4. The ensemble applied is Boosting and Bagging with Decision Tree (DT) and Gaussian Naïve Bayes (GNB) algorithms compiled from 2144 instances. The Boosting meta-ensembles used are Adaboost and XGBoost. From this study, the Bagging Decision Tree has the highest accuracy score at 71.36, followed by XGBoost Decision Tree with 69.51. The third is XGBoost Gaussian Naïve Bayes with 68.82, Adaboost Decision Tree with 60.42, Adaboost Gaussian Naïve Bayes with 58.2, and Bagging Gaussian Naïve Bayes with 56.12 results. This paper shows that the Bagging Decision Tree is the ensemble method that works optimally in this subject classification. This result suggests that the ensemble method can still fail to produce an ideal outcome that approaches the SJR system.
Sentiment Analysis of Amazon Product Reviews using Supervised Machine Learning Techniques Naveed Sultan
Knowledge Engineering and Data Science Vol 5, No 1 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i12022p101-108

Abstract

Today, almost everything is sold online, and many individuals post reviews of different products. These reviews serve as feedback for businesses on buyer experience, performance, product quality, and seller service. This project focuses on buyer opinions in mobile phone reviews. Sentiment analysis is the task of analyzing such data to obtain opinions about products and services and to classify them as positive, negative, or neutral. This insight can help companies improve their products and help potential buyers make the right decisions. Before the reviews are classified using a trained dataset, they must be preprocessed to remove unwanted data such as stop words, verbs, POS tags, punctuation, and attachments. Many techniques exist for such tasks; in this article, we use a model that applies different supervised machine learning techniques.
Social Media Mining with Fuzzy Text Matching: A Knowledge Extraction on Tourism After COVID-19 Pandemic Ida Bagus Putra Manuaba; I Wayan Budi Sentana; I Nyoman Gede Arya Astawa; I Wayan Suasnawa; I Putu Bagus Arya Pradnyana
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p143-149

Abstract

Social media mining is an emerging technique for analyzing data to extract valuable knowledge related to various domains. However, traditional text matching techniques, such as exact matching, are not always suitable for social media data, which can contain spelling mistakes, abbreviations, and variations in the use of words. Fuzzy matching is a text matching technique that can handle such variations and identify similarities between two texts, even if there are differences in spelling or phrasing. The gap in existing research is the limited use of fuzzy matching in social media mining for tourism recovery analysis. By applying fuzzy matching to social media data related to COVID-19 and tourism recovery, this research seeks to bridge this gap and extract valuable insights related to the impact of the pandemic on tourism recovery. We manually retrieved 19,462 Twitter records and differentiated the data sources using four diver parameters to indicate data related to the impact of COVID-19 on the tourism industry, such as the economy, restrictions, government policies, and vaccination. We conducted text mining analysis on the collected 7,352 words and identified 25 highly recommended words that indicated COVID-19 recovery from a tourism perspective. We separated the four words representing the tourism perspective to perform fuzzy matching as a dataset. We then used the inbound dataset on the fuzzy matching process, with the 7,352-word data collected from the text mining process. The matching process resulted in 18 words representing COVID-19 recovery from a tourism perspective.
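The idea of fuzzy matching described in this abstract can be sketched with the `difflib` module from the Python standard library. This is only an illustration under stated assumptions: the abstract does not say which similarity measure or threshold the authors used, and the helper names, the 0.8 threshold, and the word lists below are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity ratio between two strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_match(candidates, keywords, threshold=0.8):
    """Pair each candidate word with its closest keyword, keeping only
    pairs whose similarity reaches the (assumed) threshold."""
    matches = []
    for word in candidates:
        best = max(keywords, key=lambda k: similarity(word, k))
        if similarity(word, best) >= threshold:
            matches.append((word, best))
    return matches

# Hypothetical keywords and noisy social media tokens.
keywords = ["vaccination", "restriction"]
tokens = ["vacination", "restrictions", "beach"]
print(fuzzy_match(tokens, keywords))
```

Unlike exact matching, this tolerates the spelling mistakes and word variations that are common in social media text; the threshold controls how much variation is accepted.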
Hybrid Artificial Bee Colony and Improved Simulated Annealing for the Capacitated Vehicle Routing Problem Farhanna Mar'i; Hafidz Ubaidillah; Wayan Firdaus Mahmudy; Ahmad Afif Supianto
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p109-121

Abstract

The Capacitated Vehicle Routing Problem (CVRP) is an NP-hard combinatorial problem that is computationally expensive to solve. CVRP adds a constraint in the form of a vehicle capacity limit, so the task is to find the optimum route pattern that minimizes travel costs while respecting customer demand and vehicle capacity. One way to solve CVRP is to apply a meta-heuristic algorithm. In this research, two meta-heuristic algorithms are hybridized: Artificial Bee Colony (ABC) and Improved Simulated Annealing (SA). The motivation is to let the strengths of each algorithm offset the weaknesses of the other when exploring and exploiting the search space for the optimal solution. Hybridization is done by running the ABC algorithm and then using its output solution as the initial solution for the Improved SA method. Parameter testing for both methods was carried out to produce an optimal solution. The tests used the CVRP benchmark dataset generated by Augerat (Dataset 1) and the more recent CVRP dataset from Uchoa (Dataset 2). The results show that hybridizing ABC with Improved SA can provide a better solution than basic ABC without hybridization.
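To make the CVRP objective and capacity constraint concrete, the sketch below evaluates a candidate solution: it checks that no route exceeds the vehicle capacity and sums the Euclidean travel cost of all routes. It does not implement ABC or the Improved SA method from the paper; the function names, the use of Euclidean distance, and the toy instance are assumptions for illustration only.

```python
import math

def route_cost(route, coords, depot=0):
    """Length of one route that starts and ends at the depot."""
    path = [depot] + route + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def total_cost(routes, coords):
    """Total travel cost over all vehicle routes."""
    return sum(route_cost(r, coords) for r in routes)

def is_feasible(routes, demand, capacity):
    """True if the summed demand on every route fits the vehicle capacity."""
    return all(sum(demand[c] for c in r) <= capacity for r in routes)

# Toy instance: node 0 is the depot, nodes 1 to 3 are customers.
coords = {0: (0, 0), 1: (3, 0), 2: (3, 4), 3: (0, 4)}
demand = {1: 4, 2: 3, 3: 5}
capacity = 10

single = [[1, 2, 3]]   # one vehicle serves everyone
split = [[1, 2], [3]]  # two vehicles
for routes in (single, split):
    print(routes, is_feasible(routes, demand, capacity), total_cost(routes, coords))
```

A meta-heuristic such as ABC or SA would repeatedly modify the routes and use an evaluation like this to compare candidate solutions, rejecting or penalizing those that violate the capacity limit.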
