Contact Name
Aji Prasetya Wibawa
Contact Email
keds.journal@um.ac.id
Phone
+62818539333
Journal Mail Official
keds.journal@um.ac.id
Editorial Address
Semarang St. No 5, Malang, Indonesia
Location
Kota Malang,
Jawa Timur
INDONESIA
Knowledge Engineering and Data Science
ISSN: - | EISSN: 2597-4637 | DOI: https://doi.org/10.17977
Knowledge Engineering and Data Science (2597-4637), KEDS, brings together researchers, industry practitioners, and potential users, to promote collaborations, exchange ideas and practices, discuss new opportunities, and investigate analytics frameworks on data-driven and knowledge base systems.
Articles 98 Documents
Can Multinomial Logistic Regression Predicts Research Group using Text Input?
Harits Ar Rosyid; Aulia Yahya Harindra Putra; Muhammad Iqbal Akbar; Felix Andika Dwiyanto
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p150-159

Abstract

When submitting proposals in SISINTA, students often mistakenly submit them to a less relevant or incorrect research group. There are 13 research groups for the students to choose from. We propose a text classification method to help students find the best research group based on the title and/or abstract. The stages in this study include data collection, data preprocessing, classification using Logistic Regression, and evaluation of the results. Three research group classification scenarios are tested, based on 1) title only, 2) abstract only, and 3) title and abstract. In the experiments, classification using title-only input performs best overall, achieving accuracy, precision, recall, and F1-score of 63.68%, 64.91%, 63.68%, and 63.46%, respectively. This result is sufficient to help students find the best research group based on proposal titles. In addition, lecturers can comment more elaborately since the proposals are relevant to the research group’s scope.
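The accuracy and recall figures in this abstract are identical (63.68%), which is what support-weighted averaging over classes would produce. The abstract does not state the averaging scheme, so the sketch below is an assumption for illustration, not the authors' evaluation code; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) with support-weighted averaging.

    Assumption: each per-class score is weighted by that class's share of
    the true labels. Under this scheme, recall always equals accuracy.
    """
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / support[label]
        denom = prec_c + rec_c
        f1_c = 2 * prec_c * rec_c / denom if denom else 0.0

        weight = support[label] / n
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return accuracy, precision, recall, f1


# Hypothetical labels standing in for research groups.
truth = ["a", "a", "b", "b", "c"]
guess = ["a", "b", "b", "b", "c"]
print(weighted_metrics(truth, guess))
```

Weighted recall is the sum over classes of (support / N) * (TP / support), which reduces to total TP / N, that is, accuracy.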
Indonesian Language Term Extraction using Multi-Task Neural Network
Joan Santoso; Esther Irawati Setiawan; Fransiskus Xaverius Ferdinandus; Gunawan Gunawan; Leonel Hernandez
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p160-167

Abstract

The rapidly expanding size of data makes it difficult to extract information and store it as computerized knowledge. Relation extraction and term extraction play a crucial role in resolving this issue. Automatically finding a concealed relationship between terms that appear in the text can help people build computer-based knowledge more quickly. Term extraction is required as one of the components because identifying terms that play a significant role in the text is the essential step before determining their relationship. We propose an end-to-end system capable of extracting terms from text to address this issue for the Indonesian language. Our method combines two multilayer perceptron neural networks to perform Part-of-Speech (PoS) labeling and Noun Phrase Chunking. The models were trained jointly to solve this problem. Our proposed method, with an F-score of 86.80%, can be considered a state-of-the-art algorithm for performing term extraction in the Indonesian language using noun phrase chunking.
Traffic Density Prediction using IoT-based Double Exponential Smoothing
Rosa Andrie Asmara; Noprianto Noprianto; Muhammad Ainur Ilmy; Kohei Arai
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p168-178

Abstract

The growing number of vehicles and traffic flows increases traffic density. A system is proposed to count vehicles and predict traffic density in real time. This research uses Haar Cascade to detect cars and motorcycles and Double Exponential Smoothing (DES) to forecast the number of vehicles on the road. The Mean Absolute Percentage Error (MAPE) measures forecasting accuracy and serves as the basis for selecting the best smoothing constant (alpha). The best test results from June 13 to 20, 2020, are for cars on June 14, 2020 (alpha 0.5, MAPE 0%) and motorcycles on June 18, 2020 (alpha 0.5, MAPE 0.1134%). The largest MAPE for cars was on June 15, 2020, with alpha 0.5 and MAPE 2.1073%. In the 3-minute test, Haar Cascade detects 72.58% of cars and 81.90% of motorcycles.
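The forecasting step can be sketched as follows. Because the abstract mentions a single smoothing constant, this sketch assumes Brown's one-parameter variant of double exponential smoothing; the paper may use a different formulation. The function names and the sample counts are hypothetical.

```python
def brown_des_forecasts(series, alpha):
    """Return one-step-ahead forecasts for series[1:] (Brown's DES, assumed)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Initialise both smoothed values at the first observation.
    s1 = s2 = float(series[0])
    forecasts = []
    for x in series[:-1]:
        s1 = alpha * x + (1 - alpha) * s1    # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2   # second pass over the smoothed values
        level = 2 * s1 - s2
        trend = alpha / (1 - alpha) * (s1 - s2)
        forecasts.append(level + trend)      # forecast for the next time step
    return forecasts


def mape(actual, forecast):
    """Mean absolute percentage error in percent, skipping zero actuals."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def best_alpha(series, candidates):
    """Pick the candidate alpha with the lowest one-step-ahead MAPE."""
    return min(
        candidates,
        key=lambda a: mape(series[1:], brown_des_forecasts(series, a)),
    )


# Hypothetical vehicle counts per interval, not data from the paper.
counts = [12, 15, 14, 18, 21, 19, 23]
alpha = best_alpha(counts, [0.1, 0.3, 0.5, 0.7, 0.9])
print(alpha, mape(counts[1:], brown_des_forecasts(counts, alpha)))
```

A MAPE of 0%, as reported for cars on June 14, 2020, means every forecast matched its observed count exactly.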
Ant Colony Optimization for Resistor Color Code Detection
Slamet Wibawanto; Kartika Candra Kirana; Hani Ramadhan
Knowledge Engineering and Data Science Vol 6, No 1 (2023)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v6i12023p15-23

Abstract

In the early stages of learning about resistors, an introduction to color-based values is needed. Moreover, some combinations require a resistor trip analysis to identify. Unfortunately, a resistor body color is considered a local solution, which often confuses resistor coloration. Ant Colony Optimization (ACO) is a heuristic algorithm that solves problems by simulating the travel of a group of ants. ACO is proposed to select commercial matrix values to be computed without preventing local solutions. In this study, each ant explores the matrix based on pheromones and heuristic information to generate local solutions. Global solutions are selected based on their high degree of similarity with other local solutions. The first stage of testing focuses on exploring variations of parameter values. Applying the best parameters resulted in 85% accuracy and a processing time of 43 seconds for 20 resistor images. This method is expected to prevent local solutions without wasteful computation of the matrix.
Associated Patterns in Open-Ended Concept Maps within E-Learning
Didik Dwi Prasetya; Tsukasa Hirashima
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p179-187

Abstract

A concept map is a diagram that visualizes the structure of individual cognitive knowledge. An approach to creating a concept map structure that allows users to contribute concepts and linkages that express their understanding freely is known as an "open-ended concept map." It has been demonstrated that an open-ended concept map accurately depicts student knowledge structures and reveals student differences. However, manually analyzing open-ended maps is difficult and time-consuming because they include many propositions, especially in a big classroom. Educational data mining could be used to further process and analyze a collection of concept maps. Many works, though, have employed data mining to produce concept map structures from text documents rather than to examine the knowledge representation. This study aimed to identify hidden combination patterns in students' knowledge representation using association rule analysis. The dataset used in this study consisted of 27 open-ended concept maps created by university students. This study found interesting patterns that reveal students' knowledge in understanding the material given by the teacher.
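Association rule analysis of this kind rests on two measures, support and confidence. The sketch below treats each concept map as a set of propositions and mines rules between pairs of propositions only; the thresholds, the function name, and the proposition strings are hypothetical, and the study's actual mining procedure is not described in the abstract.

```python
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of transactions containing both items
    confidence = share of antecedent transactions that also hold the consequent
    """
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in sorted(pair_count.items()):
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_count[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical propositions taken from four student concept maps.
maps = [
    {"table-has-row", "row-has-field"},
    {"table-has-row", "row-has-field"},
    {"table-has-row"},
    {"row-has-field", "field-has-type"},
]
for rule in pair_rules(maps, min_support=0.5, min_confidence=0.6):
    print(rule)
```

Full Apriori-style mining extends the same counts to itemsets larger than two; the pair-only version is kept here for brevity.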
Predicting Heart Disease using Logistic Regression
Mochammad Anshori; M. Syauqi Haris
Knowledge Engineering and Data Science Vol 5, No 2 (2022)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v5i22022p188-196

Abstract

Heart disease is a common cause of death. It is critical in the field of medicine to be able to diagnose cardiac disease in order to adequately prevent and treat it. Accurate prediction has the potential to both extend the patient's life and reduce the severity of their cardiac disease. Machine learning is one approach that can be used to generate predictions. In this study, patient medical record information was used with a logistic regression algorithm to diagnose heart disease. To obtain the model coefficients needed for the equation, the experiment fits the logistic regression iteratively. Iteration 14 produced the best results, with an accuracy of 81.3495% and an average calculation time of 0.020 seconds. The area under the ROC curve (AUC) is 89.36%. The findings of this study have significant implications for the field of heart disease prediction and can contribute to improved patient care and outcomes. Accurate predictions obtained through logistic regression can guide healthcare professionals in identifying individuals at risk and implementing preventive measures or tailored treatment plans. The computational efficiency of the model further enhances its applicability in real-time decision support systems.
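The area under the ROC curve reported above can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the function name and the sample labels and scores are hypothetical, and the paper's own evaluation tooling is not specified in the abstract.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0 (negative) or 1 (positive)
    scores: predicted probabilities or any monotone risk score
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four patients.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples; rank-based implementations give the same value more efficiently on large datasets.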
Inter-Frame Video Compression based on Adaptive Fuzzy Inference System Compression of Multiple Frame Characteristics
Arief Bramanto Wicaksono Putra; Rheo Malani; Bedi Suprapty; Achmad Fanany Onnilita Gaffar; Roman Voliansky
Knowledge Engineering and Data Science Vol 6, No 1 (2023)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v6i12023p1-14

Abstract

Video compression is used for storage or bandwidth efficiency in clip video information. Video compression involves encoders and decoders and uses intra-frame, inter-frame, and block-based methods. Inter-frame compression compresses nearby frame pairs into one compressed frame. This study defines neighboring frame pairs as odd-even pairs. Motion estimation, compensation, and frame difference underpin video compression methods. In this study, an adaptive FIS (Fuzzy Inference System) compresses and decompresses each odd-even frame pair. First, the adaptive FIS is trained on all feature pairings of each odd-even frame pair. Video compression-decompression then uses the trained adaptive FIS as a codec. The features utilized are "mean", "std (standard deviation)", "mad (mean absolute deviation)", and "mean (std)". This study uses the average DCT (Discrete Cosine Transform) components of all video frames as a quality parameter. The adaptive FIS training feature and the number of odd-even frame pairings affect compression ratio variation. The proposed approach achieves CR=25.39% and P=80.13%. "Mean" performs best overall (P=87.15%). "Mean (mad)" has the best compression ratio (CR=24.68%) for storage efficiency. The "std" feature compresses the video without decompression since it has the lowest quality change (Q_dct=10.39%).
The Effect of the Number of Hidden Layers on The Performance of Deep Q-Network for Traveling Salesman Problem
Hanif, Benzfica; Larasati, Aisyah; Nurdiansyah, Rudi; Le, Trung
Knowledge Engineering and Data Science Vol 6, No 2 (2023)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v6i22023p188-198

Abstract

The Traveling Salesman Problem (TSP) effectively represents the complex distribution issues encountered by couriers, who must carefully plan a route that includes all customer addresses while minimizing the distance traveled. As the volume of deliveries and the range of destinations expand, the courier's task becomes progressively more challenging. In this context, the objective of our research is to expand the existing knowledge and explore the capabilities of Deep Q-Network (DQN) models in order to achieve the most efficient route determination. This endeavor can potentially bring about significant changes in the courier and delivery service sector. Our methodology relies on an empirical inquiry using a dataset of 178 observations obtained from motorcycle-based package delivery agents. The research uses a factorial experimental design that incorporates three factors: the number of hidden layers, episodes, and epochs. The hidden layer parameter is set to a single level, while the episode parameter is explored at five levels and the epoch parameter at four levels. The performance of the DQN models is evaluated using the MSE metric at every iterative cycle. The central focus of our research is the connection between episodes and epochs and their influence on MSE. The findings reveal that the association between episodes, epochs, and errors is not statistically significant, although different levels of episodes and epochs produce slightly different levels of error.
Exploring the Impact of Students Demographic Attributes on Performance Prediction through Binary Classification in the KDP Model
Issah Iddrisu; Peter Appiahene; Obed Appiah; Inusah Fuseini
Knowledge Engineering and Data Science Vol 6, No 1 (2023)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v6i12023p24-40

Abstract

This research used binary classification and the Knowledge Discovery Process (KDP). Five different classifiers were run in the RapidMiner 9.10.010 instructional environment, which provided the experimental and analytical capabilities. The analysis included 2334 entries, 17 attributes, and one class variable containing the students' average score for the semester. Twenty experiments were carried out, using 10-fold cross-validation and ratio split validation together with bootstrap sampling. The classifiers compared were Random Forest (RF), Rule Induction (RI), Naive Bayes (NB), Logistic Regression (LR), and Deep Learning (DL). RF outperformed the other four methods in all six selection measures, with an accuracy of 93.96%. According to the RF classifier model, the level of education that a child's parents have is a major factor in that child's academic performance before entering higher education.
Long-Term Traffic Prediction Based on Stacked GCN Model
Atkia Akila Karim; Naushin Nower
Knowledge Engineering and Data Science Vol 6, No 1 (2023)
Publisher : Universitas Negeri Malang

DOI: 10.17977/um018v6i12023p92-102

Abstract

With the recent surge in road traffic within major cities, the need for both short and long-term traffic flow forecasting has become paramount for city authorities. Previous research efforts have predominantly focused on short-term traffic flow estimations for specific road segments and paths. However, important applications such as traffic management and schedule routing planning demand a deep understanding of long-term traffic flow predictions. Due to the intricate interplay of underlying factors, studies dedicated to long-term traffic prediction are scarce. Previous research has also highlighted the challenge of lower accuracy in long-term predictions owing to error propagation within the model. Our model combines the Graph Convolutional Network (GCN) capacity to extract spatial characteristics from the road network with the stacked GCN aptitude for capturing temporal context. The model is then employed for traffic flow forecasting within urban road networks. We compare our method against baseline techniques using two real-world datasets. Our approach reduces prediction errors by 40% to 60% compared to other methods. The experimental results underscore our model's ability to uncover spatiotemporal dependencies within traffic data and its superior predictive performance over baseline models.
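For readers unfamiliar with graph convolution, the sketch below shows one common propagation rule: a symmetrically normalized adjacency matrix with self-loops, followed by a linear map and ReLU. It assumes this standard formulation rather than the architecture in the paper, which the abstract does not specify; the road graph, features, and weights are hypothetical.

```python
import math


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 . H . W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    normalised = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    out = matmul(matmul(normalised, features), weights)
    return [[max(0.0, value) for value in row] for row in out]


# Hypothetical 3-segment road graph: segment 1 connects segments 0 and 2.
roads = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
flow = [[10.0], [30.0], [20.0]]  # one traffic-flow feature per segment
w1 = [[1.0]]                     # illustrative weights, not trained values

hidden = gcn_layer(roads, flow, w1)
stacked = gcn_layer(roads, hidden, w1)  # stacking widens each node's neighbourhood
print(hidden, stacked)
```

Each additional layer lets a node aggregate information from one hop further away in the road graph.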
