Journal : Journal of Applied Data Sciences

Implementation of PageRank Algorithm for Visualization and Weighting of Keyword Networks in Scientific Papers
Lubis, Adyanata; Prasiwiningrum, Elyandri
Journal of Applied Data Sciences Vol 4, No 4: DECEMBER 2023
Publisher : Bright Publisher

DOI: 10.47738/jads.v4i4.138

Abstract

Papers are written works that present ideas about a particular problem or topic systematically, accompanied by logical analysis. Scientific papers on many topics are readily available on the internet and in libraries, and every paper contains citations or references that are easy to obtain; displaying all of those citations as a visualization, however, is not easy. A citation network can be visualized as a graph, with nodes representing research papers and edges representing citation relationships between papers. This research uses the PageRank algorithm to build a network of keywords that are strongly related and potentially relevant within a data library. This research is the first to use the PageRank algorithm and test its accuracy by comparing it with KNN and linear clustering. In the visualization, the size of each node reflects the number of citations of the corresponding paper. All processes in the system ran according to design, and the visualization and weighting of the scientific paper citation network function as designed. On a set of 51 articles, the algorithm produced a visual user interest of 81.60%, compared with data-suitability accuracies of 71.22% and 61.34% for the linear clustering and KNN algorithms respectively, which helps facilitate searching large collections of scientific papers.
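
As a rough illustration of how PageRank can weight nodes in a keyword network, the following is a minimal sketch using power iteration on a small undirected graph. The keyword names, the edge list, and the damping factor of 0.85 are assumptions chosen for the example; they are not taken from the paper, and the paper's actual implementation may differ.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on an undirected graph given as (a, b) pairs."""
    # Build adjacency sets; every undirected edge links both endpoints.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    nodes = sorted(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution

    for _ in range(max_iter):
        new_rank = {}
        for node in nodes:
            # Each neighbor passes on an equal share of its current rank.
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[node])
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        delta = sum(abs(new_rank[x] - rank[x]) for x in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical keyword co-occurrence edges, for illustration only.
keyword_edges = [
    ("pagerank", "graph"),
    ("pagerank", "citation"),
    ("pagerank", "keyword"),
    ("graph", "citation"),
]

ranks = pagerank(keyword_edges)
for keyword, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{keyword}: {score:.4f}")
```

In a visualization, scores like these could be mapped to node size so that more central keywords appear larger.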

Leveraging K-Nearest Neighbors with SMOTE and Boosting Techniques for Data Imbalance and Accuracy Improvement
Lubis, Adyanata; Irawan, Yuda; Junadhi, Junadhi; Defit, Sarjon
Journal of Applied Data Sciences Vol 5, No 4: DECEMBER 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i4.343

Abstract

This research addresses the issue of low accuracy in sentiment analysis of social media posts about Israeli products, where the K-NN algorithm initially achieved only 64%. Given the ongoing Israeli-Palestinian conflict, which has garnered widespread international attention and strong opinions, understanding public sentiment towards Israeli products is crucial. To improve accuracy, the study employs SMOTE to handle data imbalance and combines K-NN with the boosting algorithms AdaBoost and XGBoost, which were selected for their effectiveness in improving model performance on imbalanced and complex datasets. AdaBoost was chosen for its ability to enhance model accuracy by focusing on misclassified instances, while XGBoost was selected for its efficiency and robustness in handling large datasets with multiple features. The research process includes data pre-processing (cleaning, normalization, tokenization, stopword removal, and stemming), labeling using a lexicon-based approach, and feature extraction with CountVectorizer and TF-IDF. SMOTE was applied to oversample the minority class to match the number of instances in the majority class, ensuring balanced representation before model training. The dataset of 1,145 records was split into training and testing data at a ratio of 70:30. Results demonstrate that SMOTE increased K-NN accuracy to 77%. Interestingly, combining K-NN with AdaBoost after SMOTE achieved 72% accuracy, which, although lower than the 77% achieved with SMOTE alone, was higher than the 68% accuracy without SMOTE. This discrepancy can be attributed to the added complexity introduced by AdaBoost, which may not synergize as effectively with SMOTE as XGBoost does, particularly in this dataset's context. In contrast, K-NN with XGBoost after SMOTE reached the highest accuracy of 88%, demonstrating a more effective combination. Boosting without SMOTE resulted in lower accuracies: 68% for K-NN with AdaBoost and 64% for K-NN with XGBoost. The combination of K-NN with SMOTE and XGBoost significantly improves model accuracy and reliability for sentiment analysis on social media.
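
The oversampling step described above can be sketched as follows. This is a simplified, standard-library-only illustration of the SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours) followed by a plain K-NN majority vote. The two-dimensional toy points, the labels, and the helper names `smote` and `knn_predict` are assumptions made for the example; the study itself works on CountVectorizer and TF-IDF text features and its exact implementation is not given in the abstract.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a random
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by distance to the chosen base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbour)))
    return synthetic


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: six majority-class points and three minority-class points.
majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.4, 0.2), (0.3, 0.5), (0.5, 0.4)]
minority = [(3.0, 3.0), (3.2, 2.8), (2.9, 3.3)]

# Oversample the minority class until it matches the majority class in size.
synthetic = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic

train_points = majority + balanced_minority
train_labels = ["negative"] * len(majority) + ["positive"] * len(balanced_minority)

prediction = knn_predict(train_points, train_labels, (3.1, 3.0))
print(len(majority), len(balanced_minority), prediction)
```

In practice, oversampling is usually applied only to the training portion of a split such as 70:30, so that synthetic points do not leak into the test data.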