Contact Name
Teuku Rizky Noviandy
Contact Email
trizkynoviandy@gmail.com
Phone
+6282275731976
Journal Mail Official
editorial-office@heca-analitika.com
Editorial Address
Jl. Makam T. Nyak Arief Kompleks BUPERTA Blok L7B, Lamgapang, Aceh Besar, Provinsi Aceh
Location
Kab. Aceh Besar,
Aceh
INDONESIA
Infolitika Journal of Data Science
ISSN : -     EISSN : 30258618     DOI : https://doi.org/10.60084/ijds
Infolitika Journal of Data Science is an international scientific journal that publishes high-caliber original research articles and comprehensive review papers in the field of data science. The journal's core mission is to stimulate interdisciplinary research collaboration, facilitate the exchange of knowledge, and drive the advancement and application of innovative strategies within the data science domain. Topics of this journal include, but are not limited to, Data Mining and Analysis, Machine Learning and Artificial Intelligence, Big Data and Data Engineering, Predictive Modeling and Forecasting, Natural Language Processing, Computer Vision, Data Visualization and Interpretation, Ethics and Privacy in Data Science, Applications of Data Science, and Interdisciplinary Approaches.
Articles 5 Documents
Forecasting Upwelling Phenomena in Lake Laut Tawar: A Semi-Supervised Learning Approach
Ulhaq, Muhammad Zia; Farid, Muhammad; Aziza, Zahra Ifma; Nuzullah, Teuku Muhammad Faiz; Syakir, Fakhrus; Sasmita, Novi Reandy
Infolitika Journal of Data Science Vol. 2 No. 2 (2024): November 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i2.211

Abstract

Climate change is causing the upwelling phenomenon to occur frequently in lakes and reservoirs. When it occurs, thousands of fish die, and floating net cage fish farmers suffer losses. Existing studies use temperature sensors to determine whether a body of water is currently experiencing upwelling. This study instead applies clustering to historical climate data from 2017-2023 using a semi-supervised learning approach that produces two labels: "potential for upwelling" and "no potential for upwelling." In the clustering process, the data is divided into two clusters using K-Means Clustering, and a Support Vector Machine (SVM) is chosen to classify them. The proposed approach achieves accuracy, precision, recall, and F1-score values of 0.99, 0.995, 0.970, and 0.985, respectively. The analysis results show that this model has excellent performance in identifying upwelling potential. By using this method, information about upwelling potential can be obtained more quickly and accurately, allowing fish farmers to take appropriate preventive measures. This study also shows that the combination of K-Means Clustering and SVM can be effectively used to analyze historical climate data and generate useful predictions.
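The two-stage idea in this abstract (cluster first, then train a classifier on the cluster labels) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the feature values are made up, the linear SVM trained by sub-gradient descent is an assumed stand-in (the abstract does not state the kernel or solver), and which cluster corresponds to "potential for upwelling" would have to be decided with domain knowledge.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_two_clusters(points, iters=20):
    """Plain Lloyd's algorithm with k=2; returns a 0/1 cluster id per point."""
    # Deterministic start: the points with the smallest and largest coordinate sum.
    centroids = [min(points, key=sum), max(points, key=sum)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min((0, 1), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def train_linear_svm(points, labels, epochs=200, lr=0.01, reg=0.01):
    """Sub-gradient descent on the regularised hinge loss; labels are 0/1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, lab in zip(points, labels):
            y = 1 if lab == 1 else -1
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # point violates the margin: hinge term is active
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:           # only the regulariser contributes
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return the cluster id (0 or 1) predicted for a new observation."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0

# Hypothetical, already-standardised climate features (illustration only).
low = [(-1.0, -0.8), (-1.2, -1.0), (-0.9, -1.1), (-1.1, -0.9)]
high = [(1.0, 0.9), (1.2, 1.1), (0.8, 1.0), (1.1, 1.2)]
points = low + high

pseudo_labels = kmeans_two_clusters(points)     # step 1: clustering supplies the labels
w, b = train_linear_svm(points, pseudo_labels)  # step 2: the SVM learns to reproduce them
```

Once trained, `predict(w, b, new_point)` assigns an unseen observation to one of the two clusters without re-running the clustering step.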
Artificial Neural Network–Particle Swarm Optimization Approach for Predictive Modeling of Kovats Retention Index in Essential Oils
Kurniadinur, Kurniadinur; Noviandy, Teuku Rizky; Idroes, Ghazi Mauer; Ahmad, Noor Atinah; Irvanizam, Irvanizam; Subianto, Muhammad; Idroes, Rinaldi
Infolitika Journal of Data Science Vol. 2 No. 2 (2024): November 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i2.220

Abstract

The Kovats retention index is a critical parameter in gas chromatography used for the identification of volatile compounds in essential oils. Traditional methods for determining the Kovats retention index are often labor-intensive, time-consuming, and prone to inaccuracies due to variations in experimental conditions. This study presents a novel approach combining Artificial Neural Networks (ANN) with Particle Swarm Optimization (PSO) to predict the Kovats retention index of essential oil compounds more accurately and efficiently. The ANN-PSO hybrid model leverages the strengths of both techniques: the ANN's capacity to model complex nonlinear relationships and PSO's capability to optimize hyperparameters by searching for the global optimum. The model was trained using a dataset of 340 essential oil compounds with molecular descriptors, and its performance was evaluated using Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE). Results indicate that a simpler ANN configuration with one hidden neuron achieved the lowest RMSE (80.16) and MAPE (5.65%), suggesting that the relationship between the molecular descriptors and the Kovats retention index is not overly complex. This study demonstrates that the ANN-PSO model can serve as an effective tool for predictive modeling of the Kovats retention index, reducing the need for experimental procedures and improving analytical efficiency in essential oil research.
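For readers unfamiliar with the two error measures reported in this abstract, the sketch below shows how RMSE and MAPE are conventionally computed. The retention-index values are made up for illustration and are not taken from the study.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical retention indices (illustration only).
observed = [1000.0, 1200.0, 1500.0]
estimated = [950.0, 1260.0, 1500.0]

rmse_value = rmse(observed, estimated)  # sqrt((50^2 + 60^2 + 0^2) / 3)
mape_value = mape(observed, estimated)  # (5% + 5% + 0%) / 3
```

RMSE is expressed in the same units as the retention index and penalises large errors more heavily, while MAPE is scale-free, which is why the two are often reported together.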
Performance Assessment of Machine Learning and Transformer Models for Indonesian Multi-Label Hate Speech Detection
Bagestra, Ricky; Misbullah, Alim; Zulfan, Zulfan; Rasudin, Rasudin; Farsiah, Laina; Nazhifah, Sri Azizah
Infolitika Journal of Data Science Vol. 2 No. 2 (2024): November 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i2.235

Abstract

Hate speech, characterized by language that incites discrimination, hostility, or violence against individuals or groups based on attributes such as race, religion, or gender, has become a critical issue on social media platforms. In Indonesia, unique linguistic complexities, such as slang, informal expressions, and code-switching, complicate its detection. This study evaluates the performance of Support Vector Machine (SVM), Naive Bayes, and IndoBERT models for multi-label hate speech detection on a dataset of 13,169 annotated Indonesian tweets. The results show that IndoBERT outperforms SVM and Naive Bayes across all metrics, achieving an accuracy of 93%, F1-score of 91%, precision of 91%, and recall of 91%. IndoBERT's contextual embeddings effectively capture nuanced relationships and complex linguistic patterns, offering superior performance in comparison to traditional methods. The study addresses dataset imbalance using BERT-based data augmentation, leading to significant metric improvements, particularly for SVM and Naive Bayes. Preprocessing steps proved essential in standardizing the dataset for effective model training. This research underscores IndoBERT's potential for advancing hate speech detection in non-English, low-resource languages. The findings contribute to the development of scalable, language-specific solutions for managing harmful online content, promoting safer and more inclusive digital environments.
Fine-Tuning Topic Modelling: A Coherence-Focused Analysis of Correlated Topic Models
Syahrial, Syahrial; Afidh, Razief Perucha Fauzie
Infolitika Journal of Data Science Vol. 2 No. 2 (2024): November 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i2.236

Abstract

The Correlated Topic Model (CTM) is a widely used approach for topic modelling that accounts for correlations among topics. This study investigates the effects of hyperparameter tuning on the model's ability to extract meaningful themes from a corpus of unstructured text. Key hyperparameters examined include learning rates (0.1, 0.01, 0.001), the number of topics (3, 5, 7, 10), and the number of top words (10, 20, 30, 40, 50, 80, 100). The Adam optimizer was used for model training, and performance was evaluated using the coherence score (c_v), a metric that assesses the interpretability and coherence of the generated topics. The dataset comprised 100 articles, and results were visualized using line plots and heatmaps to highlight performance trends. The highest coherence score of 0.803 was achieved with three topics and 10 top words. The findings demonstrate that fine-tuning hyperparameters significantly improves the model's ability to generate coherent and interpretable topics, resulting in more accurate and insightful outcomes. This research underscores the importance of parameter optimization in enhancing the effectiveness of CTM for topic modelling applications.
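The hyperparameter values listed in this abstract define a small search grid. The sketch below enumerates it and selects the best configuration for a caller-supplied scoring function. Whether the study evaluated every combination is an assumption, and `pick_best` and its `score_fn` argument are hypothetical names; no coherence scores are computed here.

```python
from itertools import product

# Values quoted in the abstract.
learning_rates = [0.1, 0.01, 0.001]
topic_counts = [3, 5, 7, 10]
top_word_counts = [10, 20, 30, 40, 50, 80, 100]

# Full Cartesian product: 3 * 4 * 7 = 84 candidate configurations.
grid = list(product(learning_rates, topic_counts, top_word_counts))

def pick_best(configs, score_fn):
    """Return (config, score) with the highest score.

    score_fn(learning_rate, n_topics, n_top_words) is supplied by the caller,
    e.g. a function that trains a model and returns its coherence score.
    """
    scored = ((cfg, score_fn(*cfg)) for cfg in configs)
    return max(scored, key=lambda pair: pair[1])
```

In practice `score_fn` would wrap model training and coherence evaluation, which is by far the most expensive part of the loop.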
Advanced Anemia Classification Using Comprehensive Hematological Profiles and Explainable Machine Learning Approaches
Noviandy, Teuku Rizky; Idroes, Ghifari Maulana; Suhendra, Rivansyah; Bakri, Tedy Kurniawan; Idroes, Rinaldi
Infolitika Journal of Data Science Vol. 2 No. 2 (2024): November 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i2.237

Abstract

Anemia is a common health issue with serious clinical effects, making timely and accurate diagnosis essential to prevent complications. This study explores the use of machine learning (ML) methods to classify anemia and its subtypes using detailed hematological data. Six ML models were tested: Gradient Boosting, Random Forest, Naive Bayes, Logistic Regression, Support Vector Machine, and K-Nearest Neighbors. The dataset was preprocessed using feature standardization and the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance. Gradient Boosting delivered the highest accuracy, sensitivity, and F1-score, establishing itself as the top-performing model. SHapley Additive exPlanations (SHAP) analysis was applied to enhance model interpretability, identifying key predictive features. This study highlights the potential of explainable ML to develop efficient, accurate, and scalable tools for anemia diagnosis, fostering improved healthcare outcomes globally.
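The two preprocessing steps named in this abstract can be sketched in plain Python. This is a simplified illustration of the ideas behind feature standardization and SMOTE (new minority samples are interpolated between a minority sample and one of its nearest minority neighbours), not the study's pipeline or a library implementation. The feature columns and values are hypothetical, as are the function names and the `k` and `seed` parameters.

```python
import random

def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating toward a near minority neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority rows, by squared Euclidean distance.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row, base)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, target)))
    return synthetic

# Hypothetical minority-class rows: two made-up hematological features.
minority = [(9.5, 70.0), (10.0, 72.0), (9.0, 68.0)]
new_rows = smote_like(minority, n_new=3)
```

Because each synthetic row lies on a line segment between two real minority rows, oversampling this way adds variety without leaving the region the minority class already occupies.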
