Contact Name
Teuku Rizky Noviandy
Contact Email
trizkynoviandy@gmail.com
Phone
+6282275731976
Journal Mail Official
editorial-office@heca-analitika.com
Editorial Address
Jl. Makam T. Nyak Arief Kompleks BUPERTA Blok L7B, Lamgapang, Aceh Besar, Provinsi Aceh
Location
Kab. Aceh Besar,
Aceh
INDONESIA
Infolitika Journal of Data Science
ISSN: - | EISSN: 3025-8618 | DOI: https://doi.org/10.60084/ijds
Infolitika Journal of Data Science is an international scientific journal that publishes high-caliber original research articles and comprehensive review papers in the field of data science. The journal's core mission is to stimulate interdisciplinary research collaboration, facilitate the exchange of knowledge, and drive the advancement and application of innovative strategies within the data science domain. Topics include, but are not limited to: Data Mining and Analysis, Machine Learning and Artificial Intelligence, Big Data and Data Engineering, Predictive Modeling and Forecasting, Natural Language Processing, Computer Vision, Data Visualization and Interpretation, Ethics and Privacy in Data Science, Applications of Data Science, and Interdisciplinary Approaches.
Articles 25 Documents
Developing a Regional Framework for Disaster Risk Reduction Based on Disaster-Related Data from Aceh, Indonesia
Yolanda, Yolanda; Oktari, Rina Suryani; Munawar, Munawar; Lola, Muhamad Safiih; Sofyan, Hizir
Infolitika Journal of Data Science Vol. 3 No. 1 (2025): May 2025
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v3i1.269

Abstract

Aceh Province is highly vulnerable to various hazards, necessitating effective disaster risk reduction strategies. This study aims to develop an instrument to evaluate disaster risk reduction efforts in Aceh Province and to assess progress toward global disaster resilience targets. The data includes secondary disaster-related records from 2005 to 2024 and primary data from the instrument validation process, demonstrating excellent validity results based on the Content Validity Ratio (CVR) and Content Validity Index (CVI). The findings highlight significant improvements in key areas, including reductions in disaster mortality, affected populations, economic losses, damage to critical infrastructure, and strengthened early warning systems. However, challenges persist in implementing local disaster risk reduction strategies and enhancing international cooperation. This study offers practical insights for policymakers and contributes to strengthening disaster resilience and advancing disaster risk management research in sub-national contexts.
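The abstract above reports validity through the Content Validity Ratio (CVR) and Content Validity Index (CVI) without giving formulas. A minimal sketch of the commonly used definitions follows, assuming Lawshe's CVR and an item-level CVI on a 4-point relevance scale; the function names, the threshold of 3, and the sample ratings are illustrative assumptions, not taken from the article.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to 1."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(ratings, relevant_threshold=3):
    """Item-level CVI: share of experts rating the item as relevant.

    Assumes a 4-point scale where ratings >= 3 count as relevant.
    """
    return sum(r >= relevant_threshold for r in ratings) / len(ratings)


def scale_cvi_average(all_ratings):
    """Scale-level CVI (average method): mean of the item-level CVIs."""
    return sum(item_cvi(r) for r in all_ratings) / len(all_ratings)


# Hypothetical ratings from five experts for two instrument items.
example_ratings = [
    [4, 4, 3, 2, 4],  # item 1: four of five experts rate it relevant
    [4, 3, 4, 4, 3],  # item 2: all five experts rate it relevant
]
```

Under these definitions, an item that 9 of 10 experts judge essential has a CVR of 0.8, and an item judged essential by exactly half of the panel has a CVR of 0.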
Similarity-Based Network in the Industrial Community of Joyo City
Takeuchi, Keita; Iwasaki, Masashi; Shinjo, Masato
Infolitika Journal of Data Science Vol. 3 No. 1 (2025): May 2025
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v3i1.267

Abstract

Data utilization is becoming increasingly widespread in a variety of fields around the world and has become especially important in industry. Data utilization techniques and approaches can contribute to the development not only of individual companies but also of groups of companies. In this paper, we consider the industrial structure of Joyo City, Japan, by analyzing data collected through interviews with company presidents and managers. The main purpose of this paper is to understand that structure in terms of similarity across industrial categories. We first express the features of each company as a vector with entries determined from the interview data. We then compute vector similarities in order to draw a graphical network, in which nodes corresponding to similar companies are linked by an edge. From the resulting network, we derive, for each company, the most similar companies in the same and in different industrial categories. Finally, we classify Joyo City's companies into new groups that cut across the standard categories.
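The vector-similarity network described in the abstract above can be sketched as follows. The abstract does not name the similarity measure or the linking rule, so cosine similarity and a fixed threshold are assumptions here, as are the company names and feature vectors.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def build_similarity_edges(features, threshold):
    """Link every pair of companies whose similarity meets the threshold."""
    edges = []
    for a, b in combinations(sorted(features), 2):
        score = cosine_similarity(features[a], features[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


def most_similar(features, name):
    """Return the other company with the highest similarity to `name`."""
    scored = [
        (cosine_similarity(features[name], vec), other)
        for other, vec in features.items()
        if other != name
    ]
    return max(scored)[1]


# Hypothetical interview-derived feature vectors (not from the article).
features = {
    "company_a": [1, 0, 1, 1],
    "company_b": [1, 0, 1, 0],
    "company_c": [0, 1, 0, 0],
}
edges = build_similarity_edges(features, threshold=0.5)
```

With these made-up vectors, only company_a and company_b are linked, since company_c shares no nonzero feature with either of them.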
Optimizing Energy Consumption Prediction Across the IMT-GT Region Through PCA-Based Modeling
Farid, Muhammad; Nuzullah, Teuku Muhammad Faiz; Aklya, Zatul; Nazila, Syifa; Ulhaq, Muhammad Zia; Apriliansyah, Feby; Sasmita, Novi Reandy
Infolitika Journal of Data Science Vol. 3 No. 1 (2025): May 2025
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v3i1.286

Abstract

This study aims to improve the accuracy of energy consumption prediction in the Indonesia-Malaysia-Thailand Growth Triangle (IMT-GT) region by addressing multicollinearity among independent variables such as energy production (Mtoe), lignite coal production (million tons), crude oil production (million tons), refined oil production (million tons), natural gas production (billion cubic meters), and electricity production (terawatt-hours). By integrating Principal Component Analysis (PCA) with Random Forest (RF), six correlated variables were reduced into two uncorrelated principal components (PC1 and PC2), explaining 80.77% of the data variance. The PCA-RF hybrid model outperformed the standalone Random Forest (RF) model, with an increase in the coefficient of determination (R²) from 0.976 to 0.993. Additionally, it achieved significant reductions in error metrics, with the mean absolute error (MAE) decreasing from 5.811 to 4.169 and the root mean square error (RMSE) dropping from 9.278 to 4.786. These results demonstrate PCA’s effectiveness in isolating dominant drivers such as energy and lignite coal production while improving model stability. The framework provides policymakers with a reliable tool to forecast energy demand and align economic growth with sustainability in fossil fuel-dependent economies.
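The abstract above compares the two models by coefficient of determination, MAE, and RMSE. The sketch below shows the standard definitions of these three metrics in plain Python; it is not the authors' code, and the sample values are invented purely for illustration.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared prediction error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted values (not from the article).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE does, which is why the two metrics can move by different amounts when a model changes.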
Explainable Deep Learning with Lightweight CNNs for Tuberculosis Classification
Noviandy, Teuku Rizky; Idroes, Ghazi Mauer; Zulfikar, Teuku; Idroes, Rinaldi
Infolitika Journal of Data Science Vol. 3 No. 1 (2025): May 2025
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v3i1.305

Abstract

Tuberculosis (TB) remains a major global health threat, particularly in low-resource settings where timely diagnosis is critical yet often limited by the lack of radiological expertise. Chest X-rays (CXRs) are widely used for TB screening, but manual interpretation is prone to errors and variability. While deep learning has shown promise in automating CXR analysis, most existing models are computationally intensive and lack interpretability, limiting their deployment in real-world clinical environments. To address this gap, we evaluated three lightweight and explainable CNN architectures, ShuffleNetV2, SqueezeNet 1.1, and MobileNetV3, for binary TB classification using a locally sourced dataset of 3,008 CXR images. Using transfer learning and Grad-CAM for visual explanation, we show that MobileNetV3 and ShuffleNetV2 achieved perfect test performance with 100% accuracy, sensitivity, specificity, precision, and F1-score, along with AUC scores of 1.00 and inference times of 94.66 and 103.63 seconds, respectively. SqueezeNet performed moderately, with a lower F1-score of 82.98% and several misclassifications. These results demonstrate that lightweight CNNs can deliver high diagnostic accuracy and transparency, supporting their use in scalable, AI-assisted TB screening systems for underserved healthcare settings.
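The abstract above reports accuracy, sensitivity, specificity, precision, and F1-score for a binary classifier. As a minimal sketch of how these relate, the function below derives all five from confusion-matrix counts; the function name and the counts used in the example are hypothetical and are not the article's results.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from raw counts."""

    def safe_div(num, den):
        # Return 0.0 when a denominator is empty instead of raising.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)  # recall on the positive (TB) class
    specificity = safe_div(tn, tn + fp)  # recall on the negative class
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical counts for a 100-image test set (not from the article).
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

All five metrics equal 1.0 only when there are no false positives and no false negatives, which is what a report of 100% on every metric implies for the test set.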
Inductive Biases in Feature Reduction for QSAR: SHAP vs. Autoencoders
Noviandy, Teuku Rizky; Idroes, Ghifari Maulana; Lala, Andi; Helwani, Zuchra; Idroes, Rinaldi
Infolitika Journal of Data Science Vol. 3 No. 1 (2025): May 2025
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v3i1.306

Abstract

Machine learning models in drug discovery often depend on high-dimensional molecular descriptors, many of which may be redundant or irrelevant. Reducing these descriptors is essential for improving model performance, interpretability, and computational efficiency. This study compares two widely used reduction strategies: SHAP-based feature selection and autoencoder-based compression, within the context of Quantitative Structure-Activity Relationship (QSAR) classification. LightGBM is used as a consistent modeling framework to evaluate models trained on all descriptors, the top 50 and 100 SHAP-ranked descriptors, and a 64-dimensional autoencoder embedding. The results show that SHAP-based selection produces interpretable and stable models with minimal performance loss, particularly when using the top 100 descriptors. In contrast, the autoencoder achieves the highest test performance by capturing nonlinear patterns in a compact, low-dimensional representation, although this comes at the cost of interpretability and consistency across data splits. These findings reflect the differing inductive biases of each method. SHAP prioritizes sparsity and attribution, while autoencoders focus on reconstruction and continuity. The analysis emphasizes that descriptor reduction strategies are not interchangeable. SHAP-based selection is suitable for applications where interpretability and reliability are essential, such as in hypothesis-driven or regulatory settings. Autoencoders are more appropriate for performance-driven tasks, including virtual screening. The choice of reduction strategy should be guided not only by performance metrics but also by the specific modeling requirements and assumptions relevant to cheminformatics workflows.
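The abstract above mentions keeping the top 50 or 100 SHAP-ranked descriptors. One common way to rank features from per-sample attributions is by mean absolute attribution, sketched below in plain Python. This does not call the SHAP library; the attribution matrix, descriptor names, and function name are hypothetical stand-ins for values that an explainer would normally produce, and the abstract does not confirm that this exact ranking rule was used.

```python
def top_k_by_mean_abs_attribution(attributions, feature_names, k):
    """Rank features by mean |attribution| across samples and keep the top k.

    `attributions` is a list of rows, one per sample, with one attribution
    value per feature (the shape a per-sample explainer typically returns).
    """
    n_samples = len(attributions)
    scores = {
        name: sum(abs(row[j]) for row in attributions) / n_samples
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


# Hypothetical descriptors and attribution values (not from the article).
descriptor_names = ["logp", "mol_weight", "tpsa"]
example_attributions = [
    [0.5, -0.1, 0.0],
    [-0.4, 0.2, 0.05],
    [0.6, -0.3, 0.0],
]
selected = top_k_by_mean_abs_attribution(
    example_attributions, descriptor_names, k=2
)
```

Taking the absolute value before averaging matters here: a descriptor whose attributions are large but alternate in sign would otherwise average out to near zero and be dropped despite its influence.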
