Journal : Infolitika Journal of Data Science

ANFIS-Based QSRR Modelling for Kovats Retention Index Prediction in Gas Chromatography
Idroes, Rinaldi; Noviandy, Teuku Rizky; Maulana, Aga; Suhendra, Rivansyah; Sasmita, Novi Reandy; Muslem, Muslem; Idroes, Ghazi Mauer; Jannah, Raudhatul; Afidh, Razief Perucha Fauzie; Irvanizam, Irvanizam
Infolitika Journal of Data Science Vol. 1 No. 1 (2023): September 2023
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v1i1.73

Abstract

This study evaluates the implementation and effectiveness of an Adaptive Neuro-Fuzzy Inference System (ANFIS)-based Quantitative Structure-Retention Relationship (QSRR) model for predicting the Kovats retention index of compounds in gas chromatography. The model was trained using 340 essential oil compounds and their molecular descriptors. Evaluation of the ANFIS models yielded an R² of 0.974, an RMSE of 48.12, and a MAPE of 3.3% on the testing set. These findings indicate that the ANFIS approach predicts the Kovats retention index accurately in gas chromatography. The study offers insight into the efficiency of ANFIS-based QSRR methods for retention index prediction and their potential practicality in compound analysis and chromatographic optimization.
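The three test-set metrics quoted in this abstract can be computed as in the minimal sketch below. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study; the formulas are the standard definitions of R², RMSE, and MAPE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAPE in percent) for paired observations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE assumes no observed value is zero
    mape = 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return r2, rmse, mape


# Made-up retention indices, for illustration only (not data from the study)
observed = [1000.0, 1200.0, 1500.0, 1800.0]
predicted = [1010.0, 1180.0, 1530.0, 1790.0]
r2, rmse, mape = regression_metrics(observed, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.2f} MAPE={mape:.2f}%")
```

RMSE is expressed in the same units as the retention index, while MAPE is relative, which is why the abstract can report a two-digit RMSE alongside a small percentage error.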
Maternal and Child Healthcare Services in Aceh Province, Indonesia: A Correlation and Clustering Analysis in Statistics
Sasmita, Novi Reandy; Ramadeska, Siti; Utami, Reksi; Adha, Zuhra; Putri, Ulayya; Syarafina, Risky Haezah; Reskiaddin, La Ode; Kamal, Saiful; Yarmaliza, Yarmaliza; Muliadi, Muliadi; Saputra, Arif
Infolitika Journal of Data Science Vol. 1 No. 1 (2023): September 2023
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v1i1.88

Abstract

Infant mortality remains a public health problem in Aceh Province, Indonesia. Health services during pregnancy are an essential factor in reducing infant mortality, yet studies examining how maternal and child health services relate to infant mortality in Aceh Province are still scarce. Therefore, this study examines the correlation between live births and maternal and child health service variables, namely Blood-Supplementing Tablets (TTD), Coverage of the First Visit of Pregnant Women (K1), Coverage of the Fourth Visit of Pregnant Women (K4), and Management of Obstetric Complications, and maps the maternal and child health services obtained during pregnancy. A cross-sectional design was used. Descriptive statistics covered measures of central tendency and dispersion. Inferential analysis used the Shapiro-Wilk test, the Spearman test, and fuzzy c-means clustering. The Shapiro-Wilk test indicated that the live birth rate and all Maternal and Child Healthcare Services variables were not normally distributed (p-value < 0.05). The Spearman test showed that all Maternal and Child Healthcare Services variables were positively correlated with the live birth rate (p-value < 0.05). With a Silhouette Index of 0.555, three clusters were optimal. Fuzzy C-Means clustering of the services provided assigned five, eight, and ten districts/cities to the first, second, and third clusters, respectively.
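Because the variables were not normally distributed, the abstract relies on the rank-based Spearman test. A minimal sketch of the Spearman coefficient follows: it is the Pearson correlation of the ranks, with tied values sharing their average rank. The function names and sample numbers are hypothetical, and the sketch returns only the coefficient, not the p-value reported in the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values, for illustration only (not data from the study)
service_coverage = [10, 20, 30, 40, 50]
live_births = [1, 4, 9, 16, 25]
print(spearman_rho(service_coverage, live_births))
```

Any strictly increasing relationship gives a coefficient of 1 (up to floating-point rounding), regardless of whether it is linear, which is what makes the measure suitable for non-normal data.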
A Statistical Clustering Approach: Mapping Population Indicators Through Probabilistic Analysis in Aceh Province, Indonesia
Sasmita, Novi Reandy; Khairul, Moh; Sofyan, Hizir; Kruba, Rumaisa; Mardalena, Selvi; Dahlawy, Arriz; Apriliansyah, Feby; Muliadi, Muliadi; Saputra, Dimas Chaerul Ekty; Noviandy, Teuku Rizky; Watsiq Maula, Ahmad
Infolitika Journal of Data Science Vol. 1 No. 2 (2023): December 2023
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v1i2.130

Abstract

Clustering, a statistical analysis technique, can be used to understand population patterns and as a basis for more targeted policy-making. In this ecological study, we explored population dynamics across 23 districts/cities in Aceh Province. The study used the Aceh Population Development Profile Year 2022 data, with total population, in-migrants, out-migrants, fertility, and maternal mortality as variables. Descriptive statistics were used to ascertain the data distribution, followed by the Shapiro-Wilk test to evaluate normality, which is crucial for selecting appropriate statistical methods. The Spearman test was used to determine correlations between total population and the other variables as indicators. The Probabilistic Fuzzy C-Means (PFCM) method was used for clustering. To optimize clustering, the silhouette coefficient was calculated using the Euclidean distance together with the elbow method, and the results were analyzed using R version 4.3.2. The study's design and methods aim to provide a nuanced understanding of demographic patterns for targeted policy-making and regional development in Aceh, Indonesia. The normality test showed that only fertility was normally distributed (p-value = 0.45); the other variables were not. The Spearman test showed that only in-migrants (p-value = 1.78 × 10⁻⁶) and out-migrants (p-value = 2.30 × 10⁻⁶) were correlated with the Aceh Province population. Using the population variable and the two variables associated with it, the optimal number of clusters was four, with clusters 1, 2, 3, and 4 consisting of three, nine, four, and seven districts/cities, respectively.
Unraveling Geospatial Determinants: Robust Geographically Weighted Regression Analysis of Maternal Mortality in Indonesia
Rahayu, Latifah; Ulfa, Elvitra Mutia; Sasmita, Novi Reandy; Sofyan, Hizir; Kruba, Rumaisa; Mardalena, Selvi; Saputra, Arif
Infolitika Journal of Data Science Vol. 1 No. 2 (2023): December 2023
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v1i2.133

Abstract

Maternal Mortality Rate (MMR) in Indonesia has experienced a concerning annual increase, reaching 4,627 deaths in 2020 compared to 4,221 in 2019. This upward trajectory underscores the urgency of investigating the factors contributing to MMR. Recognizing the spatial heterogeneity and outliers in the data, our study employs the Robust Geographically Weighted Regression (RGWR) method with the Least Absolute Deviation approach. Using secondary data from the 2020 Indonesian Health Profile publication, the research seeks to establish province-specific models for MMR in 2020 and to identify the key influencing factors in each region. Standard regression analyses fall short in addressing the complexities present in the data, making the RGWR approach crucial for understanding the nuanced relationships. The chosen RGWR model uses the Least Absolute Deviation method and a fixed exponential kernel weighting function, which keeps the bandwidth value the same across all locations. Among the model variations evaluated, the fixed exponential kernel weighting function performed best, with the smallest Akaike Information Criterion (AIC) value of 23.990 and the highest coefficient of determination (R²) of 93.66%. The research yields 24 distinct models, each tailored to the characteristics of a province in Indonesia. This location-specific approach is vital for developing effective interventions and policies to address the persistently high MMR. By providing insights into the interplay of factors influencing maternal mortality in different regions, the study contributes to the groundwork for targeted public health initiatives across Indonesia.
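The fixed exponential kernel mentioned in this abstract can be illustrated with a short sketch. It assumes one common form of the exponential kernel, w = exp(-d / h), where d is the distance to the regression location and h is a single bandwidth shared by all locations; the abstract does not state the exact formula used, and the function name, distances, and bandwidth below are hypothetical.

```python
import math


def exponential_kernel_weights(distances, bandwidth):
    """Weights w = exp(-d / h) for one regression location.

    Assumed form of the exponential kernel; "fixed" means the same
    bandwidth h is reused for every regression location.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return [math.exp(-d / bandwidth) for d in distances]


# Made-up distances (km) from one province to its neighbours, illustration only
distances_km = [0.0, 100.0, 200.0, 400.0]
weights = exponential_kernel_weights(distances_km, bandwidth=200.0)
print([round(w, 3) for w in weights])
```

Observations at the regression location itself receive weight 1, and the weight decays smoothly with distance, so nearby provinces influence each local model more than distant ones.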
Decision Tree versus k-NN: A Performance Comparison for Air Quality Classification in Indonesia
Sasmita, Novi Reandy; Ramadeska, Siti; Kesuma, Zurnila Marli; Noviandy, Teuku Rizky; Maulana, Aga; Khairul, Mhd; Suhendra, Rivansyah
Infolitika Journal of Data Science Vol. 2 No. 1 (2024): May 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i1.179

Abstract

Air quality can affect human health, the environment, and the sustainability of ecosystems, so efforts are needed to monitor and control it. The Plume Air Quality Index (PAQI) is one of the indices used to measure and determine the level of air quality. Determining air quality levels accurately requires proper classification. Some previous studies have conducted classification analysis using the decision tree and k-Nearest Neighbor (k-NN) methods, but evaluated them using accuracy only. Therefore, this study uses both methods and evaluates the air quality level classification not only with accuracy but also with precision, recall, and F1-score. The study used secondary data on pollutant concentrations and PAQI categories based on particulate matter (PM2.5 and PM10), nitrogen dioxide (NO₂), and ozone (O₃), obtained from Plume Labs for 33 provincial capitals in Indonesia from July 1 to December 31, 2022. Comparing the two methods shows that the decision tree achieves higher performance values than k-NN. The decision tree's accuracy, precision, recall, and F1-score are 90.67%, 90.61%, 90.67%, and 90.63%, respectively. It can therefore be concluded that the decision tree performs better than k-NN in classifying PAQI categories across all evaluation metrics.
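The four evaluation metrics named in this abstract can be sketched for a multi-class problem as follows. The abstract does not say how per-class scores were averaged; the sketch assumes support-weighted averaging, under which recall always equals accuracy, a pattern that matches the reported 90.67% for both. The function name and category labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, F1) with support-weighted averaging."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if p_l + r_l else 0.0
        weight = actual / n  # class share among the true labels
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    return accuracy, precision, recall, f1


# Made-up category labels, for illustration only (not data from the study)
y_true = ["good", "good", "good", "moderate", "moderate", "poor"]
y_pred = ["good", "good", "moderate", "moderate", "moderate", "poor"]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Reporting precision and F1 alongside accuracy matters when categories are imbalanced, because a classifier can score high accuracy while performing poorly on rare categories.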
Forecasting Upwelling Phenomena in Lake Laut Tawar: A Semi-Supervised Learning Approach
Ulhaq, Muhammad Zia; Farid, Muhammad; Aziza, Zahra Ifma; Nuzullah, Teuku Muhammad Faiz; Syakir, Fakhrus; Sasmita, Novi Reandy
Infolitika Journal of Data Science Vol. 2 No. 2 (2024): November 2024
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v2i2.211

Abstract

Climate change is causing the upwelling phenomenon to occur frequently in lakes and reservoirs. As a result, thousands of fish die, causing floating net cage fish farmers to suffer losses. Existing studies use temperature sensors to determine whether a body of water is currently experiencing upwelling. This study instead applies clustering to historical climate data from 2017-2023 using a semi-supervised learning approach that produces two labels: "potential for upwelling" and "no potential for upwelling." The data are divided into two clusters using K-Means Clustering, and a Support Vector Machine (SVM) is then used to classify them. The proposed algorithm achieves accuracy, precision, recall, and F1-score values of 0.99, 0.995, 0.970, and 0.985, respectively, indicating excellent performance in identifying upwelling potential. With this method, information about upwelling potential can be obtained more quickly and accurately, allowing fish farmers to take appropriate preventive measures. The study also shows that the combination of K-Means Clustering and SVM can be used effectively to analyze historical climate data and generate useful predictions.
Optimizing Energy Consumption Prediction Across the IMT-GT Region Through PCA-Based Modeling
Farid, Muhammad; Nuzullah, Teuku Muhammad Faiz; Aklya, Zatul; Nazila, Syifa; Ulhaq, Muhammad Zia; Apriliansyah, Feby; Sasmita, Novi Reandy
Infolitika Journal of Data Science Vol. 3 No. 1 (2025): May 2025
Publisher : Heca Sentra Analitika

DOI: 10.60084/ijds.v3i1.286

Abstract

This study aims to improve the accuracy of energy consumption prediction in the Indonesia-Malaysia-Thailand Growth Triangle (IMT-GT) region by addressing multicollinearity among independent variables: energy production (Mtoe), lignite coal production (million tons), crude oil production (million tons), refined oil production (million tons), natural gas production (billion cubic meters), and electricity production (terawatt-hours). By integrating Principal Component Analysis (PCA) with Random Forest (RF), the six correlated variables were reduced to two uncorrelated principal components (PC1 and PC2), which explain 80.77% of the data variance. The PCA-RF hybrid model outperformed the standalone RF model, raising the coefficient of determination (R²) from 0.976 to 0.993. It also reduced the error metrics, with the mean absolute error (MAE) decreasing from 5.811 to 4.169 and the root mean square error (RMSE) dropping from 9.278 to 4.786. These results demonstrate PCA's effectiveness in isolating dominant drivers such as energy and lignite coal production while improving model stability. The framework provides policymakers with a reliable tool to forecast energy demand and align economic growth with sustainability in fossil fuel-dependent economies.