Contact Name
Teguh Wahyono
Contact Email
indexsasi@apji.org
Phone
+6282226535471
Journal Mail Official
indexsasi@apji.org
Editorial Address
Jl. Radin Inten II no.53 A. RT 7/RW 14, Duren Sawit, Kec. Duren Sawit, Kota Jakarta Timur, DKI Jakarta, 13440
Location
Unknown,
Unknown
INDONESIA
Big Data Analytics and Data Science
ISSN : -     EISSN : 31239986     DOI : 10.66472
Core Subject :
Aims: This journal aims to publish cutting-edge research in big data analytics and data science, emphasizing data-driven methods and intelligent analytics for decision support and innovation.
Scope: Big data architectures and platforms; Data mining and predictive analytics; Machine learning for data analytics; Data visualization and visual analytics; Structured, semi-structured, and unstructured data processing; Business intelligence and data-driven decision-making; Ethical, privacy, and governance aspects of data science.
Arjuna Subject : -
Articles 10 Documents
A Framework for Scalable Big Data Analytics and Workflow Orchestration in Heterogeneous Cloud Native Software Platforms for Smart Cities
Amelia Contesa; Pratiwi Rachmadi; Aziz Azindani
Big Data Analytics and Data Science Vol. 1 No. 1 (2026): March: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i1.18

Abstract

Smart cities are increasingly leveraging advanced technologies such as the Internet of Things (IoT), Artificial Intelligence (AI), and Big Data Analytics to optimize urban management and improve the quality of life for citizens. However, managing vast and diverse datasets from numerous sources in real-time presents several challenges. This research proposes a modular framework that integrates distributed data processing engines with container-based workflow orchestration to address scalability, latency, adaptability, and fault tolerance in smart city data analytics. The framework utilizes cloud native technologies, including Apache Spark and Kubernetes, to efficiently manage resources and ensure high availability. The experimental setup tested the framework’s ability to handle dynamic data loads, demonstrating scalability through real-time resource allocation and low-latency processing. The adaptability of the framework was evident in its seamless integration with various data sources, such as environmental sensors and traffic management systems, which require different processing methods. Additionally, the framework’s modularity provided fault tolerance, enabling continued operation even if individual components failed, a crucial feature for mission-critical applications in smart cities. Compared to traditional monolithic systems, the proposed framework outperformed in flexibility, scalability, and performance, offering significant improvements in handling real-time data streams. Despite these advantages, challenges remain, particularly in integrating heterogeneous data formats and optimizing real-time processing for high-priority applications. The research highlights the importance of scalable data analytics and efficient workflow orchestration for the future of smart city platforms, offering a foundation for the development of more resilient, adaptable, and efficient cloud native infrastructures.
Evaluating Explainable Artificial Intelligence Methods for Interpretable Machine Learning Models in Large Scale Enterprise Data Analytics Systems
Indra Ava Dianta; Greget Widhiati; Andreas Tigor Oktaga
Big Data Analytics and Data Science Vol. 1 No. 1 (2026): March: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i1.19

Abstract

Explainable Artificial Intelligence (XAI) has become a critical area of research within artificial intelligence, focusing on improving the transparency and interpretability of machine learning (ML) models, often referred to as "black-box" models. The need for XAI techniques arises from the inherent complexity of ML models, which can make their decision-making processes difficult for users to understand. This study investigates various XAI techniques, including LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), to assess their impact on model interpretability without significantly compromising predictive performance. A comparative experimental design was used, applying these XAI methods to different ML models, including deep neural networks and ensemble methods, within large-scale enterprise data analytics systems. The results indicate that XAI methods significantly enhance model transparency and decision traceability, allowing users to understand the influence of individual features on predictions. While a slight reduction in predictive accuracy was observed, especially with simpler models, the trade-off between interpretability and performance was deemed acceptable, particularly in fields requiring transparency, such as healthcare, finance, and autonomous systems. The use of XAI in enterprise data systems has practical implications for fostering trust and enabling informed decision-making among stakeholders. Furthermore, the study discusses the challenges and limitations of applying XAI techniques, such as complexity, scalability, and model-specific limitations. Future research is suggested to focus on developing more scalable and efficient XAI methods, enhancing their applicability across various model types, and addressing the challenges of real-time applications. This will be crucial in ensuring the widespread adoption of XAI in critical domains, promoting the ethical use of AI while maintaining predictive accuracy.
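The attribution idea behind SHAP can be illustrated with a brute-force Shapley computation. The sketch below is not the study's implementation and does not use the SHAP library (which relies on faster approximations); `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical values chosen only to show how a prediction is split across features.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering in which features are switched from the
    baseline value to the actual value. Only feasible for a few features."""
    n = len(x)
    totals = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous_output = predict(current)
        for i in order:
            current[i] = x[i]  # reveal feature i
            new_output = predict(current)
            totals[i] += new_output - previous_output
            previous_output = new_output
    n_orders = factorial(n)
    return [total / n_orders for total in totals]


def toy_model(features):
    # Hypothetical scoring function with one interaction term (f0 * f2).
    return 2.0 * features[0] + 3.0 * features[1] + features[0] * features[2]


x = [1.0, 1.0, 2.0]          # instance to explain (assumed values)
baseline = [0.0, 0.0, 0.0]   # reference input (assumed)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the additivity property that makes per-feature explanations traceable.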
Designing Robust Data Quality Governance Strategies for Distributed Software Systems : Integrating Real Time Monitoring and Automated Anomaly Detection
Imam Rangga Bakti; Yola Permata Bunda; Mohammad Muhsin
Big Data Analytics and Data Science Vol. 1 No. 1 (2026): March: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i1.21

Abstract

Distributed software systems face significant challenges related to data quality due to their complex, decentralized architecture. These systems often involve multiple nodes responsible for processing and storing data, making it difficult to maintain consistency and ensure accurate data across the entire network. In particular, issues like data inconsistency, latency, and data fragmentation are prevalent in distributed environments. To address these challenges, this study proposes an integrated data quality governance strategy that combines real time monitoring and automated anomaly detection using machine learning models. The proposed strategy aims to improve data consistency, enhance anomaly detection capabilities, and reduce the need for manual intervention, ultimately improving overall data governance in distributed systems. Real time monitoring ensures immediate identification of data issues as they occur, while machine learning models, such as autoencoders and Isolation Forests, automate the detection of anomalies based on high reconstruction errors and data isolation techniques. The study evaluates the proposed strategy through real-world distributed system scenarios, comparing its effectiveness to traditional approaches like periodic audits and manual validation. Results demonstrate that the integrated approach leads to faster anomaly detection, reduced data inconsistencies, and improved overall system performance. The use of advanced machine learning techniques and real time analytics significantly enhances the system's ability to maintain high data quality standards across multiple distributed nodes. This strategy has wide-ranging implications for industries that rely on distributed systems, such as finance, healthcare, and IoT, where data integrity is essential for operational success. 
Future research could focus on integrating more advanced machine learning techniques and on optimizing the real-time monitoring framework to handle larger and more complex systems.
Optimizing End to end Machine Learning Pipelines Using Hybrid Edge Cloud Architectures for Real Time Decision making Applications
Asro Asro; Solihin Solihin; Irlon Irlon
Big Data Analytics and Data Science Vol. 1 No. 1 (2026): March: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i1.22

Abstract

Real-time decision-making applications, such as those used in autonomous vehicles, smart cities, and industrial IoT, require fast, scalable, and accurate analytics to ensure timely responses and optimized operations. Traditional cloud-based systems struggle to meet these requirements because of high latency, limited scalability, and bottlenecks in data processing. This study explores the use of a hybrid edge-cloud architecture to optimize end-to-end machine learning (ML) pipelines for real-time applications. The proposed system offloads time-sensitive tasks to edge devices, while computationally intensive processes are handled by the cloud, ensuring efficient use of resources and reduced latency. Experimental results demonstrate that the hybrid model reduces inference latency by up to 70% compared to cloud-only systems, while maintaining model accuracy and increasing throughput. The hybrid architecture also scales well: it can handle large-scale data streams and adapt to varying workloads. The findings show that hybrid edge-cloud architectures are well suited to applications where fast decision-making is critical, such as autonomous systems and real-time analytics in smart cities. However, challenges remain in managing resources across edge and cloud systems, particularly in balancing computational loads and ensuring system reliability. Future research should focus on optimizing task partitioning, integrating advanced edge AI models, and exploring the use of 5G networks to enhance performance further. Overall, the study demonstrates the potential of hybrid edge-cloud systems in overcoming the limitations of traditional cloud-based ML pipelines and provides insights into the future of real-time data processing.
Adaptive Cyber Secure Software Engineering Practices for Big Data Platforms With Dynamic Access Control and Differential Privacy Mechanisms
Ahmad Budi Trisnawan; Priyo Wibowo
Big Data Analytics and Data Science Vol. 1 No. 1 (2026): March: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i1.24

Abstract

Big data platforms face significant challenges related to cybersecurity and privacy due to the vast volume, variety, and velocity of data they manage. Traditional static security measures often fail to address the dynamic and complex nature of big data environments. This research proposes an adaptive cybersecurity framework that integrates dynamic access control and differential privacy mechanisms to enhance both the security and privacy of big data platforms. The dynamic access control mechanism continuously adjusts access permissions in real-time based on changing risk and trust levels, ensuring that sensitive data remains secure even as user roles and data flows evolve. The differential privacy mechanism adds noise to data, preserving individual privacy while allowing for meaningful data analysis. Through simulations and case studies, the framework was evaluated in various real-world environments, including healthcare, IoT, and finance, where it demonstrated scalability, efficiency, and robust security performance. The results showed that the proposed framework significantly reduced unauthorized access attempts and maintained data privacy, while still enabling effective data analysis. Although there were some challenges regarding performance overhead, particularly in resource-constrained environments, the framework remained effective in large-scale systems. The findings highlight the importance of adaptive security practices in big data environments and suggest that future research should focus on refining dynamic security mechanisms and applying differential privacy in diverse real-world scenarios. These advancements are essential for ensuring that big data platforms can handle evolving cyber threats without compromising data utility or privacy.
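The noise-addition step described in the abstract can be sketched with the Laplace mechanism, a standard way to achieve differential privacy for numeric queries. The abstract does not say which mechanism the authors used, so this is an illustration under that assumption; the `ages` records, the query, and `epsilon=0.5` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the result by at most 1), so the noise scale is 1 / epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)                     # seeded only for repeatability
ages = [34, 51, 29, 62, 47, 38, 55, 41]    # hypothetical patient ages
noisy = private_count(ages, lambda age: age >= 45, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is the utility trade-off the abstract refers to.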
Development of an Early Warning System for Predicting Drug Shortages Using a Random Forest Algorithm on Hospital Pharmacy Logistics Data
Ahmad Asyhadi Asyhadi; Widyadhana Candraningtias
Big Data Analytics and Data Science Vol. 1 No. 2 (2026): June: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i2.390

Abstract

Stock shortages are a significant issue in inventory management that can disrupt operations and pose a risk of loss. Therefore, an early warning system capable of detecting potential risks at an earlier stage is necessary. This study aims to develop a machine learning-based prediction model to detect risk conditions using the Random Forest algorithm and compare it with several other classification models. To address the issue of data imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the training data. Additionally, feature engineering was performed by creating a usage ratio variable as an indicator of the relationship between inventory and usage. The dataset consists of 882 data points with an imbalanced class distribution, where the risk class is more dominant than the normal class. Model evaluation was performed using the F2-score metric, which places greater emphasis on recall, given the importance of minimizing false negatives in early warning systems. Furthermore, model performance was also analyzed using ROC curves and Precision-Recall Curves to measure the model’s discriminatory ability more comprehensively. A high AUC value indicates that the model is effective at distinguishing between the normal and risk classes, particularly under imbalanced data conditions. To improve risk detection sensitivity, a threshold tuning approach was employed by adjusting the probability decision threshold based on F2-score optimization. This approach aims to increase the recall value so that all at-risk cases can be detected to the greatest extent possible, albeit with a potential increase in false positives. The research results show that the developed model is capable of achieving very high performance, with an optimal F2-score and no classification errors found in the test data. Feature importance analysis indicates that the stock, usage, and usage ratio variables are dominant factors in determining risk conditions. 
Nevertheless, these very high results need to be analyzed critically due to the interdependence between features and the target label formation process. Overall, this study contributes to the development of a machine learning-based early warning system that focuses not only on accuracy but also on comprehensive risk detection capabilities. The proposed approach can be used as a decision support system for more proactive inventory management.
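The threshold-tuning step can be sketched as a search over candidate probability thresholds that keeps the one with the highest F2-score. This is a minimal illustration, not the authors' code: the labels and predicted probabilities are made-up toy values, with 1 standing for the risk class.

```python
def fbeta_score(y_true, y_pred, beta=2.0):
    """F-beta score; beta=2 weights recall more heavily than precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denominator if denominator else 0.0


def best_threshold(y_true, scores, beta=2.0):
    """Try each distinct predicted probability as the decision threshold
    and return the (threshold, score) pair with the highest F-beta."""
    best_t, best_f = 0.5, -1.0
    for threshold in sorted(set(scores)):
        preds = [1 if s >= threshold else 0 for s in scores]
        f = fbeta_score(y_true, preds, beta)
        if f > best_f:
            best_t, best_f = threshold, f
    return best_t, best_f


# Toy data (assumed): 1 = at-risk item, 0 = normal item.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.45, 0.4, 0.35, 0.3, 0.2, 0.6]

default_preds = [1 if s >= 0.5 else 0 for s in scores]
default_f2 = fbeta_score(y_true, default_preds)
tuned_t, tuned_f2 = best_threshold(y_true, scores)
print(default_f2, tuned_t, tuned_f2)
```

On this toy data the tuned threshold drops below 0.5, catching every at-risk case at the cost of one extra false positive. In practice the threshold would be chosen on a validation split rather than on the test data.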
Artificial Intelligence-Based Early Warning System for Disaster Management: A Literature Review Systematic and Bibliometric Analysis
Ridwan Zulkifli; Zainal Arifin Hasibuan; Irawan Afrianto; Bella Hardiyana; Sri Supatmi
Big Data Analytics and Data Science Vol. 1 No. 2 (2026): June: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i2.392

Abstract

The increasing frequency and intensity of natural disasters globally demand the development of more accurate and responsive Early Warning Systems (EWS). In recent years, Artificial Intelligence (AI) has been increasingly applied in natural disaster mitigation, but the approaches used are still diverse and spread across various domains. This study presents a systematic literature review of the application of AI and deep learning in natural disaster early warning systems. The review was conducted following the PRISMA 2020 guidelines by analyzing literature published during the 2020–2025 period. The selection process resulted in 102 studies meeting the inclusion criteria, with 30 full-text articles being analyzed in depth to map disaster types, AI methods, data sources, and characteristics of early warning systems developed in various regions, including Asia and Africa. The review results show the dominance of deep learning approaches, especially time-series models such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), notably in flood forecasting and land deformation prediction. More advanced architectures, such as the Transformer, are beginning to be adopted to capture long-term temporal patterns, while the combination of convolutional neural networks (CNN) with remote sensing data is widely used for spatial mapping of disaster events. Furthermore, the integration of sensor data and the Internet of Things (IoT) shows potential in supporting more responsive early warning systems. However, most research remains limited to the modeling or simulation stage, with little discussion of the real-time and operational implementation of EWS. This review highlights the gap between AI model development and the implementation of reliable early warning systems and provides a conceptual foundation for the future development of more integrated AI-based disaster mitigation systems.
Classification, Prediction, and Prescription of Digital Government Governance Maturity Levels: Leveraging SPBE Index Data (2019–2024) for Evidence-Based Regional Digital Government Architecture Planning in Indonesia
Andi Agus Salim; Zainal Arifin Hasibuan; Agus Nursikuwagus; Sri Supatmi
Big Data Analytics and Data Science Vol. 1 No. 2 (2026): June: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i2.409

Abstract

Indonesia's transition from the SPBE evaluation framework to the 2025–2029 Pemdi (Digital Government) Index marks a strategic shift toward comprehensive governance maturity. However, regional governments face significant challenges in strategic planning due to the absence of empirical models linking historical SPBE performance to future Pemdi trajectories and a lack of data-driven guidance for prioritizing governance interventions. This research aims to develop an integrated Classification-Prediction-Prescription (CPP) framework to classify, forecast, and prescribe regional digital government governance maturity levels. The proposed methodology employs machine learning algorithms (Random Forest and Gradient Boosting) to conduct multi-class classification (five maturity levels) and regression (continuous score prediction) using longitudinal SPBE data (2019–2024) from 548 Indonesian regional governments. This quantitative approach is complemented by feature importance analysis and scenario-based simulations to generate actionable insights. The models are projected to achieve over 85% classification accuracy and a regression RMSE of under 0.5. The synthesis of main findings reveals that indicators within the policy and architecture planning domains are the strongest predictors driving maturity progression. Furthermore, the study segments regional governments into four distinct trajectory clusters and formulates a tailored prescriptive recommendation matrix across multiple planning horizons. In conclusion, the CPP framework effectively translates national evaluation data into actionable intelligence, empowering regional governments to optimize resource allocation, prioritize high-impact interventions, and systematically align their digital transformation pathways with formal planning documents such as the RPJMD and Regional Action Plans.
Transforming the Global Aquaculture Supply Chain through the Integration of Artificial Intelligence and Big Data for Overcome Asymmetry Information
Hernalom Sitorus; Zaenal Arifin Hasibuan; Bobi Kurniawan; Sri Supatmi
Big Data Analytics and Data Science Vol. 1 No. 2 (2026): June: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i2.443

Abstract

The global aquaculture sector faces structural challenges in the form of information asymmetry that causes a misalignment between production and market demand. The still-dominant production-driven paradigm leads to supply chain inefficiencies, low transparency, and limited traceability. This research aims to develop an information system integration model based on Artificial Intelligence (AI) and Big Data to transform the supply chain into a market-driven one. The research uses the Design Science Research (DSR) method, which includes needs analysis, data integration architecture design, development of Machine Learning and Deep Learning-based predictive models, and evaluation through prototype implementation. Expected outcomes include a data integration architecture, a supply-demand prediction model, and an AI-based traceability framework. This research contributes to improving the efficiency, transparency, and global competitiveness of the aquaculture sector.
Customer Perception Analysis of Chocolate Candy Products Through Packaging Influence Using K-Prototypes Clustering
Dafinah Ramadhani; Aniq Farichatus Zahwa
Big Data Analytics and Data Science Vol. 1 No. 2 (2026): June: Big Data Analytics and Data Science
Publisher : Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

DOI: 10.66472/bdas.v1i2.504

Abstract

Packaging has a significant impact on how consumers perceive a product and on how that product is identified and judged. In the confectionery sector, customers may interpret the same product differently depending on the content and visual components of its packaging. This study aims to define customer categories based on packaging-related perception criteria and to assess consumer perceptions of chocolate candy products. The study uses K-Prototypes clustering, which works well on mixed data containing both numerical and categorical variables. The factors examined include perception with packaging, perception without packaging, confidence in perception, packaging-reading habits, and perceived packaging influence. Data preprocessing, including data cleaning, feature selection, and normalization, was carried out before clustering. The Elbow method was used to find the ideal number of clusters, and the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index were used to assess the clustering results. The results identify three distinct consumer categories: Uncertain Perception (Readers Without Packaging), Correct Perception (Candy), and Incorrect Perception (Chocolate). The findings show that differences in consumer perception of the product are largely driven by packaging. Customers' categorization and identification of the product were shown to be influenced by differences in packaging perception, confidence level, and packaging-reading behavior. These findings indicate that packaging is an efficient communication tool that shapes customer perception, in addition to serving as a protective component. Based on the characteristics of consumer perception, this study offers useful insights for improving packaging design and creating more successful marketing communication tactics.
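K-Prototypes handles mixed data by combining a squared Euclidean distance on the numerical variables with a weighted mismatch count on the categorical ones. The sketch below shows only that dissimilarity and the cluster-assignment step; the feature values, the two prototypes, and the weight `gamma=0.5` are hypothetical and are not taken from the study.

```python
def kprototypes_distance(num_a, cat_a, num_b, cat_b, gamma):
    """Mixed dissimilarity used by K-Prototypes: squared Euclidean distance
    on numerical features plus gamma times the number of categorical
    features whose values differ."""
    numeric_part = sum((a - b) ** 2 for a, b in zip(num_a, num_b))
    categorical_part = sum(1 for a, b in zip(cat_a, cat_b) if a != b)
    return numeric_part + gamma * categorical_part


def assign_cluster(point_num, point_cat, prototypes, gamma):
    """Return the index of the closest prototype for one respondent."""
    distances = [
        kprototypes_distance(point_num, point_cat, proto_num, proto_cat, gamma)
        for proto_num, proto_cat in prototypes
    ]
    return distances.index(min(distances))


# Hypothetical prototypes: (normalized numeric features, categorical features).
prototypes = [
    ([0.2, 0.1], ["candy", "reads"]),
    ([0.8, 0.9], ["chocolate", "skips"]),
]

# Hypothetical respondent: confidence and influence scores, perceived
# product type, and packaging-reading habit.
respondent_num = [0.7, 0.8]
respondent_cat = ["chocolate", "reads"]

cluster = assign_cluster(respondent_num, respondent_cat, prototypes, gamma=0.5)
print(cluster)
```

The weight `gamma` controls how much a categorical mismatch counts relative to numerical distance, which is why normalizing the numerical variables before clustering matters.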
