Satriawan Desmana
Cyber Security Engineering, Politeknik Negeri Cilacap, Indonesia


Implementing Proxmox VE-Based High Availability Clustering with Ceph Replication and Performance Testing for Resilient IT Infrastructure in High-Risk Disaster Areas
Muhammad Abdul Muin; Rahmawan Bagus Trianto; Muhammad Nur Faiz; Ratih Hafsarah Maharrani; Satriawan Desmana
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 3 (2026): JUTIF Volume 7, Number 3, June 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.3.5591

Abstract

IT infrastructure in disaster-prone areas, particularly along Java's southern coast within the Sunda Arc subduction zone, is highly vulnerable to seismic events and tsunamis, which cause critical system downtime, disrupt emergency coordination, and exacerbate disaster impacts. This study develops and validates an open-source High Availability (HA) solution based on Proxmox Virtual Environment (PVE) that maintains service continuity with a Recovery Time Objective (RTO) under 2 minutes and a near-zero Recovery Point Objective (RPO). The methodology comprises four stages: (1) needs analysis, covering infrastructure requirements and a disaster risk assessment for the Cilacap region; (2) architecture design, specifying a three-node PVE cluster with Ceph distributed storage (replication factor 3) and a Corosync quorum mechanism; (3) system implementation, including network bonding, VLAN segmentation, and a dedicated 1 Gbps Ceph replication network; and (4) performance testing through fault injection scenarios (power-off simulation, network partition, storage failure) that measure inter-node latency, disk I/O performance, and failover recovery metrics. Results show 99.92% availability over 72 hours of monitoring, a Mean Time Between Failures (MTBF) of 24.1 hours, and a Mean Time To Recovery (MTTR) of 70 seconds, with total downtime of 3.53 minutes across three failover simulations. Inter-node latency remained below 1 ms (averages of 0.372 to 0.593 ms), and disk I/O latency stayed below 0.5 ms during failover events. This research contributes to computer science and disaster informatics by providing a validated, replicable open-source blueprint for resilient IT infrastructure in Indonesia's disaster-prone regions, and it outlines practical pathways for integration with national emergency systems, including BNPB coordination networks and BMKG early warning infrastructure.
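
The reliability figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the abstract does not state which formulas the authors used, so the function names and the definitions (availability as uptime over the monitoring window, MTTR as total downtime divided by the number of failover events, quorum as a strict majority of nodes) are assumptions made for this example.

```python
# Illustrative cross-check of the figures quoted in the abstract.
# The formulas are common textbook definitions, assumed here; the
# abstract does not say how the authors derived their metrics.

MONITORING_HOURS = 72        # monitoring window reported in the abstract
TOTAL_DOWNTIME_MIN = 3.53    # total downtime reported in the abstract
FAILOVER_EVENTS = 3          # number of failover simulations
CLUSTER_NODES = 3            # three-node PVE cluster


def availability_pct(window_hours: float, downtime_min: float) -> float:
    """Percentage of the monitoring window during which service was up."""
    window_min = window_hours * 60
    return 100 * (window_min - downtime_min) / window_min


def mean_time_to_recovery_s(downtime_min: float, events: int) -> float:
    """Average downtime per failure event, in seconds."""
    return downtime_min * 60 / events


def majority_quorum(nodes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return nodes // 2 + 1


if __name__ == "__main__":
    print(f"availability: {availability_pct(MONITORING_HOURS, TOTAL_DOWNTIME_MIN):.2f}%")
    print(f"MTTR: {mean_time_to_recovery_s(TOTAL_DOWNTIME_MIN, FAILOVER_EVENTS):.1f} s")
    print(f"quorum votes needed: {majority_quorum(CLUSTER_NODES)} of {CLUSTER_NODES}")
```

Under these assumed definitions, 3.53 minutes of downtime in a 72-hour window rounds to the reported 99.92% availability, and the per-event recovery time comes to roughly 70.6 seconds, in line with the reported MTTR of 70 seconds. With three nodes, a majority quorum needs two votes, so the cluster can lose one node and keep running.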