Abstract

Klitih crime in the Yogyakarta region has raised serious concern among the government and the public because it threatens public safety and comfort.
To address this problem, this study proposes a security system that combines night-vision cameras with machine learning to detect klitih crime effectively, particularly at night. The dataset consists of 1,006 images recorded from klitih incidents. In preprocessing, every image was resized to 640x640 pixels. Data augmentation was then applied to increase the variety of the data and the robustness of the model: 90° rotation, cropping with zoom varying from 0% to 20%, and added noise of up to 5%. The results show that YOLOv6 performed best at detecting the weapon label, with an accuracy of 0.9 and an F1-score of 0.91. For the physical-crime label, YOLOv6 also performed best, with an accuracy of 0.63 and an F1-score of 0.73. Faster R-CNN and SSD also produced good results, but YOLOv6 remained the strongest model for klitih detection in terms of accuracy and the other evaluation metrics. In the future, klitih detection technology could contribute to a safer and more comfortable environment for the whole community.
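
The preprocessing and augmentation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the grayscale list-of-rows image format, nearest-neighbour resizing, the 50% chance of rotating, and the reading of "noise of up to 5%" as "at most 5% of pixels set to black or white" are all assumptions made for illustration.

```python
import random

TARGET_SIZE = 640  # side length stated in the abstract


def resize_nearest(img, size=TARGET_SIZE):
    """Resize a 2-D grayscale image (list of rows) to size x size, nearest neighbour."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def random_zoom_crop(img, rng, max_zoom=0.20):
    """Crop away 0% to max_zoom of a square image, then resize back to its side length."""
    h, w = len(img), len(img[0])
    zoom = rng.uniform(0.0, max_zoom)
    ch, cw = max(1, round(h * (1 - zoom))), max(1, round(w * (1 - zoom)))
    top, left = rng.randint(0, h - ch), rng.randint(0, w - cw)
    crop = [row[left:left + cw] for row in img[top:top + ch]]
    return resize_nearest(crop, size=h)


def add_noise(img, rng, max_fraction=0.05):
    """Assumed noise model: set at most max_fraction of pixels to 0 or 255."""
    h, w = len(img), len(img[0])
    out = [list(row) for row in img]
    n = int(rng.uniform(0.0, max_fraction) * h * w)
    for idx in rng.sample(range(h * w), n):
        out[idx // w][idx % w] = rng.choice((0, 255))
    return out


def augment(img, seed=0):
    """Resize to 640x640, then apply the three augmentations in sequence."""
    rng = random.Random(seed)
    out = resize_nearest(img)
    if rng.random() < 0.5:  # hypothetical choice: rotate about half of the samples
        out = rotate90(out)
    out = random_zoom_crop(out, rng)
    return add_noise(out, rng)
```

Calling `augment(img, seed=k)` with different seeds yields different variants of the same source image, each 640x640. For object detection the bounding-box labels would have to be transformed together with the pixels, which this sketch leaves out.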