This study proposes VERITAS (Vision-based Excitation and Robust Intelligence for Transformer-Assisted Deepfake Detection), a web-based deepfake detection system that integrates Vision-Based Excitation technology with Transformer-based intelligence. The system automatically detects manipulated images and videos using the Vision Transformer (ViT) architecture, and uses Grad-CAM to make its detection results interpretable. A series of tests measured the system's performance across scenarios and its reliability on different types of input. In load testing, the system stayed responsive with up to 30 simultaneous users (average response time of 130 ms) and produced no errors. With 40 or more users, performance dropped drastically and the error rate became very high, reflecting limitations in handling server load. In real-world testing, the system detected deepfakes with an accuracy of 73.61%, with results varying according to the quality of the tested images. Unit and functional testing with coverage analysis yielded a test pass rate of 85%: all major functions ran correctly, although error handling in some code sections still needs to be fixed. Overall, VERITAS demonstrates strong potential for web-based deepfake detection, with high reliability under low load and adequate performance in functional testing, but further optimization is needed to handle higher user loads.
Copyright © 2026