Articles

Found 2 Documents

A dilution-based defense method against poisoning attacks on deep learning systems
Park, Hweerang; Cho, Youngho
International Journal of Electrical and Computer Engineering (IJECE) Vol 14, No 1: February 2024
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v14i1.pp645-652

Abstract

A poisoning attack on deep learning (DL) injects maliciously manipulated samples into a training dataset so that a model trained on the poisoned data misclassifies inputs, significantly degrading its performance and reliability. The traditional defense approach against poisoning attacks tries to detect poisoned samples in the training dataset and remove them. However, because new, sophisticated attacks that evade existing detection methods continue to emerge, detection alone cannot effectively counter poisoning attacks. For this reason, in this paper we propose a novel dilution-based defense method that mitigates the effect of poisoned data by adding clean data to the training dataset. According to our experiments, the dilution-based defense can significantly decrease the success rate of poisoning attacks and improve classification accuracy by effectively reducing the contamination ratio of the manipulated data. In particular, our proposed method outperformed an existing defense method (CutMix data augmentation) by up to 20.9 percentage points in classification accuracy.
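As a rough illustration of the dilution idea described in this abstract (a sketch of the arithmetic, not the authors' implementation), the snippet below computes the contamination ratio of a poisoned training set and estimates how many additional clean samples would be needed to push that ratio below a target value. The helper names and the example numbers are hypothetical.

```python
# Sketch of the dilution idea: adding clean samples to a poisoned training
# set lowers the contamination ratio, i.e. the fraction of poisoned samples
# the model sees during training. Hypothetical helpers, not the paper's code.

def contamination_ratio(n_poisoned: int, n_clean: int) -> float:
    """Fraction of the training set that is poisoned."""
    return n_poisoned / (n_poisoned + n_clean)

def clean_samples_needed(n_poisoned: int, n_clean: int, target_ratio: float) -> int:
    """Extra clean samples required to push the contamination ratio
    down to target_ratio (solves n_p / (n_p + n_c + x) <= target_ratio)."""
    required_total = n_poisoned / target_ratio
    extra = required_total - (n_poisoned + n_clean)
    return max(0, -(-int(extra * 1000) // 1000) if extra == int(extra) else int(extra) + 1)

if __name__ == "__main__":
    # Example: 500 poisoned samples hidden among 10,000 clean ones (~4.8%).
    print(round(contamination_ratio(500, 10_000), 3))   # 0.048
    # Diluting to a 1% contamination ratio needs ~39,500 more clean samples.
    print(clean_samples_needed(500, 10_000, 0.01))       # 39500
```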
Reliable backdoor attack detection for various size of backdoor triggers
Rah, Yeongrok; Cho, Youngho
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 1: February 2025
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i1.pp650-657

Abstract

Backdoor attack techniques have evolved to compromise the integrity of deep learning (DL) models. To defend against backdoor attacks, neural cleanse (NC) has been proposed as a promising backdoor attack detection method. NC detects the existence of a backdoor trigger by inserting perturbation into a benign image and capturing the abnormality of the inserted perturbation. However, NC has a significant limitation: it fails to detect a backdoor trigger when the trigger size exceeds a certain threshold, as measured by the anomaly index (AI). To overcome this limitation, in this paper we propose a reliable backdoor attack detection method that detects backdoor attacks regardless of the trigger size. Specifically, our method inserts perturbation into backdoor images to induce them to be classified into different labels and measures the abnormality of that perturbation. The underlying assumption is that the perturbation required to reclassify a backdoor image to its ground-truth label is abnormally small compared with the perturbation required for the other labels. By implementing and conducting comparative experiments, we confirmed that this idea is valid and that our proposed method outperforms an existing backdoor detection method (NC) by 30 percentage points on average in backdoor detection accuracy (BDA).
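A minimal sketch of the detection intuition described in this abstract (assumed from the abstract, not the paper's code): for each candidate label, measure the size of the minimal perturbation that reclassifies the suspect image to that label, then flag a label whose perturbation is an outlier using a MAD-based anomaly index in the style of neural cleanse. The function name and the example perturbation norms are hypothetical.

```python
# Sketch: a backdoored image needs an abnormally small perturbation to return
# to its ground-truth label compared with the perturbation needed to reach the
# other labels; a MAD-based anomaly index flags that outlier label.
import numpy as np

def anomaly_indices(perturbation_norms: np.ndarray) -> np.ndarray:
    """Anomaly index per label from the L1 norms of the minimal
    perturbations found for each target label."""
    median = np.median(perturbation_norms)
    mad = np.median(np.abs(perturbation_norms - median))
    # 1.4826 scales MAD to a standard-deviation estimate under normality.
    return np.abs(perturbation_norms - median) / (1.4826 * mad)

if __name__ == "__main__":
    # Hypothetical per-label perturbation norms for a 10-class model;
    # label 3 (the ground-truth label of a backdoored image) needs far less.
    norms = np.array([42.0, 39.5, 44.1, 3.2, 41.7, 40.3, 43.8, 38.9, 45.0, 40.8])
    scores = anomaly_indices(norms)
    print(scores.round(2))
    print("suspected backdoor label:", scores.argmax(), "flagged:", scores.max() > 2.0)
```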