Abstract
A poisoning attack that manipulates the training of a model is easy to detect, since the general performance of the model is degraded. Although a backdoor attack misleads decisions only on samples containing a trigger rather than on all samples, the strong association between the trigger and the target class ID exposes the attack. This weak concealment limits the damage that current poisoning attacks can inflict on machine learning models. This study proposes a poisoning attack against deep neural networks that aims not only to reduce the robustness of a model against adversarial samples but also to explicitly increase its concealment, defined as the accuracy of the contaminated model on untainted samples. To improve the efficiency of poisoning-sample generation, we propose training-interval, gradient-truncation, and parallel-processing mechanisms. As a result, a model trained on the poisoning samples generated by our method is easily misled by slight perturbations, and the attack is difficult to detect because the contaminated model performs well on clean samples. Experimental results show that our method significantly increases the attack success rate without a substantial drop in classification accuracy on clean samples. The transferability and instability of our model are confirmed experimentally.
Affiliation: Foshan University