Accelerated adversarial attack generation and enhanced decision insight

Date

2023-12-07

Publisher

Information Technology Research Unit, Faculty of Information Technology, University of Moratuwa.

Abstract

Adversarial attack research is a rapidly growing field that studies how intentionally crafted inputs can fool machine learning models. Such attacks have severe implications for the security of machine learning systems, as they can allow attackers to bypass security measures and cause a system to malfunction. Finding solutions for these attacks involves creating specific attack scenarios with a particular dataset and training a model on that dataset. Adversarial attacks on a trained model can significantly reduce its accuracy by manipulating the decision boundary, causing instances that were initially classified correctly to be misclassified, and thus producing a notable decline in the model's ability to classify instances accurately after an attack. This process helps us develop strategies to defend against such attacks. However, a significant challenge arises because generating these attack scenarios for a specific dataset is time-consuming. Moreover, the disparity between the model's prediction outcomes before and after an attack tends to lack clear interpretability. In both limitations, the common limiting factor is time: the longer it takes to devise a solution, the more opportunity an attacker has to cause harm in real-world situations. In this paper, we propose two approaches to address these gaps: minimizing the time required for attack generation using data augmentation, and generating more interpretable descriptions of how an attack affects the model's decision-making process. We show that these descriptions yield insight into an attack's effect on the model's decisions by identifying the features most critical to the model's predictions before and after the attack. Our work can potentially improve the security of machine learning systems by making it more difficult for attackers to generate effective attacks.
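To illustrate the second idea, the sketch below (an illustration, not code from the paper) uses a one-step FGSM attack as a stand-in for attack generation and gradient saliency as a stand-in for the interpretable description, comparing the features that most influence a toy PyTorch model's prediction before and after the attack. The model, epsilon value, and saliency method are assumptions; the paper's actual augmentation pipeline and description method may differ.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy binary classifier over 10 tabular features, standing in for a trained model.
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    loss_fn = nn.CrossEntropyLoss()
    x = torch.randn(1, 10)   # a single clean input
    y = torch.tensor([1])    # its true label

    def fgsm(model, x, y, eps=0.1):
        # One-step FGSM: move the input along the sign of the loss gradient.
        x_adv = x.clone().requires_grad_(True)
        loss_fn(model(x_adv), y).backward()
        return (x_adv + eps * x_adv.grad.sign()).detach()

    def saliency(model, x, cls):
        # Gradient-magnitude importance of each input feature for class `cls`.
        x = x.clone().requires_grad_(True)
        model(x)[0, cls].backward()
        return x.grad.abs().squeeze()

    x_adv = fgsm(model, x, y)
    before = saliency(model, x, y.item())
    after = saliency(model, x_adv, y.item())

    # Features whose importance ranking shifts most are those the attack manipulated.
    print("top features before attack:", before.topk(3).indices.tolist())
    print("top features after attack: ", after.topk(3).indices.tolist())

Comparing the two importance rankings gives a concrete, per-feature description of how the attack moved the model's attention, which is the kind of before/after insight the abstract describes.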

Keywords

Adversarial machine learning, Adversarial attack, Explainable AI
