RAILS: A Robust Adversarial Immune-Inspired Learning System

By: Ren Wang, Tianqi Chen, Stephen M. Lindsly, Cooper M. Stansbury, Alnawaz Rehemtulla, Indika Rajapakse, Alfred O. Hero

Format: Article
Published: IEEE 2022-01-01

Description

Adversarial attacks against deep neural networks (DNNs) are continuously evolving, requiring increasingly powerful defense strategies. We develop a novel adversarial defense framework inspired by the adaptive immune system: the Robust Adversarial Immune-inspired Learning System (RAILS). Initializing a population of exemplars that is balanced across classes, RAILS starts from a uniform label distribution that encourages diversity and uses an evolutionary optimization process to adaptively adjust the predictive label distribution in a manner that emulates the way the natural immune system recognizes novel pathogens. RAILS' evolutionary optimization process explicitly captures the tradeoff between robustness (diversity) and accuracy (specificity) of the network, and represents a new immune-inspired perspective on adversarial learning. The benefits of RAILS are empirically demonstrated under eight types of adversarial attacks on a DNN adversarial image classifier for several benchmark datasets, including MNIST, SVHN, CIFAR-10, and CIFAR-100. We find that PGD is the most damaging attack strategy and that for this attack RAILS is significantly more robust than other methods, achieving improvements in adversarial robustness of ≥5.62%, 12.5%, 10.32%, and 8.39% on these respective datasets, without appreciable loss of classification accuracy. Code for the results in this paper is available at https://github.com/wangren09/RAILS.
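
The abstract describes the core loop at a high level: a class-balanced exemplar population, a uniform initial label distribution, and an evolutionary process that sharpens the predictive label distribution for a given input. The sketch below is an illustrative, simplified rendering of that idea, not the authors' implementation (see the GitHub repository above); the function name, helpers such as the affinity and mutation steps, and all hyperparameters are hypothetical stand-ins.

```python
# Illustrative sketch of an immune-inspired prediction loop under the
# assumptions stated above; not the RAILS codebase.
import numpy as np

def rails_like_predict(x, exemplars, labels, n_classes,
                       generations=5, pop_per_class=20,
                       mutation_scale=0.05, rng=None):
    """Return a predictive label distribution for input x.

    exemplars: (N, d) array of training feature vectors
    labels:    (N,) integer class labels in [0, n_classes)
    """
    rng = np.random.default_rng(rng)

    # Class-balanced initialization: draw the same number of exemplars per
    # class, so the starting label distribution is uniform (diversity).
    population, pop_labels = [], []
    for c in range(n_classes):
        idx = np.flatnonzero(labels == c)
        chosen = rng.choice(idx, size=min(pop_per_class, idx.size), replace=False)
        population.append(exemplars[chosen])
        pop_labels.append(labels[chosen])
    population = np.concatenate(population)
    pop_labels = np.concatenate(pop_labels)

    for _ in range(generations):
        # Affinity: negative distance to the query (higher = more similar).
        affinity = -np.linalg.norm(population - x, axis=1)

        # Selection: keep the top half of the population by affinity
        # (specificity pressure).
        keep = np.argsort(affinity)[population.shape[0] // 2:]
        population, pop_labels = population[keep], pop_labels[keep]

        # Proliferation with mutation: copy survivors and perturb them,
        # loosely emulating somatic hypermutation (diversity pressure).
        offspring = population + mutation_scale * rng.standard_normal(population.shape)
        population = np.concatenate([population, offspring])
        pop_labels = np.concatenate([pop_labels, pop_labels])

    # Predictive label distribution read off the evolved population.
    counts = np.bincount(pop_labels, minlength=n_classes)
    return counts / counts.sum()
```

In this toy version, the mutation scale and the number of generations play the role of the diversity-versus-specificity knob the abstract refers to: larger perturbations and fewer selection rounds keep the label distribution broader, while stronger selection concentrates mass on the best-matching class.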