Baseline Pruning-Based Approach to Trojan Detection in Neural Networks

Author(s)

Peter Bajcsy, Michael Paul Majurski

Abstract

This paper addresses the problem of detecting trojans in neural networks (NNs) by analyzing how NN accuracy responds to systematic pruning. The study uses the NN models generated for the TrojAI challenges. Our pruning-based approach (1) detects any deviations from the reference NN models, (2) measures the accuracy of a set of systematically pruned NN models using multiple pruning configurations, and (3) classifies each NN model as clean or poisoned by learning a mapping between accuracy measurements and reference clean or poisoned NN model labels. This work outlines a theoretical and experimental framework for finding the optimal mapping over a large search space of pruning parameters. In our experiments on the TrojAI Challenge datasets from Rounds 1 through 4, the approach achieves average classification accuracy between 68.51 % and 91.06 %. Reference model graphs and source code are available from GitHub.
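Steps (2) and (3) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy linear model, the L1-norm row pruning, the nearest-centroid classifier, and every function name below are assumptions chosen only to keep the example self-contained. The idea shown is that each model is reduced to a vector of accuracies measured at several pruning fractions, and a mapping from such vectors to clean or poisoned labels is then learned from reference models.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def prune_rows(weights: Matrix, fraction: float) -> Matrix:
    """Zero out the given fraction of rows, smallest L1 norm first (assumed pruning rule)."""
    n_prune = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: sum(abs(w) for w in weights[i]))
    pruned = set(order[:n_prune])
    return [[0.0] * len(row) if i in pruned else list(row)
            for i, row in enumerate(weights)]


def accuracy(weights: Matrix, samples: Sequence[Sequence[float]],
             labels: Sequence[int]) -> float:
    """Accuracy of a toy linear classifier: predicted class = row with the highest score."""
    correct = 0
    for x, y in zip(samples, labels):
        scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
        pred = max(range(len(scores)), key=lambda i: scores[i])  # ties go to the lowest index
        correct += int(pred == y)
    return correct / len(labels)


def accuracy_profile(weights: Matrix, samples, labels,
                     fractions: Sequence[float]) -> List[float]:
    """Step (2): one accuracy measurement per pruning configuration."""
    return [accuracy(prune_rows(weights, f), samples, labels) for f in fractions]


def fit_centroids(profiles: Sequence[Sequence[float]],
                  model_labels: Sequence[str]) -> Dict[str, List[float]]:
    """Step (3), simplified: average the accuracy profiles of each label."""
    grouped: Dict[str, List[Sequence[float]]] = {}
    for profile, label in zip(profiles, model_labels):
        grouped.setdefault(label, []).append(profile)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def classify(profile: Sequence[float], centroids: Dict[str, List[float]]) -> str:
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda label: sum(
        (p - c) ** 2 for p, c in zip(profile, centroids[label])))


if __name__ == "__main__":
    # Toy two-class model and data; all numbers are made up for illustration.
    toy_weights = [[2.0, 0.0], [0.0, 1.0]]
    toy_samples = [[1.0, 0.0], [0.0, 1.0]]
    toy_labels = [0, 1]
    print(accuracy_profile(toy_weights, toy_samples, toy_labels, [0.0, 0.5, 1.0]))

    # Synthetic reference profiles; they make no claim about how real models behave.
    centroids = fit_centroids(
        [[1.0, 0.9], [1.0, 0.7], [1.0, 0.3], [1.0, 0.1]],
        ["clean", "clean", "poisoned", "poisoned"],
    )
    print(classify([1.0, 0.75], centroids))
```

The nearest-centroid rule stands in for whatever mapping is actually learned. The abstract states only that a mapping between accuracy measurements and model labels is found over a large search space of pruning parameters.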
Proceedings Title
Proceedings of the International Conference on Learning Representations (ICLR) 2021, Security and Safety in Machine Learning Systems Workshop
Conference Dates
May 3-7, 2021
Conference Location
virtual, MD, US
Conference Title
Security and Safety in Machine Learning Systems Workshop

Keywords

artificial intelligence, trojan attacks, AI model pruning

Citation

Bajcsy, P. and Majurski, M. (2021), Baseline Pruning-Based Approach to Trojan Detection in Neural Networks, Proceedings of the International Conference on Learning Representations (ICLR) 2021, Security and Safety in Machine Learning Systems Workshop, virtual, MD, US (Accessed January 28, 2025)

Created May 7, 2021, Updated January 6, 2023