The top proof features of DNNs trained with different methods rely on different input features. In this section, we interpret proof features obtained with SuPFEx and use these interpretations to qualitatively check whether the dissimilarities are also evident in the invariants captured by the different proofs of the same robustness property on standard and robustly trained networks. We also study the effect of certified robust training methods like CROWN-IBP (Zhang et al., 2020), empirically robust training methods like PGD (Madry et al., 2018), and training methods that combine adversarial and certified training like COLT (Balunovic & Vechev, 2020) on the proof features. For a local input region φ, we say that a robustness proof is semantically meaningful if it focuses on the relevant features of the output class for images contained inside φ and not on spurious features. In the case of MNIST or CIFAR-10 images, spurious features are the pixels that form part of the background of the image, whereas important features are the pixels that belong to the actual object being identified by the network. The gradient map of the extracted proof features w.r.t. the input region φ gives us an idea of the input pixels that the network focuses on. We obtain the gradient maps by computing the mean gradient over 100 samples drawn uniformly from φ, as described in Section 4.3. As in (Tsipras et al., 2019), to avoid introducing any inherent bias into the proof feature visualization, no preprocessing (other than scaling and clipping for visualization) is applied to the gradients obtained for each individual sample. In Fig. 2, we compare the gradient maps corresponding to the top proof feature (the one having the highest priority P_ub(F_ni)) on networks from Table 1 on representative images of different output classes in the MNIST and CIFAR-10 test sets.
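A minimal sketch of this gradient-map computation is given below. It assumes a PyTorch model that exposes its penultimate-layer activations through a hypothetical `penultimate` method (the proof feature corresponds to one neuron in that layer); `feature_index` and `epsilon` (the L∞ radius of φ) are illustrative parameters, not names from the paper.

```python
import torch

def proof_feature_gradient_map(model, x, feature_index, epsilon, n_samples=100):
    """Mean input gradient of one proof feature over samples drawn uniformly
    from the L-infinity region phi = [x - epsilon, x + epsilon], clipped to [0, 1]."""
    grads = []
    for _ in range(n_samples):
        # Draw one sample uniformly from the perturbation region phi.
        noise = (2 * torch.rand_like(x) - 1) * epsilon
        sample = (x + noise).clamp(0.0, 1.0).requires_grad_(True)
        # 'penultimate' is an assumed accessor for the penultimate-layer
        # activations; the proof feature is one neuron of that layer.
        feature_value = model.penultimate(sample)[..., feature_index].sum()
        grad = torch.autograd.grad(feature_value, sample)[0]
        grads.append(grad.detach())
    # Average the per-sample gradients to obtain the gradient map.
    return torch.stack(grads).mean(dim=0)
```

Averaging over many samples from φ, rather than taking the gradient at a single point, makes the map reflect the whole input region covered by the proof.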

Source publication (preprint)
In recent years, numerous methods have been developed to formally verify the robustness of deep neural networks (DNNs). Though the proposed techniques are effective in providing mathematical guarantees about the DNNs' behavior, it is not clear whether the proofs generated by these methods are human-interpretable. In this paper, we bridge this gap by...

Contexts in source publication

Context 1
... Fig. 2, we compare the gradient maps corresponding to the top proof feature (the one having the highest priority P_ub(F_ni)) on networks from Table 1 on representative images of different output classes in the MNIST and CIFAR-10 test sets. The experiments lead us to interesting observations - even if some property is verified for both the ...
Context 2
... Gradient maps generated on (a) MNIST networks and (b) CIFAR-10 networks (Figure 4): additional plots for the top proof feature visualization (in addition to Fig. 2), showing the gradient map of the top proof feature (having the highest priority) generated for networks trained with different training methods. It is evident that the top proof feature corresponding to the standard network highlights both relevant and spurious input features. In contrast, the top proof feature of the provably robust ...
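For completeness, here is a minimal sketch of the kind of per-map scaling and clipping mentioned above for visualization; the three-standard-deviation window and the [0, 1] rescaling are illustrative assumptions, not the exact constants used in the paper.

```python
import torch

def scale_and_clip(grad_map: torch.Tensor, num_std: float = 3.0) -> torch.Tensor:
    """Clip a gradient map to +/- num_std standard deviations around its mean,
    then rescale to [0, 1] for display. num_std = 3 is an illustrative choice."""
    mean, std = grad_map.mean().item(), grad_map.std().item()
    clipped = grad_map.clamp(mean - num_std * std, mean + num_std * std)
    # Rescale to [0, 1]; the small constant guards against a constant map.
    return (clipped - clipped.min()) / (clipped.max() - clipped.min() + 1e-12)
```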
