GAN Dissection: Visualizing and Understanding Generative Adversarial Networks

David Bau1,2, Jun-Yan Zhu1, Hendrik Strobelt2,3, Bolei Zhou4, Joshua B. Tenenbaum1, William T. Freeman1, Antonio Torralba1,2
1MIT CSAIL, 2MIT-IBM Watson AI Lab, 3IBM Research, 4The Chinese University of Hong Kong

Historical note: this paper's "GAN Paint" was one of the first image editing tools with GANs, showing for the first time that a user can directly manipulate neuron activations to paint visual concepts on an image.

In Proceedings of the National Academy of Sciences, Sep 2020, we update the methods and unify analysis with classifiers.

An animation showing interactions with synthesized images of buildings: scribbling on areas to add trees, add a dome to the top of a building, and remove grass from the ground.

The #GANpaint app works by directly activating and deactivating sets of neurons in a deep network trained to generate images. Each button on the left ("door", "brick", etc.) corresponds to a set of 20 neurons. The app demonstrates that, by learning to draw, the network also learns about objects such as trees, doors, and rooftops. By switching neurons on and off directly, you can observe the structure of the visual world that the network has learned to model. (Try it here.)

Why Painting with a GAN is Interesting

A computer could draw a scene in two ways:

  1. It could compose the scene out of objects it knows.
  2. Or it could memorize an image and replay one just like it.

In recent years, innovative Generative Adversarial Networks (GANs, I. Goodfellow, et al, 2014) have demonstrated a remarkable ability to create nearly photorealistic images. However, it has been unknown whether these networks learn composition or if they operate purely through memorization of pixel patterns.

Our GAN Paint demo and our GAN Dissection method provide evidence that the networks have learned some aspects of composition.

Unit 365 of layer4 of a church Progressive GAN (T. Karras, et al, 2018) draws trees.
Unit 43 draws domes.
Unit 14 draws grass.
Unit 276 draws towers.

What is GAN Dissection?

Our paper describes a framework for visualizing and understanding the structure learned by a generative network. GAN dissection allows us to ask:

  1. Does the network learn internal neurons that match meaningful concepts?
  2. Do these sets of neurons merely correlate with objects, or does the GAN use those neurons to reason about objects?
  3. Can causal neurons be manipulated to improve the output of a GAN?

Dissection uses a segmentation network (T. Xiao, et al, 2018) along with a dissection method (D. Bau, et al, 2017) to find individual units of the generator that match meaningful object classes, like trees.
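
As a rough illustration of how a unit can be matched to an object class, the sketch below thresholds one unit's activation map and measures its intersection-over-union (IoU) with segmentation masks for several classes. It is a simplified, single-image sketch and not the paper's implementation: the function names, the toy 3x3 maps, and the fixed threshold of 0.5 are all assumptions made for illustration, and plain Python lists stand in for upsampled feature maps.

```python
def unit_concept_iou(activation, concept_mask, threshold):
    """IoU between the pixels where a unit fires above `threshold`
    and the pixels a segmenter labeled with the concept."""
    intersection = 0
    union = 0
    for act_row, mask_row in zip(activation, concept_mask):
        for act, labeled in zip(act_row, mask_row):
            fires = act > threshold
            labeled = bool(labeled)
            intersection += int(fires and labeled)
            union += int(fires or labeled)
    return intersection / union if union else 0.0


def best_matching_concept(activation, concept_masks, threshold):
    """Return (concept name, IoU) for the concept with the highest IoU."""
    scores = {
        name: unit_concept_iou(activation, mask, threshold)
        for name, mask in concept_masks.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data (hypothetical): one unit's activation and two segmentation masks.
activation = [
    [0.1, 0.9, 0.8],
    [0.0, 0.7, 0.2],
    [0.0, 0.1, 0.0],
]
concept_masks = {
    "tree": [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    "grass": [[0, 0, 0], [0, 0, 0], [1, 1, 1]],
}

concept, iou = best_matching_concept(activation, concept_masks, threshold=0.5)
print(concept, iou)  # tree 0.75
```

In this toy case the unit fires on three of the four tree pixels and nowhere else, so it would be labeled a "tree" unit. A real analysis would aggregate the agreement over many generated images rather than a single one.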

The neurons that a GAN learns depend on the type of scene it learns to draw:
A business suit neuron appears when learning conference rooms, and a stove neuron appears when drawing kitchens.

What Does Each Neuron Control?

To verify that sets of neurons control the drawing of objects rather than merely correlating, we intervene in the network and activate and deactivate neurons directly.
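
The intervention itself can be pictured as overwriting the activations of a chosen set of units at chosen locations of an intermediate feature map, then letting the rest of the generator run as usual. The sketch below shows only the overwrite step on a toy feature map stored as nested lists indexed `[unit][row][column]`. The function name `intervene`, the unit indices, and the constant 10.0 used to force units on are hypothetical, and the generator is omitted.

```python
import copy


def intervene(features, units, region, value):
    """Return a copy of `features` in which each unit in `units` is forced
    to `value` at every (row, col) location in `region`."""
    edited = copy.deepcopy(features)
    for unit in units:
        for row, col in region:
            edited[unit][row][col] = value
    return edited


# Toy feature map (hypothetical): 3 units, each a 2x2 activation grid.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
door_units = [1]            # assumed indices of units that match "door"
region = [(0, 0), (0, 1)]   # locations selected by the user, e.g. a brush stroke

ablated = intervene(features, door_units, region, 0.0)    # switch the units off
inserted = intervene(features, door_units, region, 10.0)  # switch the units on

print(ablated[1])   # [[0.0, 0.0], [2.0, 2.0]]
print(inserted[1])  # [[10.0, 10.0], [2.0, 2.0]]
```

Comparing the images generated from `features`, `ablated`, and `inserted` is what separates units that cause an object to appear from units that merely correlate with it.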

One surprising finding is that the same neurons control a specific object class in a variety of contexts, even if the final appearance of the object varies widely. The same neurons can switch on the concept of a "door" even if a big stone wall requires a big heavy door facing to the left, or a little hut requires a small curtain door facing to the right.

The network also understands when it can and cannot compose objects. For example, turning on neurons for a door in the proper location of a building will add a door. But doing the same in the sky or on a tree will typically have no effect. This structure can be quantified.

Above: yellow boxes highlight the locations of neurons that can be switched on to add a door. One way to make a big door as in (d) is to emphasize a smaller door, but there are many places where the GAN will refuse to draw a door. Below: this is why GAN Paint is not like an ordinary paint program. It does not always do what you want, because it wants objects to go in the right place.

Can GAN Mistakes be Debugged and Fixed?

One reason it is important to understand the internal concepts of a network is that the insights can help us improve the network's behavior.

For example, a GAN will sometimes generate terribly unrealistic images, and the cause of these mistakes has been previously unknown. We have identified that these mistakes can be triggered by specific sets of neurons that cause the visual artifacts.

By identifying and silencing those neurons, we can improve the quality of the output of a GAN.
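
A minimal sketch of the silencing step, assuming each unit has already been given an artifact score by some means (for example, by inspecting what each unit draws). The scores, the helper names, and the choice of k = 2 below are hypothetical. The sketch picks the k highest-scoring units and zeroes their activations everywhere in a toy feature map indexed `[unit][row][column]`.

```python
def top_k_units(scores, k):
    """Indices of the k units with the highest scores, highest first."""
    ranked = sorted(range(len(scores)), key=lambda unit: scores[unit], reverse=True)
    return ranked[:k]


def silence_units(features, unit_ids):
    """Return a copy of `features` with the given units zeroed at every location."""
    silenced = set(unit_ids)
    return [
        [[0.0] * len(row) for row in channel] if unit in silenced
        else [list(row) for row in channel]
        for unit, channel in enumerate(features)
    ]


# Hypothetical artifact scores for 4 units, and a toy 1x2 feature map per unit.
artifact_scores = [0.1, 0.9, 0.3, 0.8]
features = [
    [[1.0, 2.0]],
    [[3.0, 4.0]],
    [[5.0, 6.0]],
    [[7.0, 8.0]],
]

artifact_units = top_k_units(artifact_scores, k=2)
cleaned = silence_units(features, artifact_units)

print(artifact_units)  # [1, 3]
print(cleaned)         # [[[1.0, 2.0]], [[0.0, 0.0]], [[5.0, 6.0]], [[0.0, 0.0]]]
```

The remaining layers of the generator would then be run on `cleaned` in place of the original features.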

Acknowledgments: We would like to thank Zhoutong Zhang, Guha Balakrishnan, Didac Suris, Adrià Recasens and Zhuang Liu for valuable discussions. We are grateful for the support of the MIT-IBM Watson AI Lab, the DARPA XAI program FA8750-18-C000, NSF 1524817 on Advancing Visual Recognition with Feature Visualizations, NSF BIGDATA 1447476, and a hardware donation from NVIDIA.