Relativistic GAN


I just released my latest paper, The relativistic discriminator: a key element missing from standard GAN (code to implement relativistic GANs). In December 2018, it was accepted for publication at ICLR 2019 😸! Although filled with silly pictures, this work is no joke, and I believe it is an important step forward for GANs. In this paper, I argue that standard GAN (SGAN) is missing a fundamental property: training the generator should not only increase the probability that fake data is real but also decrease the probability that real data is real. This property is fundamental and should have been in the very first GAN.

Giving this property to the discriminator makes SGAN relativistic. As I show in the paper, the discriminator of any GAN loss can be made relativistic. Interestingly, IPM-based GANs (WGAN, WGAN-GP, etc.) already have a relativistic discriminator! This explains in part why these approaches are generally much more stable than standard GAN.

Having a relativistic discriminator can make any GAN very stable! I was able to train relativistic SGAN and Least squares GAN (LSGAN) on a small sample of N = 2011 pictures at 256×256 resolution, something that neither SGAN nor LSGAN can do (they get stuck generating noise) and that SGAN with spectral normalization and WGAN-GP do poorly. Relativism improves not only stability but also overall quality. With relativistic SGAN and LSGAN, only one discriminator update per generator update is needed to reach the state of the art, so you can get equivalent or better results than WGAN-GP in a fraction of the time.
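To make the "one discriminator update per generator update" regime concrete, here is a minimal PyTorch sketch of a single RaSGAN training step (my own illustration, not the paper's code; `netD` is assumed to output the raw, non-sigmoid critic value C(x), and `netG`, the optimizers, and the batch of real data are hypothetical placeholders):

```python
import torch
import torch.nn as nn

# Relativistic average SGAN training step: exactly one discriminator
# update followed by one generator update.
bce = nn.BCEWithLogitsLoss()

def train_step(netD, netG, optD, optG, real, z_dim=128):
    ones = torch.ones(real.size(0), 1)
    zeros = torch.zeros(real.size(0), 1)

    # --- Discriminator update: real data should look more realistic,
    # on average, than fake data. ---
    fake = netG(torch.randn(real.size(0), z_dim)).detach()
    c_real, c_fake = netD(real), netD(fake)
    lossD = (bce(c_real - c_fake.mean(), ones)
             + bce(c_fake - c_real.mean(), zeros)) / 2

    optD.zero_grad()
    lossD.backward()
    optD.step()

    # --- Generator update: labels are swapped, so the generator also
    # pushes real data towards "fake" (the property standard GAN lacks). ---
    fake = netG(torch.randn(real.size(0), z_dim))
    c_real, c_fake = netD(real), netD(fake)
    lossG = (bce(c_real - c_fake.mean(), zeros)
             + bce(c_fake - c_real.mean(), ones)) / 2

    optG.zero_grad()
    lossG.backward()
    optG.step()
    return lossD.item(), lossG.item()
```

Note that, unlike in standard GAN, the generator loss involves the critic output on real data, which is exactly how training the generator can decrease the probability that real data is real.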

This shows the difference between divergence minimization, SGAN training, and relativistic SGAN training:


There are two variants of the approach and they are very easy to apply:

Standard GAN (SGAN) discriminator: `D(x) = sigmoid(C(x))`

Relativistic standard GAN (RSGAN) discriminator: `D(x_r, x_f) = sigmoid(C(x_r) − C(x_f))`

Relativistic average standard GAN (RaSGAN) discriminator: `D(x_r) = sigmoid(C(x_r) − E[C(x_f)])` and `D(x_f) = sigmoid(C(x_f) − E[C(x_r)])`

These formulations assume a sigmoid activation (hence standard GAN), but relativism can be used with any activation function and thus with any GAN loss, and it works amazingly well across the board.

See my results when generating pictures of cats below:

64×64 cats with Relativistic average LSGAN (FID = 11.97)


128×128 cats with Relativistic average LSGAN (FID = 15.85)


256×256 cats with SGAN (5k iterations)


256×256 cats with LSGAN (5k iterations)


256×256 cats with Relativistic average SGAN (FID = 32.11)


256×256 cats with Relativistic average LSGAN (FID = 35.21)


256×256 cats with SGAN and spectral normalization (FID = 54.73)


256×256 cats with WGAN-GP (FID > 100)