GANs beyond divergence minimization

I just released my paper, "GANs beyond divergence minimization", in which I go over the details of how generative adversarial networks (GANs) work and show that GANs do not actually directly minimize a divergence. This is true for both saturating and non-saturating GANs. Therefore, saying that the generator minimizes the divergence is technically incorrect.
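To make the saturating versus non-saturating distinction concrete, here is a minimal sketch in plain Python. It assumes the standard GAN setup, where the discriminator outputs D = sigmoid(logit) for a generated sample. The function names are my own for illustration and are not taken from the paper's code.

```python
import math


def sigmoid(a):
    """Logistic function; D = sigmoid(logit) is the discriminator's output."""
    return 1.0 / (1.0 + math.exp(-a))


def saturating_loss(logit):
    """Saturating generator loss on one fake sample: log(1 - D(G(z)))."""
    return math.log(1.0 - sigmoid(logit))


def non_saturating_loss(logit):
    """Non-saturating generator loss on one fake sample: -log(D(G(z)))."""
    return -math.log(sigmoid(logit))


def saturating_grad(logit):
    """Derivative of log(1 - sigmoid(a)) with respect to a, which is -sigmoid(a)."""
    return -sigmoid(logit)


def non_saturating_grad(logit):
    """Derivative of -log(sigmoid(a)) with respect to a, which is -(1 - sigmoid(a))."""
    return -(1.0 - sigmoid(logit))


# Compare both losses across a few discriminator logits (assumed values).
for logit in (-5.0, 0.0, 5.0):
    print(
        f"logit={logit:+.1f}  "
        f"sat_loss={saturating_loss(logit):+.4f}  sat_grad={saturating_grad(logit):+.4f}  "
        f"ns_loss={non_saturating_loss(logit):+.4f}  ns_grad={non_saturating_grad(logit):+.4f}"
    )
```

Both gradients are negative everywhere, so both losses push the discriminator's logit on fake samples upward. They differ only in magnitude: when the discriminator confidently rejects a fake sample (very negative logit), the saturating gradient vanishes while the non-saturating one stays close to -1.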

The paper provides a very nice and mathematically detailed introduction to GANs and how they actually work.


This means that the divergence estimated by the discriminator may just act as a proxy for something meaningful that the generator should emulate. In the end, the generator can learn to generate realistic data in many different ways, as long as the discriminator measures something meaningful and the generator tries to emulate something proportionally related to that measure.
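As a concrete instance, consider the standard GAN. The notation below follows the original GAN formulation and is an assumption on my part for this post, not copied from the paper: p_data is the data distribution, p_g the generator's distribution, and p_z the noise prior.

```latex
% Discriminator objective (standard GAN)
V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
        + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]

% At the optimal discriminator, the objective estimates a divergence
\max_D V(D, G) = 2\,\mathrm{JSD}\left(p_{\mathrm{data}} \,\|\, p_g\right) - \log 4

% Saturating generator loss
\min_G \; \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]

% Non-saturating generator loss
\min_G \; -\mathbb{E}_{z \sim p_z}\left[\log D(G(z))\right]
```

The divergence appears only at the maximum over D, whereas in practice the generator takes gradient steps against whatever discriminator it currently has. The non-saturating loss is not even the same expression as the second term of V, yet it is the variant commonly used in practice.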

I wrote this paper a month before releasing Relativistic GANs, so I did not consider them in the paper. It was submitted to NIPS 2018. To prevent any possible bias in the review process, I did not release the paper until the end of the reviews. It was ultimately rejected by NIPS 2018 due to a lack of extensive experiments (I only looked at CIFAR-10 and a toy example). Note that running many analyses is very time-consuming for me, since I run everything on my own computer with a single GPU. I may run more experiments in the future, but for now I have other projects I would rather focus on. However, I added the alternative generator loss functions here so you can try them if you want.