Generative Adversarial Networks in Detail

Understand the basic concepts behind GANs and their training process.


Another way to explain GANs is through the probabilistic formulation we used for variational autoencoders.

GANs follow a different approach to finding the probability distribution of the data, $p_{data}(x)$.

Instead of computing the exact or approximate $p_{data}(x)$, we only care about the ability to sample data from the distribution.

But what does that actually mean?

If we assume that our data points $x_i$ follow a probability distribution $p_{data}(x)$, we will want to build a model that allows us to draw samples from $p_{data}(x)$.

As we did with VAEs, we again introduce a latent variable $z$ with a prior distribution $p(z)$. $p(z)$ is usually a simple distribution such as a uniform or a Gaussian (normal) distribution.

We then sample $z$ from $p(z)$ and pass the sample to the generator network $G$, which will output a data sample $x$ with $x = G(z)$.

$x$ can be thought of as a sample from a third distribution, the generator’s distribution $p_G$. The generator will be trained to convert random $z$ into fake data $x$ or, in other words, to force $p_G$ to be as close as possible to $p_{data}(x)$.
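To make the sampling path concrete, here is a minimal sketch, assuming PyTorch (the layer sizes, the standard Gaussian prior, and the `tanh` output are illustrative choices, not a prescribed architecture), of drawing $z$ from $p(z)$ and mapping it through a generator:

```python
import torch
import torch.nn as nn

# A minimal generator: it maps a latent vector z, drawn from the prior p(z),
# to a fake data sample x = G(z). All sizes are illustrative placeholders.
class Generator(nn.Module):
    def __init__(self, latent_dim=100, hidden_dim=128, data_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, data_dim),
            nn.Tanh(),  # squash outputs to [-1, 1], matching normalized data
        )

    def forward(self, z):
        return self.net(z)


G = Generator()
z = torch.randn(16, 100)  # 16 latent vectors sampled from a standard Gaussian p(z)
x_fake = G(z)             # samples from the generator's distribution p_G
```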

This is where the discriminator network D comes into play. The discriminator is simply a classifier that outputs a single probability, where 0 corresponds to a fake generated $x$ and 1 to a real sample from our data distribution.
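A matching sketch of the discriminator, under the same assumptions (placeholder sizes, flattened input data), is just a binary classifier that ends in a sigmoid:

```python
import torch.nn as nn

# A minimal discriminator: a binary classifier that outputs a single probability,
# where values near 1 mean "real" and values near 0 mean "fake generated".
class Discriminator(nn.Module):
    def __init__(self, data_dim=784, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),  # single probability in (0, 1)
        )

    def forward(self, x):
        return self.net(x)
```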

These two networks are trained jointly through a minimax game. Let’s take a closer look.
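For reference, the minimax objective from the original GAN formulation can be written as:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

The discriminator tries to maximize this value by classifying real and fake samples correctly, while the generator tries to minimize it by making $D(G(z))$ approach 1.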

Training

One key insight is the indirect training: the generator is not trained to minimize the distance to a specific target image, but simply to fool the discriminator!

The loss used in this training scheme is called the adversarial loss.

The adversarial loss enables the model to learn in an unsupervised manner.
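In practice, the minimax objective is usually split into two separate losses, one per network. A common formulation (the generator term below uses the widely used non-saturating form rather than the original minimax term) is:

$$\mathcal{L}_D = -\,\mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] - \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

$$\mathcal{L}_G = -\,\mathbb{E}_{z \sim p(z)}\big[\log D(G(z))\big]$$

Both can be implemented as binary cross-entropy, which is exactly what the label convention below amounts to.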

When we train D, real images are labeled 1 and fake generated images are labeled 0. On the other hand, when training the generator, the ground-truth label for its fake images is 1 (as if they were real), even though the examples are fake.

This happens because our objective is just to fool D. The image below illustrates this process:
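To tie everything together, here is a minimal single-training-step sketch, assuming PyTorch and the illustrative Generator and Discriminator classes sketched earlier (the optimizer choice, learning rate, and latent size are placeholder assumptions, not a prescribed recipe):

```python
import torch
import torch.nn as nn

# A minimal training step that makes the label convention explicit.
G, D = Generator(), Discriminator()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(x_real):
    batch = x_real.size(0)
    real_labels = torch.ones(batch, 1)   # real images are labeled 1
    fake_labels = torch.zeros(batch, 1)  # fake generated images are labeled 0

    # Train D: push D(x_real) toward 1 and D(x_fake) toward 0.
    z = torch.randn(batch, 100)
    x_fake = G(z).detach()               # detach so this step doesn't update G
    loss_D = bce(D(x_real), real_labels) + bce(D(x_fake), fake_labels)
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Train G: the "ground truth" for its fakes is 1, because the goal is to fool D.
    z = torch.randn(batch, 100)
    loss_G = bce(D(G(z)), real_labels)
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```

The key line is `bce(D(G(z)), real_labels)`: the generator's targets are the "real" labels precisely because its only objective is to fool D.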
