I tried running GAN in Colaboratory

Introduction

I finally tried running a GAN, which I had been curious about for a long time, in Colaboratory, the Jupyter Notebook environment I had also been curious about for a long time.

Regarding Colaboratory, the article "[Use a free GPU in seconds] Deep Learning Practice Tips on Colaboratory" was helpful.

For GAN, I took a quick look at the paper Generative Adversarial Networks. A GAN is a method for approximating the probability distribution of the data at hand (treated as a uniform distribution over the samples). When the two networks G and D are trained well, the probability distribution of the data generated by G comes to match the probability distribution of the data at hand. Training does not always go well, though, and ways to make it go well are still being actively researched.
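For reference, in the paper's notation G and D play the following minimax game, and at the optimum the distribution of G's samples matches the data distribution:

\min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{\textrm{data}}(x)}\bigl[\log D(x)\bigr] + \mathbb{E}_{z\sim p_z(z)}\bigl[\log\bigl(1-D(G(z))\bigr)\bigr]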

Run

For the GAN code, I referred to here. It is a simple Keras implementation of a GAN, and I learned a lot from it.

We define two MLPs (this is G and D), give the output of G to D, and let Adam learn alternately. D learns to distinguish between "data at hand" and "output of G". G trains by manipulating the teacher data so that the discrimination result of D becomes "data at hand". At this time, D is not trained. The training data is MNIST.
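Something like the following is what I mean (a minimal sketch of this alternating scheme, not the referenced code itself; the layer sizes, learning rate, and batch size here are my own placeholder choices):

import numpy as np
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Input, Dense, LeakyReLU
from keras.optimizers import Adam

latent_dim = 100

# D: classifies flattened 28x28 images as real (1) or generated (0)
D = Sequential([
    Dense(512, input_dim=784),
    LeakyReLU(alpha=0.2),
    Dense(1, activation='sigmoid'),
])
D.compile(loss='binary_crossentropy', optimizer=Adam(0.0002),
          metrics=['accuracy'])

# G: maps latent noise to a flattened image
G = Sequential([
    Dense(512, input_dim=latent_dim),
    LeakyReLU(alpha=0.2),
    Dense(784, activation='tanh'),
])

# Combined model used to train G; D is frozen inside it.
# (D itself was compiled before trainable=False was set, so
# D.train_on_batch still updates D.)
D.trainable = False
z = Input(shape=(latent_dim,))
combined = Model(z, D(G(z)))
combined.compile(loss='binary_crossentropy', optimizer=Adam(0.0002))

(X, _), _ = mnist.load_data()
X = (X.reshape(-1, 784).astype('float32') - 127.5) / 127.5  # scale to [-1, 1]

batch = 32
real = np.ones((batch, 1))
fake = np.zeros((batch, 1))

for epoch in range(30000):
    # --- train D on a real batch and a generated batch ---
    imgs = X[np.random.randint(0, X.shape[0], batch)]
    noise = np.random.normal(0, 1, (batch, latent_dim))
    gen = G.predict(noise)
    d_loss_real = D.train_on_batch(imgs, real)   # returns [loss, acc]
    d_loss_fake = D.train_on_batch(gen, fake)
    # --- train G with labels flipped to "real"; D's weights stay fixed ---
    g_loss = combined.train_on_batch(noise, real)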

If you notice a ReLU in the code that you are not familiar with, it is apparently called Leaky ReLU, which is often used these days. (Reference: "About the activation function ReLU and the ReLU family [additional information]") Unlike ReLU, even when the input x of the activation function is 0 or less, the value x * α is output. According to the wiki its effectiveness is not clear, but perhaps it mitigates vanishing gradients somewhat? I'm not sure.
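As a quick illustration of the function itself (α = 0.2 here is just an example value):

import numpy as np

def leaky_relu(x, alpha=0.2):
    # ReLU outputs max(x, 0); Leaky ReLU keeps a small slope alpha for x <= 0
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-2.0, -0.5, 0.0, 1.0])))  # [-0.4 -0.1  0.   1. ]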

The code ran without any problems, but since training runs train_on_batch inside its own loop instead of calling fit, no history is returned. I want to visualize loss and acc, so I added code to accumulate them in instance variables, plus code for visualization.

      # save the per-batch losses (the all_* lists start as empty lists in __init__)
      self.all_d_loss_real.append(d_loss_real)
      self.all_d_loss_fake.append(d_loss_fake)
      self.all_g_loss.append(g_loss)

      if epoch % sample_interval == 0:
        self.sample_images(epoch)
        np.save('d_loss_real.npy', self.all_d_loss_real)
        np.save('d_loss_fake.npy', self.all_d_loss_fake)
        np.save('g_loss.npy', self.all_g_loss)

real is D's loss on the data at hand, and fake is D's loss on the data generated by G. The following code downloads the files to the local machine.

from google.colab import files
import os

# download every sampled image, then the saved loss arrays
file_list = os.listdir("images")

for file in file_list:
    files.download("images" + os.sep + file)

files.download('d_loss_real.npy')
files.download('d_loss_fake.npy')
files.download('g_loss.npy')

The following code plots the losses and accuracy.

import numpy as np
import matplotlib.pyplot as plt


# each row is (loss, accuracy) as returned by train_on_batch
t1 = np.load('d_loss_real.npy')
t2 = np.reshape(np.load('d_loss_fake.npy'), [np.shape(t1)[0], 2])
g_loss = np.load('g_loss.npy')

# D's overall loss and accuracy are the averages over the real and fake batches
t = (t1 + t2) / 2
d_loss = t[:, 0]
acc = t[:, 1]
d_loss_real = t1[:, 0]
d_loss_fake = t2[:, 0]
acc_real = t1[:, 1]
acc_fake = t2[:, 1]


n_epoch = 29801

# raw curves
x = np.linspace(1, n_epoch, n_epoch)
plt.plot(x, acc, label='acc')
plt.plot(x, d_loss, label='d_loss')
plt.plot(x, g_loss, label='g_loss')
plt.plot(x, d_loss_real, label='d_loss_real')
plt.plot(x, d_loss_fake, label='d_loss_fake')
plt.legend()
plt.ylim([0, 2])
plt.grid()
plt.show()

# moving average (mode='same' zero-pads, so both ends of the curves are distorted)
num = 100  # window size of the moving average
b = np.ones(num) / num
acc2 = np.convolve(acc, b, mode='same')
d_loss2 = np.convolve(d_loss, b, mode='same')
d_loss_real2 = np.convolve(d_loss_real, b, mode='same')
d_loss_fake2 = np.convolve(d_loss_fake, b, mode='same')
g_loss2 = np.convolve(g_loss, b, mode='same')

x = np.linspace(1, n_epoch, n_epoch)
plt.plot(x, acc2, label='acc')
plt.plot(x, d_loss2, label='d_loss')
plt.plot(x, g_loss2, label='g_loss')
plt.plot(x, d_loss_real2, label='d_loss_real')
plt.plot(x, d_loss_fake2, label='d_loss_fake')
plt.legend()
plt.ylim([0, 1.2])
plt.grid()
plt.show()

Result

Images generated by G:

epoch=0 (0.png)
epoch=200 (200.png)
epoch=1000 (1000.png)
epoch=3000 (3000.png)
epoch=7000 (6600.png)
epoch=10000 (9800.png)
epoch=20000 (20000.png)
epoch=30000 (29800.png)

As the number of epochs increases, the generated images come to resemble MNIST, but there seems to be no noticeable change from around epoch 7000 onward.

Accuracy and loss (t.png)

Moving average of the above figure (n = 100, zero-padded at both ends) (t2.png)

From around epoch 7,000, acc is about 0.63, d_loss (both real and fake) is about 0.63, and g_loss is 1.02 to 1.08 and slowly increasing (d_loss and g_loss are binary cross-entropy). As before, real is D's loss on the data at hand, fake is D's loss on the data generated by G, and d_loss is their average.

The loss is defined by the following formula.

\textrm{loss} = -\frac{1}{N}\sum_{n=1}^{N}\bigl( y_n\log{p_n}+(1-y_n)\log{(1-p_n)}\bigr)

$N$ is the number of data points, $y_n$ is the label, and $p_n$ is the output value of D (a value in $(0, 1)$).

It looks confusing because of the $\log$, but what it computes is the $\log$ of the geometric-mean D output $\bigl(\prod_{n=1}^{N} p_n^{y_n}\bigr)^{\frac{1}{N}}$ (and its counterpart for the $1-p_n$ term); since the $\log$ of a value between 0 and 1 is negative and hard to read, a minus sign is attached to make it positive.

\begin{align}
\textrm{loss} &= -\frac{1}{N}\sum_{n=1}^{N}\bigl( y_n\log{p_n}+(1-y_n)\log{(1-p_n)}\bigr) \\
&= -\log{\bigl( \prod_{n=1}^{N}p_n^{y_n}\bigr)^{\frac{1}{N}}} -\log{\bigl( \prod_{n=1}^{N}(1-p_n)^{1-y_n}\bigr)^{\frac{1}{N}}}
\end{align}

◯ Loss around epoch 25000

             loss   Average D output
g_loss       1.06   0.35
d_loss       0.63   0.53
d_loss_real  0.63   0.53
d_loss_fake  0.63   0.47
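As a rough numerical check, these loss values are consistent with the formula above if the "Average D output" column is read as the geometric mean $\bigl(\prod_{n} p_n\bigr)^{\frac{1}{N}}$ (a simplification on my part):

import numpy as np

# with all labels equal to 1, the loss reduces to -log(average D output)
print(-np.log(0.35))      # ~1.05 -> g_loss       1.06
print(-np.log(0.53))      # ~0.63 -> d_loss_real  0.63
# with all labels equal to 0, it is -log(1 - average D output)
print(-np.log(1 - 0.47))  # ~0.63 -> d_loss_fake  0.63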

Consideration

The closer the output is to the label, the smaller the loss. A GAN does not aim to minimize these losses, so it is not a problem that they do not decrease.

If training goes well and the data at hand becomes completely indistinguishable from the data generated by G (this state is the goal of a GAN), acc should be 0.5, but judging from the results, that is not the case.

Looking at the images generated by G, some are clearly not handwritten digits, which is probably why acc stays high. Tuning the parameters might improve things a little, but since squeezing out performance is not the goal here, I will stop at this point for now.

The lower g_loss is, the more D judges the images generated by G to be real, that is, the more D is deceived; conversely, the higher g_loss is, the less D is deceived. If the goal is an average D output of 0.5 for G's images, the corresponding g_loss is about 0.7 (since $-\log 0.5 \approx 0.69$), so I would like it to drop a little further.

I think it is just a coincidence that acc and d_loss happen to match.

From around epoch 7000, it bothers me that the decrease in d_loss_fake is smaller than the increase in g_loss; even in terms of average D output, the changes differ by roughly a factor of 10. Since the training order is D first, then G, perhaps that has a direct effect?

At the end

I feel like I managed to do what I set out to do. There was nothing I got particularly stuck on, but Colaboratory is not very stable: computations crash midway, the screen reloads, and for some reason an old version of the notebook is sometimes displayed. Without noticing, I saved over my work and had to rewrite the code in tears.

Pop-up screenshot (2.png)

Be careful if this pop-up appears from the bottom of the screen after a reload. If you look closely at the code, it is the unedited code from right after opening Colaboratory, and if you save at that point, your edited code gets overwritten.

As a countermeasure, I think you should reload the page. My browser is Safari, and when I reload the page with Ctrl-R the edited code is displayed again, and the variables from the last execution are also preserved. If this pop-up appears, I think it is safer not to rush to save over anything.

Against computation crashes, I think you have to make regular backups.
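For example, saving the network weights at regular intervals inside the training loop would limit the damage (a sketch assuming the G and D objects from the earlier sketch; the interval and filenames are arbitrary):

# checkpoint both networks every 1000 epochs so a crash only loses
# the work done since the last save
if epoch % 1000 == 0:
    G.save_weights('g_epoch_%d.h5' % epoch)
    D.save_weights('d_epoch_%d.h5' % epoch)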
