OS: Windows 10 Home
CPU: Intel Core i7-4790
GPU: GTX 1660 Super (any NVIDIA GPU will do)
Python: 3.6.10
Keras: 2.2.4
TensorFlow: 1.14.0
CUDA: 10.0
numpy: 1.16.4
sklearn: 0.22.2
Installation page: https://docs.microsoft.com/ja-jp/visualstudio/install/install-visual-studio?view=vs-2019 Install Visual Studio from this page; this article uses the 2019 version. Be sure to select the "Desktop development with C++" workload during installation.
https://www.nvidia.co.jp/Download/index.aspx?lang=jp Select your GPU at this URL and download the NVIDIA driver, then run the downloaded exe file to install it.
https://developer.nvidia.com/cuda-10.0-download-archive Download CUDA Toolkit 10.0 from this archive page and run the installer.
https://developer.nvidia.com/rdp/cudnn-download Download "cuDNN v7.6.5 for CUDA 10.0" from this site, then open "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0". The unzipped cuDNN archive contains three folders: bin, include, and lib. Copy the files from bin into the CUDA bin folder, the files from include into the include folder, and the files from lib into the lib folder.
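If you want to confirm that the copied libraries can actually be found, the following is a minimal sketch that tries to load them with ctypes. It assumes the default install path above is on PATH and the DLL names that ship with CUDA 10.0 and cuDNN 7.6 (cudart64_100.dll and cudnn64_7.dll).

import ctypes

# Try to load the CUDA runtime and cuDNN DLLs that tensorflow-gpu 1.14 expects.
for dll in ("cudart64_100.dll", "cudnn64_7.dll"):
    try:
        ctypes.WinDLL(dll)
        print(dll, "loaded OK")
    except OSError:
        print(dll, "not found - check the copied files and PATH")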
First, if you already have the regular (CPU-only) tensorflow installed, it must be uninstalled before installing the GPU version.
pip uninstall tensorflow
pip install numpy==1.16.4
pip install tensorflow-gpu==1.14.0
pip install keras==2.2.4
pip install scikit-learn==0.22.2
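If you want to confirm that the pinned versions were actually picked up, a quick check like the following (just a sketch, not required) will print them:

# Print the installed versions to confirm they match the environment listed above.
import tensorflow as tf
import keras
import numpy
import sklearn
print(tf.__version__, keras.__version__, numpy.__version__, sklearn.__version__)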
Next, install the packages that are handy for plotting graphs and processing data. These are optional, so you can skip them if you don't need them.
pip install matplotlib
pip install pandas
pip install pillow
pip install opencv-python
pip install opencv-contrib-python
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
A fairly long block of text will be printed. If one of the lines contains device_type: "GPU" and name: "/device:GPU:0", the GPU is recognized and you are good to go.
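As an alternative quick check, the TensorFlow 1.x API also offers a one-liner (a minimal sketch):

# Returns True if TensorFlow can see a usable CUDA GPU (TensorFlow 1.x API).
import tensorflow as tf
print(tf.test.is_gpu_available())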
Finally, verify the speed with MNIST as a test.
import time
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import RMSprop
(x_train, y_train), (x_test, y_test) = mnist.load_data()
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
start_time = time.time()  # start timing
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=128,
                    epochs=10,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('loss:', score[0])
print('accuracy:', score[1])
elapsed_time = time.time() - start_time  # stop timing
print("Training time:", round(elapsed_time, 3), "seconds")
Output result on the GPU version: 48.13 seconds
Output result on the CPU version: 530.26 seconds
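To reproduce the CPU-only number on the same machine, one option (my own sketch, not part of the measurement above) is to hide the GPU from TensorFlow before it is imported:

# Hide all CUDA devices so TensorFlow falls back to the CPU.
# This must run before tensorflow / keras is imported.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"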
The GPU was more than 10 times faster!!
I personally wanted to try deep learning, so I put a GPU into an old PC and set up my own AI machine. If I ever have the budget, I'd like to use a GPU with more memory, or install two of them. See you.