[PYTHON] Use TPU and Keras with Google Colaboratory

What is this?

This article shows how to use a TPU in Google Colaboratory. A GPU works as soon as you switch the runtime, but a TPU also needs a few additions to the code, so I am writing them down here as a memo.

Environment

Google Colaboratory, with TensorFlow 1.15.0 installed.

TPUtest.py


import tensorflow as tf
import distutils.version

print(distutils.version.LooseVersion(tf.__version__))
#>>1.15.0

Verification code

Classify MNIST with a CNN. According to Google, the TPU is not optimized for CNNs, so this may put the TPU at a slight disadvantage. I am not aiming for a rigorous performance evaluation, though, so that is acceptable.

Data preparation and processing

TPUtest.py


from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import numpy as np

# Download the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Scale pixel values to [0, 1] by dividing by 255
X_train = X_train/255
X_test = X_test/255

# Reshape the image data to (samples, 28, 28, 1) and cast to float32
X_train = X_train.reshape(-1,28,28,1).astype(np.float32)
X_test = X_test.reshape(-1,28,28,1).astype(np.float32)

# Convert the labels to one-hot vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
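
For readers unfamiliar with one-hot encoding, the following NumPy-only sketch mimics what to_categorical does to integer labels. The helper name one_hot is made up for illustration and is not part of Keras; it assumes the labels are non-negative integers.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Hypothetical stand-in for to_categorical, for illustration only."""
    labels = np.asarray(labels, dtype=np.int64)
    if num_classes is None:
        num_classes = int(labels.max()) + 1
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded)
```

Each row contains a single 1 at the position of the class label, which is the format that categorical_crossentropy expects.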

Model building and compiling

TPUtest.py


from tensorflow.keras.layers import Conv2D, Dense, ReLU, Flatten, Input, MaxPool2D, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model

def getModel():
  model = Sequential()
  model.add(Conv2D(3,3,input_shape=(28,28,1)))
  model.add(MaxPool2D(2))
  model.add(ReLU())
  model.add(Dropout(0.2))
  model.add(Flatten())
  model.add(Dense(1024))
  model.add(ReLU())
  model.add(Dense(10, activation="softmax"))
  return model

model = getModel()

# Draw the model
plot_model(model, show_shapes=True, show_layer_names=False)

# Compile
model.compile(Adam(), loss="categorical_crossentropy", metrics=["acc"])

model.png

It is a very ordinary model.

Training

TPUtest.py


%%time
model.fit(X_train, y_train, epochs=10, validation_data=(X_test,y_test))

Prediction

TPUtest.py


%%time
y_pred = model.predict(X_test)

Displaying the prediction results

TPUtest.py


from sklearn.metrics import accuracy_score
import numpy as np

# Convert the one-hot vectors back to class labels
y_pred = np.argmax(y_pred, axis=1)
y_test = np.argmax(y_test, axis=1)
print(accuracy_score(y_test, y_pred))
#>>0.9854
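
The argmax step above turns each row of class probabilities back into a single label. As a minimal sketch of what happens, here is the same computation on small made-up arrays (not real MNIST output), with the accuracy computed directly in NumPy instead of with accuracy_score.

```python
import numpy as np

# Made-up softmax outputs for 4 samples and 3 classes (not real MNIST results).
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.4, 0.3],
])
# Made-up one-hot ground truth for the same 4 samples.
truth = np.array([
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
])

pred_labels = np.argmax(probs, axis=1)  # index of the largest probability per row
true_labels = np.argmax(truth, axis=1)  # index of the 1 in each one-hot row
accuracy = float(np.mean(pred_labels == true_labels))
print(pred_labels, true_labels, accuracy)
```

Three of the four made-up predictions match, so the accuracy here is 0.75.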

Run this code on three different runtimes (CPU, GPU, and TPU) and compare.

Execution result

The results are as follows.

Runtime   Training time   Prediction time   Accuracy
CPU       37s/epoch       1.49s             0.9854
GPU       13s/epoch       0.54s             0.9859
TPU       37s/epoch       2.76s             0.9863

... Is the TPU not actually being used?

Make the TPU work

First check the device

TPUtest.py


import os
import tensorflow as tf
import pprint

if 'COLAB_TPU_ADDR' not in os.environ:
  print('ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!')
else:
  tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
  print ('TPU address is', tpu_address)

  with tf.Session(tpu_address) as session:
    devices = session.list_devices()
    
  print('TPU devices:')
  pprint.pprint(devices)

If a list of TPU devices is printed, everything is fine.
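
If you would rather check the list programmatically than by eye, something like the following sketch may help. The entries returned by session.list_devices() expose a name and a device_type attribute; the FakeDevice tuple, the count_tpu_cores helper, and the device names below are made up so that the example runs without a TPU.

```python
from collections import namedtuple

# Stand-in for the objects returned by session.list_devices(); real entries
# expose (among others) a `name` and a `device_type` attribute.
FakeDevice = namedtuple("FakeDevice", ["name", "device_type"])

def count_tpu_cores(devices):
    """Count entries whose device_type is exactly 'TPU' (hypothetical helper)."""
    return sum(1 for d in devices if d.device_type == "TPU")

# Made-up device list for illustration; not copied from a real session.
example = [
    FakeDevice("/job:tpu_worker/replica:0/task:0/device:CPU:0", "CPU"),
    FakeDevice("/job:tpu_worker/replica:0/task:0/device:TPU:0", "TPU"),
    FakeDevice("/job:tpu_worker/replica:0/task:0/device:TPU:1", "TPU"),
    FakeDevice("/job:tpu_worker/replica:0/task:0/device:TPU_SYSTEM:0", "TPU_SYSTEM"),
]
tpu_cores = count_tpu_cores(example)
print(tpu_cores)
```

In a real notebook you would pass the devices list from the cell above instead of the made-up example list.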

Compile for TPU

Creating and compiling the model takes a few extra steps.

TPUtest.py


# getModel() is the same as in the CPU/GPU version, so its definition is omitted here

# Set up the TPU
resolver = tf.contrib.cluster_resolver.TPUClusterResolver('grpc://' + os.environ['COLAB_TPU_ADDR'])
tf.contrib.distribute.initialize_tpu_system(resolver)
strategy = tf.contrib.distribute.TPUStrategy(resolver)


with strategy.scope():  # the model must be built and compiled inside this scope
  model = getModel()
  model.compile(Adam(), loss="categorical_crossentropy", metrics=["acc"])

# After that, fit as usual
model.fit(X_train, y_train, epochs=10, validation_data=(X_test,y_test))

Training is now clearly faster than on the CPU. Let's run the prediction in the same way.

TPUtest.py


y_pred = model.predict(X_test)

Huh? It runs, but isn't it far too slow? The prediction just does not seem to finish. The results are as follows.

Runtime          Training time   Prediction time   Accuracy
CPU (repeated)   37s/epoch       1.49s             0.9854
GPU (repeated)   13s/epoch       0.54s             0.9859
TPU              17.7s/epoch     15min 15s         0.9853

Make predictions at a decent speed

It turns out that predicting on the TPU is absurdly slow. Yet validation during training runs at a perfectly sensible speed. Why?

As a workaround, train on the TPU and then predict on the CPU.

TPUtest.py


# Train on the TPU
model.fit(X_train, y_train, epochs=10, validation_data=(X_test,y_test))
model.save_weights("./weight.h5")  # save the weights to a file

# Predict on the CPU
cpu_model = getModel()  # build the model outside the strategy scope, i.e. on the CPU
cpu_model.load_weights("./weight.h5")  # load the saved weights
y_pred = cpu_model.predict(X_test)  # predict with cpu_model

The final performance is as follows.

Runtime   Training time   Prediction time         Accuracy
CPU       37s/epoch       1.49s                   0.9854
GPU       13s/epoch       0.54s                   0.9859
TPU       17.7s/epoch     1.22s (using the CPU)   0.9853
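
To put the numbers in the tables above into perspective, the short calculation below derives the per-epoch speed-up relative to the CPU and the cost of predicting on the TPU directly. It only does arithmetic on the measured values already reported in this article; the variable names are my own.

```python
# Per-epoch training times in seconds, taken from the table above.
epoch_seconds = {"CPU": 37.0, "GPU": 13.0, "TPU": 17.7}

# Speed-up of each runtime relative to the CPU runtime.
speedup = {name: epoch_seconds["CPU"] / t for name, t in epoch_seconds.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x the CPU training speed per epoch")

# Predicting on the TPU directly (15 min 15 s) versus on the CPU (1.22 s).
tpu_predict_seconds = 15 * 60 + 15
slowdown = tpu_predict_seconds / 1.22
print(f"TPU prediction took about {slowdown:.0f}x as long as the CPU fallback")
```

In this measurement the TPU trains roughly twice as fast as the CPU but somewhat slower than the GPU, while predicting directly on the TPU is several hundred times slower than falling back to the CPU.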

To be honest, the GPU feels easier to use. However, as mentioned earlier, CNNs appear to be a weak spot for the TPU. With an LSTM, for example, training is more than twice as fast as on the GPU, so it may make sense to choose between the two depending on the situation. If nothing else, you can simply run two runtimes at the same time.

Pitfalls I ran into

I ran into a fair number of errors ...

InvalidArgumentError Part 1

Error message

InvalidArgumentError: Cannot assign a device for operation conv2d_1/kernel/IsInitialized/VarIsInitializedOp: node conv2d_1/kernel/IsInitialized/VarIsInitializedOp (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748)  was explicitly assigned to /job:worker/replica:0/task:0/device:TPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device.
	 [[conv2d_1/kernel/IsInitialized/VarIsInitializedOp]]

I got this error when I built the model with keras. Using tensorflow.keras instead of keras made it work. Quite a trap, isn't it?

InvalidArgumentError Part 2

Error message

InvalidArgumentError: Unsupported data type for TPU: double, caused by output IteratorGetNext:0

The TPU does not seem to support the double (float64) type, so convert the data to np.float32 before training.
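
The double type can creep in without being asked for: dividing an integer image array by 255 promotes it to float64 in NumPy. The sketch below shows this on a small made-up array standing in for the MNIST images; it needs only NumPy.

```python
import numpy as np

# Stand-in for image data as loaded from disk: 8-bit unsigned integers.
images = np.arange(12, dtype=np.uint8).reshape(3, 2, 2)

scaled = images / 255                   # true division promotes uint8 to float64
as_float32 = scaled.astype(np.float32)  # explicit cast before handing data to the TPU

print(images.dtype, scaled.dtype, as_float32.dtype)
```

Printing the dtype right before calling fit is a cheap way to catch this before the TPU complains.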

InvalidArgumentError Part 3

Error message

InvalidArgumentError: No OpKernel was registered to support Op 'TPUReplicatedInput' used by node input0_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) with these attrs: [T=DT_INT32, N=8]
Registered devices: [CPU, XLA_CPU]
Registered kernels:
  <no registered kernels>

	 [[input0_1]]

This error happened only once, by chance. Restarting the runtime made it go away. I have not tried to reproduce it.

InvalidArgumentError Part 4

Error message

InvalidArgumentError: Cannot assign a device for operation lstm_1/random_uniform/RandomUniform: node lstm_1/random_uniform/RandomUniform (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748)  was explicitly assigned to /job:worker/replica:0/task:0/device:TPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device.
	 [[lstm_1/random_uniform/RandomUniform]]

The TPU cannot be reached ...? For the time being, restart the runtime.

InternalError

Error message

InternalError: Failed to serialize message

This occurred when I fed a large amount of data to an LSTM. It worked once I reduced the amount. Is it a memory error? (It is a mystery, because passing the same amount of data to the GPU runtime works. That said, it was pointless there too: processing takes so long even on the GPU that it does not finish within 12 hours.)
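
The cause was not investigated here, so the following is only a sketch of how one might experiment with "reducing the amount". It checks how large an array is in bytes and splits it into smaller pieces along the first axis. The helper split_into_chunks and the byte limit are assumptions made up for illustration, not a documented TPU limit.

```python
import numpy as np

def split_into_chunks(array, max_bytes):
    """Split along axis 0 so each chunk holds at most max_bytes (hypothetical helper)."""
    bytes_per_row = array[0].nbytes
    rows_per_chunk = max(1, max_bytes // bytes_per_row)
    return [array[i:i + rows_per_chunk] for i in range(0, len(array), rows_per_chunk)]

# Small made-up array standing in for a large training set.
data = np.zeros((10, 4), dtype=np.float32)      # 16 bytes per row, 160 bytes in total
chunks = split_into_chunks(data, max_bytes=48)  # at most 3 rows (48 bytes) per chunk
print(data.nbytes, [len(c) for c in chunks])
```

Whether training on such chunks one after another avoids the error on a TPU is untested here.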

KeyError

Error message

KeyError: 'COLAB_TPU_ADDR'

This occurs when the runtime is not set to TPU. Switch the runtime to TPU and run the code again.
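
If you want the notebook to fail gently instead of raising, one option is to look the variable up with .get. The following is a minimal sketch; the function name get_tpu_address and the example address are made up for illustration.

```python
import os

def get_tpu_address(environ=os.environ):
    """Return the gRPC address of the Colab TPU, or None if no TPU runtime is attached."""
    addr = environ.get("COLAB_TPU_ADDR")  # .get returns None instead of raising KeyError
    return None if addr is None else "grpc://" + addr

# The dictionaries below are made-up environments for illustration.
print(get_tpu_address({"COLAB_TPU_ADDR": "10.0.0.2:8470"}))
print(get_tpu_address({}))
```

Calling get_tpu_address() without arguments reads the real environment, so the same code can run on CPU and GPU runtimes and simply skip the TPU setup when it returns None.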

References

https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/fashion_mnist.ipynb#scrollTo=2a5cGsSTEBQD
