This article is a sequel to my previous post, "[TF2.0 applications] Speeding up Data Augmentation with tf.data.Dataset" (https://qiita.com/Suguru_Toyohara/items/49c2914b21615b554afa). It presents a further enhanced version of the Data Augmentation touched on there.
While keeping the speed gains of tf.data.Dataset, **we succeeded in writing code that runs the keras.preprocessing.image family in parallel.**
Below I'll show the code, along with the actual mechanism and the background that led here.
First, let's prepare the experimental environment.
init
import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt
import sklearn
import numpy as np
from tqdm import tqdm
(tr_x,tr_y),(te_x,te_y)=keras.datasets.cifar10.load_data()
tr_x, te_x = tr_x/255.0, te_x/255.0
tr_y, te_y = tr_y.reshape(-1,1), te_y.reshape(-1,1)
model = keras.models.Sequential()
model.add(keras.layers.Convolution2D(32,3,padding="same",activation="relu",input_shape=(32,32,3)))
model.add(keras.layers.Convolution2D(32,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(32,3,padding="same",activation="relu"))
model.add(keras.layers.MaxPooling2D())
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(128,3,padding="same",activation="relu"))
model.add(keras.layers.MaxPooling2D())
model.add(keras.layers.Convolution2D(256,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(256,3,padding="same",activation="relu"))
model.add(keras.layers.Convolution2D(256,3,padding="same",activation="relu"))
model.add(keras.layers.GlobalAveragePooling2D())
model.add(keras.layers.Dense(1000,activation="relu"))
model.add(keras.layers.Dense(128,activation="relu"))
model.add(keras.layers.Dense(10,activation="softmax"))
model.compile(loss="sparse_categorical_crossentropy",metrics=["accuracy"])
First, let's wrap keras.preprocessing.image.random_rotation so that it can be applied with .map.
random_rotate
from tensorflow.keras.preprocessing.image import random_rotation
from joblib import Parallel, delayed
def r_rotate(imgs, degree):
    # Runs in Eager mode via tf.py_function, so .numpy() is available here.
    pics = imgs.numpy()
    degree = degree.numpy()
    if tf.rank(imgs) == 4:
        # Batched input: rotate each image in parallel with joblib
        X = Parallel(n_jobs=-1)([delayed(random_rotation)(pic, degree, 0, 1, 2) for pic in pics])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        X = random_rotation(pics, degree, 0, 1, 2)
    return X
@tf.function
def random_rotate(imgs, label):
    x = tf.py_function(r_rotate, [imgs, 30], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Now it actually works. Let's run it and look at the data.
View data
labels = np.array([
'airplane',
'automobile',
'bird',
'cat',
'deer',
'dog',
'frog',
'horse',
'ship',
'truck'])
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(random_rotate)
plt.figure(figsize=(10,10), facecolor="white")
for b_img, b_label in tr_ds:
    for i, img, label in zip(range(25), b_img, b_label):
        plt.subplot(5, 5, i + 1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        plt.imshow(img)
        plt.xlabel(labels[label])
    break
plt.show()
Let's check how fast it actually is. First, here is the timing from the previous article, "[TF2.0 applications] Speeding up Data Augmentation with tf.data.Dataset":
result
Train on 50000 samples
50000/50000 [==============================] - 9s 175us/sample - loss: 2.3420 - accuracy: 0.1197
Train on 50000 samples
50000/50000 [==============================] - 7s 131us/sample - loss: 2.0576 - accuracy: 0.2349
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.7687 - accuracy: 0.3435
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.5947 - accuracy: 0.4103
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.4540 - accuracy: 0.4705
CPU times: user 1min 33s, sys: 8.03 s, total: 1min 41s
Wall time: 1min 14s
Next, here are the code and results for this implementation.
dataset
%%time
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000)
tr_ds = tr_ds.batch(tr_x.shape[0]).map(random_rotate).repeat(5)
tr_ds = tr_ds.prefetch(tf.data.experimental.AUTOTUNE)
for img, label in tr_ds:
    model.fit(x=img, y=label, batch_size=128)
result
Train on 50000 samples
50000/50000 [==============================] - 9s 176us/sample - loss: 1.3960 - accuracy: 0.5021
Train on 50000 samples
50000/50000 [==============================] - 9s 173us/sample - loss: 1.2899 - accuracy: 0.5430
Train on 50000 samples
50000/50000 [==============================] - 9s 175us/sample - loss: 1.2082 - accuracy: 0.5750
Train on 50000 samples
50000/50000 [==============================] - 9s 171us/sample - loss: 1.1050 - accuracy: 0.6133
Train on 50000 samples
50000/50000 [==============================] - 7s 132us/sample - loss: 1.0326 - accuracy: 0.6405
CPU times: user 52 s, sys: 15.4 s, total: 1min 7s
Wall time: 48.7 s
It looks like the preprocessing map keeps running on the CPU while the GPU works at up to 90% utilization. The total is 48.7 seconds, about 25 seconds faster than before. Also, training without the map took 35.1 seconds, so you can see the Data Augmentation now adds fairly little overhead.
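As a further option (my own sketch, not part of the original benchmark), tf.data can also parallelize the map call itself with num_parallel_calls and overlap preprocessing with training via prefetch. Note that a tf.py_function body runs under the Python GIL, which is exactly why joblib does the heavy lifting inside it; num_parallel_calls may therefore add less here than it does for pure-TF maps.
pipeline sketch
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x, tr_y)).shuffle(40000)
# AUTOTUNE lets tf.data pick the parallelism and prefetch depth
tr_ds = tr_ds.batch(128).map(random_rotate, num_parallel_calls=tf.data.experimental.AUTOTUNE)
tr_ds = tr_ds.prefetch(tf.data.experimental.AUTOTUNE)
model.fit(tr_ds, epochs=5)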
And by following the same pattern, **you can wrap the entire keras.preprocessing.image family.**
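For example, here is a minimal sketch (my own generalization, not code from the article) of a factory that turns any per-image NumPy augmentation into a .map-ready function using exactly the same py_function + joblib pattern; the names make_tf_augment and rot are hypothetical helpers:
generic wrapper sketch
def make_tf_augment(np_aug, *args):
    # np_aug is any function of the form np_aug(pic, *values) -> ndarray
    def _run(imgs, *a):
        pics = imgs.numpy()
        vals = [v.numpy() for v in a]  # arguments arrive as EagerTensors
        # batched (rank-4) input only, for brevity
        X = Parallel(n_jobs=-1)([delayed(np_aug)(pic, *vals) for pic in pics])
        return np.asarray(X)
    @tf.function
    def tf_aug(imgs, label):
        X = tf.py_function(_run, [imgs, *args], [tf.float32])[0]
        X.set_shape(imgs.shape)
        return X, label
    return tf_aug

def rot(pic, deg):
    return random_rotation(pic, deg, 0, 1, 2)

tf_rotate30 = make_tf_augment(rot, 30.0)  # drop-in for .map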
show_data
def show_data(tf_dataset):
    for b_img, b_label in tf_dataset:
        for i, img, label in zip(range(25), b_img, b_label):
            plt.subplot(5, 5, i + 1)
            plt.xticks([])
            plt.yticks([])
            plt.grid(False)
            plt.imshow(img)
            plt.xlabel(labels[label])
        break
    plt.show()
random_shift
You can shift the image at random, specifying the maximum shift as a fraction of the width and height.
random_shift
from tensorflow.keras.preprocessing.image import random_shift
from joblib import Parallel, delayed
def r_shift(imgs, wrg, hrg):
    pics = imgs.numpy()
    w = wrg.numpy()
    h = hrg.numpy()
    if tf.rank(imgs) == 4:
        X = Parallel(n_jobs=-1)([delayed(random_shift)(pic, w, h, 0, 1, 2) for pic in pics])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        X = random_shift(pics, w, h, 0, 1, 2)
    return X
@tf.function
def tf_random_shift(imgs, label):
    x = tf.py_function(r_shift, [imgs, 0.3, 0.3], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Data visualization
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_shift)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
random_shear
random_shear distorts the image by shearing it. (I don't know the details myself.)
random_shear
from tensorflow.keras.preprocessing.image import random_shear
def r_shear(imgs, degree):
    pics = imgs.numpy()
    degree = degree.numpy()
    if tf.rank(imgs) == 4:
        X = Parallel(n_jobs=-1)([delayed(random_shear)(pic, degree, 0, 1, 2) for pic in pics])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        X = random_shear(pics, degree, 0, 1, 2)
    return X
@tf.function
def tf_random_shear(imgs, label):
    x = tf.py_function(r_shear, [imgs, 30], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Data confirmation
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_shear)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
random_zoom
Randomly zoom.
random_zoom
from tensorflow.keras.preprocessing.image import random_zoom
def r_zoom(imgs, range_w, range_h):
    pics = imgs.numpy()
    zoom_range = (range_w.numpy(), range_h.numpy())
    if tf.rank(imgs) == 4:
        X = Parallel(n_jobs=-1)([delayed(random_zoom)(pic, zoom_range, 0, 1, 2) for pic in pics])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        X = random_zoom(pics, zoom_range, 0, 1, 2)
    return X
@tf.function
def tf_random_zoom(imgs, label):
    x = tf.py_function(r_zoom, [imgs, 0.5, 0.5], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Output result
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_zoom)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
Every image ends up zoomed by the same factor... Let's improve it so each image gets its own random zoom.
enhanced
from tensorflow.keras.preprocessing.image import random_zoom
import random
def zoom_range_gen(random_state):
    while True:
        # Sample one zoom factor per image and use it for both axes
        x = random.uniform(random_state[0], random_state[1])
        yield (x, x)
def r_zoom(imgs):
    pics = imgs.numpy()
    random_state = [0.5, 1.5]
    if tf.rank(imgs) == 4:
        X = Parallel(n_jobs=-1)([delayed(random_zoom)(pic, (x, y), 0, 1, 2)
                                 for pic, (x, y) in zip(pics, zoom_range_gen(random_state))])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        zoom_range = next(zoom_range_gen(random_state))
        X = random_zoom(pics, zoom_range, 0, 1, 2)
    return X
@tf.function
def tf_random_zoom_enhanced(imgs, label):
    x = tf.py_function(r_zoom, [imgs], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Let's check the data
Data confirmation
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128).map(tf_random_zoom_enhanced)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
This looks much better!
Next, let's implement the augmentations from the blog post "Data Augmentation Summary of Images in NumPy". Since the Keras augmentations are NumPy based, we can now wrap any NumPy-based augmentation the same way. The cat images and the implementations below are quoted from that blog, and the source is also noted in the code.
random-flip
Let's implement random horizontal and vertical flips. TensorFlow already provides these, so we use them directly.
random-flip
@tf.function
def flip_left_right(image, label):
    return tf.image.random_flip_left_right(image), label
@tf.function
def flip_up_down(image, label):
    return tf.image.random_flip_up_down(image), label
Data confirmation
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(flip_left_right).map(flip_up_down)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
random-clip
Here we use the Scale Augmentation described in the blog; the implementation closely follows it.
random-clip
from PIL import Image
def random_crop(pic, crop_size=(28, 28)):
    try:
        h, w, c = pic.shape
    except ValueError:
        raise ValueError("expected a 3-D (H, W, C) image")
    # Pick the top-left corner of the crop within the valid range
    top = np.random.randint(0, h - crop_size[0])
    left = np.random.randint(0, w - crop_size[1])
    # Derive the bottom-right corner from the crop size
    bottom = top + crop_size[0]
    right = left + crop_size[1]
    # Cut out the region between the two corners
    pic = pic[top:bottom, left:right, :]
    return pic
def scale_augmentation(pic, scale_range=(38, 80), crop_size=32):
    scale_size = np.random.randint(*scale_range)
    Ppic = Image.fromarray(pic)
    Ppic = Ppic.resize((scale_size, scale_size), resample=1)
    pic = np.asarray(Ppic)
    return random_crop(pic, (crop_size, crop_size))
def r_crop(imgs):
    pics = imgs.numpy()
    # PIL needs uint8, so scale back up from [0, 1]
    pics = np.asarray(pics * 255.0, dtype=np.uint8)
    random_state = (38, 60)
    crop_size = 32
    if tf.rank(imgs) == 4:
        X = Parallel(n_jobs=-1)([delayed(scale_augmentation)(pic, random_state, crop_size) for pic in pics])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        X = scale_augmentation(pics, random_state, crop_size)
    X = X / 255.0  # back to [0, 1]
    return X
@tf.function
def tf_random_crop(imgs, label):
    x = tf.py_function(r_crop, [imgs], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Data confirmation
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(tf_random_crop)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
random-erasing
Let's implement Random Erasing. The implementation again follows the blog.
random_erasing
def random_erasing(pic, p=0.5, s=(0.02, 0.4), r=(0.3, 3)):
    # Decide whether to apply the mask at all
    if np.random.rand() > p:
        return pic
    # Randomly choose the pixel value used for the mask
    mask_value = np.random.random()
    try:
        h, w, c = pic.shape
    except ValueError:
        raise ValueError("expected a 3-D (H, W, C) image")
    # Randomly choose the mask area as a fraction s (0.02-0.4) of the image
    mask_area = np.random.randint(h * w * s[0], h * w * s[1])
    # Randomly choose the mask aspect ratio r from the range (0.3-3)
    mask_aspect_ratio = np.random.rand() * r[1] + r[0]
    # Derive the mask height and width from the area and aspect ratio;
    # either may exceed the original image, so clamp them
    mask_height = int(np.sqrt(mask_area / mask_aspect_ratio))
    if mask_height > h - 1:
        mask_height = h - 1
    mask_width = int(mask_aspect_ratio * mask_height)
    if mask_width > w - 1:
        mask_width = w - 1
    top = np.random.randint(0, h - mask_height)
    left = np.random.randint(0, w - mask_width)
    bottom = top + mask_height
    right = left + mask_width
    pic[top:bottom, left:right, :].fill(mask_value)
    return pic
def r_erase(imgs):
    pics = imgs.numpy()
    if tf.rank(imgs) == 4:
        X = Parallel(n_jobs=-1)([delayed(random_erasing)(pic) for pic in pics])
        X = np.asarray(X)
    elif tf.rank(imgs) == 3:
        X = random_erasing(pics)
    return X
@tf.function
def tf_random_erase(imgs, label):
    x = tf.py_function(r_erase, [imgs], [tf.float32])
    X = x[0]
    X.set_shape(imgs.shape)
    return X, label
Data confirmation
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x,tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(tf_random_erase)
plt.figure(figsize=(10,10),facecolor="white")
show_data(tr_ds)
At CIFAR-10's resolution the augmented images get a bit crushed, and some of them are hard to make out...
Next, some things to keep in mind when writing this kind of code. They are basic points, so some of you may find them obvious, but there were surprisingly many pitfalls, so I'll note them down here.
tf.data.Dataset.map
There are quite a few pitfalls here. The key point is that inside a mapped function you are handling symbolic Tensor objects: **Eager Execution does not apply to the Tensors handled by .map.**
In other words, ordinary operations such as multiplication are converted cleanly into TF ops by @tf.function, but **operations that need concrete values, such as .numpy(), cannot be used.** It helps to think of the mapped function as merely writing down a formula like x + y = z. What you need is a way to switch into Eager mode at that point, and the tool for that is **tf.py_function.**
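A toy sketch of the pitfall (my own example, not from the article): the element inside a plain .map function is a symbolic Tensor, so .numpy() fails, while ordinary TF ops trace fine.
pitfall sketch
ds = tf.data.Dataset.from_tensor_slices([1.0, 2.0, 3.0])

def bad(x):
    return x.numpy() * 2  # fails: symbolic Tensors have no usable .numpy()

def good(x):
    return x * 2  # converted into a graph op, works

# ds.map(bad)   # raises at trace time
ds = ds.map(good)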
Graph mode is like a formula. It's what you did with sess.run in TF1.x: you first design a formula such as x + y = z, and only when you assign values to the variables (that is, run the Session) does the z Tensor finally hold a value, e.g. 2 + 3 = 5. This is what made TF1.x hard to grasp.
Eager mode is like typing in an expression and immediately getting back the evaluated value; it effectively runs sess.run for you while still building on the Graph machinery. (I'm not deeply familiar with this area, so please see the official guide for details, and let me know if anything here is wrong.)
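To make the analogy concrete (TF2 code, my own toy example):
eager sketch
x = tf.constant(2.0)
y = tf.constant(3.0)
z = x + y          # in TF1.x graph mode this only defined the formula
print(z.numpy())   # 5.0 -- in TF2 eager mode the value is already there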
tf.py_function
tf.py_function is a function that lets part of the graph execute in Eager mode, as described in the official guide. In other words, you hand Graph mode a black-box function f(x), declare only the types of what comes out, and the body itself runs eagerly.
In formula terms, the TF side wants an expression like x + f(a, b) = y, and all it needs from you there is the data types of the inputs and outputs.
py_function pseudocode
def function(input1, input2):
    # This part runs in Eager mode
    # ... some processing ...
    return output1, output2

[output1, output2] = tf.py_function(function, [input1, input2], [output1_dtype, output2_dtype])
Specifically, it will be like this.
py_function
def function(data1, data2):
    return data1 + data2, data1 * data2
@tf.function
def process(tensor1, tensor2):
    [data1, data2] = tf.py_function(function, [tensor1, tensor2], [tf.float32, tf.float32])
    return data1, data2
In other words, the inner function here runs in Eager mode at runtime, so its arguments arrive as tf.Tensor values that actually hold data (EagerTensors). **Be careful: a Tensor in Eager mode and a Tensor in Graph mode are different things.** Because the body executes eagerly and receives EagerTensors, it is fine to call .numpy() right at the start and return the results as NumPy arrays.
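Here is a small sketch (my own) showing which kind of Tensor arrives where: the @tf.function body is traced with a symbolic Tensor, while the py_function body receives an EagerTensor whose .numpy() works.
tensor types sketch
def eager_part(t):
    print("py_function got:", type(t))   # EagerTensor, holds real values
    return t.numpy() * 2                 # .numpy() is safe here

@tf.function
def graph_part(t):
    print("traced with:", type(t))       # symbolic Tensor, no value yet
    return tf.py_function(eager_part, [t], [tf.float32])[0]

print(graph_part(tf.constant([1.0, 2.0])))  # -> [2. 4.]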
This is where many misunderstandings arise: a tf.data.Dataset.map function is first traced in Graph mode, and anything that needs concrete values has to be pushed into Eager mode via tf.py_function. This seems to be how tf.data.Dataset pipelines built with from_tensor_slices behave. (Apologies, I can't guarantee this is exactly accurate.)
When the data is finally emitted, it flows through this Graph-then-Eager pipeline. If you code with this picture in mind, you should be able to work smoothly without being tripped up by mysterious bugs.
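Putting it all together, here is a sketch (assembled by me from the wrappers defined above) of one augmented training pipeline:
combined pipeline sketch
tr_ds = tf.data.Dataset.from_tensor_slices((tr_x, tr_y)).shuffle(40000).batch(128)
tr_ds = tr_ds.map(random_rotate).map(flip_left_right).map(tf_random_erase)
tr_ds = tr_ds.prefetch(tf.data.experimental.AUTOTUNE)
model.fit(tr_ds, epochs=5)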
With this, I hope I've shown how to build general-purpose Data Augmentation. If there is some other augmentation you want, I hope you'll build it the same way. I'm genuinely relieved to have solved the mystery of py_function. Please make good use of it.
The blog "Data Augmentation Summary of Images in NumPy" was a great help with these implementations. I'd like to take this opportunity to say thank you.