I am currently a second-year master's student (M2) in computer science, and I mostly write machine learning code in PyTorch because it is easy to write. PyTorch is said to be **"Define by Run"**: the computation graph is built at the same time as the data flows through it, so it is easy to debug because you can inspect intermediate results. I am a PyTorch believer, but this time I needed to write code in TensorFlow, so I am leaving this memo so that I do not forget what I learned.
Before I touched PyTorch I had used TensorFlow and Keras, so I had some background, but I had not kept up with the changes in TensorFlow 2, which made this a good opportunity to study. TensorFlow 1 (TF1) adopted a method called **"Define and Run"**, in which you first build a static graph and then feed data into it, whereas TensorFlow 2 (TF2) uses **"Define by Run"**, the same method as PyTorch.
This article is for those who want to get started with machine learning in TensorFlow 2. Most of the articles I found were about TensorFlow 1 and few covered TF2, so I hope this helps anyone who is about to start using TF2. Also, rather than the scikit-learn-like style of training through a single high-level call, I mainly follow the "expert" style from the TensorFlow tutorials, so please read on if that interests you.
**1. TensorFlow Basics**
**2. Easy model building and learning with Keras (Beginner Version)**
**3. Transfer learning (+ Fine Tuning)**
**4. Build your own model**
**5. Model building and learning with TensorFlow2 (Expert Version)**
**[1]. When using your own dataset**
**[2]. About Augmentation**
**[3]. TensorBoard**
**[4]. TFRecord**
First, about the basics.
You can define a constant with tf.constant(value, dtype=None, shape=None, name='Const'). (The verify_shape argument existed only in TF1.)
>>> import tensorflow as tf
>>> tf.constant([1, 2, 3])
<tf.Tensor: shape=(3,), dtype=int32, numpy=array([1, 2, 3], dtype=int32)>
>>> tf.constant('Hello')  # strings are OK too
<tf.Tensor: shape=(), dtype=string, numpy=b'Hello'>
If you pass a scalar together with shape, every element is filled with that value.
>>> tf.constant(3, shape=[1, 3])
<tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[3, 3, 3]], dtype=int32)>
--Addition tf.add(), subtraction tf.subtract(), multiplication tf.multiply(), division tf.divide()
>>> tf.add(2,3)
<tf.Tensor: shape=(), dtype=int32, numpy=5>
>>> tf.subtract(5,3)
<tf.Tensor: shape=(), dtype=int32, numpy=2>
>>> tf.multiply(3,4)
<tf.Tensor: shape=(), dtype=int32, numpy=12>
>>> tf.divide(2,3)
Use tf.convert_to_tensor to turn a NumPy array into a Tensor, and use .numpy() to turn a Tensor back into a NumPy array.
>>> import numpy as np
>>> a = np.asarray([1,2,3])
>>> a
array([1, 2, 3])
>>> a.shape
(3,)
>>> tf.convert_to_tensor(a)
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([1, 2, 3])>
>>> a
array([1, 2, 3])
>>> c = tf.convert_to_tensor(a)
>>> c
<tf.Tensor: shape=(3,), dtype=int64, numpy=array([1, 2, 3])>
>>> c.numpy()
array([1, 2, 3])
tf.expand_dims(input, axis, name=None) adds a dimension of size 1 at the given axis.
It is used, for example, to add a batch dimension to a single image.
>>> a = tf.constant([2,3])
>>> a.shape
TensorShape([2])
>>> b = tf.expand_dims(a,0)
>>> b.shape
TensorShape([1, 2])
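As a small sketch of the batch-dimension use case mentioned above, the following adds a leading axis to a single image tensor and then removes it again. The 28x28x3 image size and the variable names are arbitrary choices made for illustration.

```python
import tensorflow as tf

# A single dummy image: height 28, width 28, 3 channels (sizes are arbitrary).
image = tf.zeros([28, 28, 3])

# Add a batch axis at position 0 so the tensor looks like a batch of one image.
batched = tf.expand_dims(image, axis=0)
print(batched.shape)   # (1, 28, 28, 3)

# tf.squeeze removes the size-1 axis again.
restored = tf.squeeze(batched, axis=0)
print(restored.shape)  # (28, 28, 3)
```

Most Keras models expect a leading batch axis, so this pattern tends to appear whenever a single image is passed to a model.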
tf.stack(values, axis=0, name='stack') stacks tensors along a new axis.
>>> x = tf.constant([1, 4])
>>> y = tf.constant([2, 5])
>>> z = tf.constant([3, 6])
>>> tf.stack([x, y, z], axis=0)
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
[2, 5],
[3, 6]], dtype=int32)>
>>> tf.stack([x, y, z], axis=1)
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
[4, 5, 6]], dtype=int32)>
tf.concat(values, axis, name='concat') joins tensors along an existing axis.
>>> t1 = [[1, 2, 3], [4, 5, 6]]
>>> t2 = [[7, 8, 9], [10, 11, 12]]
>>> tf.concat([t1,t2],0)
<tf.Tensor: shape=(4, 3), dtype=int32, numpy=
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]], dtype=int32)>
>>> tf.concat([t1,t2],1)
<tf.Tensor: shape=(2, 6), dtype=int32, numpy=
array([[ 1, 2, 3, 7, 8, 9],
[ 4, 5, 6, 10, 11, 12]], dtype=int32)>
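The difference between the two functions is easiest to see in the resulting shapes. The sketch below applies both to the same pair of 2x3 tensors; the variable names are only for illustration.

```python
import tensorflow as tf

t1 = tf.constant([[1, 2, 3], [4, 5, 6]])
t2 = tf.constant([[7, 8, 9], [10, 11, 12]])

# tf.stack creates a new axis: two (2, 3) tensors become one (2, 2, 3) tensor.
stacked = tf.stack([t1, t2], axis=0)

# tf.concat extends an existing axis: two (2, 3) tensors become one (4, 3) tensor.
joined = tf.concat([t1, t2], axis=0)

print(stacked.shape)  # (2, 2, 3)
print(joined.shape)   # (4, 3)
```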
Next, let's solve a classification problem on the MNIST data.
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data() #Data reading
x_train, x_test = x_train / 255.0, x_test / 255.0 #Data normalization
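# Model building
# Sequential API
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
# optimizer, loss, metrics settings (learning settings)
# NOTE: the compile call was lost from the original listing; these are the usual choices for integer labels.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Learning
model.fit(x_train, y_train, epochs=5)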
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('\nTest accuracy:', test_acc)
This is a style you will see often. The model uses Flatten() to reshape a 28x28 input image into a one-dimensional array of 784 values. Fully connected layers follow, and the 10 in the last fully connected layer is the number of classes to classify. Specifying softmax as the activation of the last layer gives you the probability of each class. For training and evaluation you can use model.fit, model.evaluate, model.predict, and so on.
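To make the softmax output described above concrete, here is a minimal NumPy sketch of what the last layer computes and how a class is picked from it. The scores are made up, and the helper name softmax is my own; Keras does this internally.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([1.0, 3.0, 0.5])      # made-up scores for three classes
probs = softmax(logits)

# The predicted class is the index with the highest probability.
predicted_class = int(np.argmax(probs))
print(probs, predicted_class)
```

The same argmax step is what you would typically apply to each row returned by model.predict.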
The basic flow is (1) data reading, (2) data preprocessing, (3) model construction, (4) detailed learning settings, (5) learning, and (6) evaluation.
Having built and trained a simple model on MNIST, let's try transfer learning next. Transfer learning reuses weights that were trained in advance on a large image dataset such as ImageNet. Its advantages are a shorter training time and reasonable accuracy even with a small amount of data. The terms transfer learning and fine-tuning are often used interchangeably, but here transfer learning means keeping the weights of the pretrained base layers fixed and training only the layers you replaced or added yourself, while fine-tuning means leaving the base layers unfrozen and retraining all parameters.
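The distinction between the two can be sketched with the trainable flag. The example below uses a tiny stand-in network instead of a real pretrained model so that it runs without downloading weights; the layer sizes and names are assumptions for illustration only.

```python
import tensorflow as tf

# Stand-in for a pretrained feature extractor (hypothetical; a real one would
# come from tf.keras.applications and carry ImageNet weights).
base = tf.keras.Sequential([tf.keras.layers.Dense(8, activation='relu')])
head = tf.keras.layers.Dense(3, activation='softmax')
model = tf.keras.Sequential([base, head])

# Call the model once on dummy data so that all variables are created.
model(tf.zeros([1, 4]))

# Transfer learning: freeze the base and train only the new head.
base.trainable = False
n_transfer = len(model.trainable_variables)   # kernel and bias of the head

# Fine-tuning: unfreeze the base so that every parameter is updated.
base.trainable = True
n_finetune = len(model.trainable_variables)   # variables of base and head

print(n_transfer, n_finetune)
```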
For transfer learning in TensorFlow, load the model from tf.keras.applications.
import tensorflow as tf
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
# Specify input_shape when loading the model
# If include_top=False, the output layer is not loaded.
IMG_SIZE = (224, 224, 3)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SIZE,
                                               include_top=False,
                                               weights='imagenet')  # assumed: this argument was lost from the original listing
# When fixing the weights of the base model
base_model.trainable = False
# Global Average Pooling reduces the computation compared with fully connected layers
GAP_layer = GlobalAveragePooling2D()
# 10 because we classify into 10 classes. Replace this with the number for your own task.
pred_layer = Dense(10, activation='softmax')
model = tf.keras.Sequential([
    base_model,
    GAP_layer,
    pred_layer
])
# Check the model with model.summary()
Other ready-to-use models are available; see Module: tf.keras.applications.
MobileNetV2 is a very lightweight model that runs even on edge devices. It uses Depthwise Separable Convolution, which computes the convolution separately in the spatial direction and the channel direction, and an activation function called ReLU6, which caps the output of ReLU at a maximum value of 6. It is a very interesting model, so please look into it if you are interested.
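The two ideas mentioned above can be illustrated in plain NumPy. This sketch is not MobileNetV2 itself: it only shows ReLU6 as a clipped ReLU and compares the weight counts of a standard convolution with a depthwise separable one (biases ignored). The function names and the 3x3, 32-to-64-channel sizes are my own choices.

```python
import numpy as np

def relu6(x):
    """ReLU clipped at 6: min(max(x, 0), 6)."""
    return np.minimum(np.maximum(x, 0.0), 6.0)

print(relu6(np.array([-2.0, 3.0, 10.0])))   # [0. 3. 6.]

def standard_conv_params(k, c_in, c_out):
    # Every output channel has its own k x k filter over all input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    depthwise = k * k * c_in      # one k x k spatial filter per input channel
    pointwise = c_in * c_out      # 1x1 convolution that mixes the channels
    return depthwise + pointwise

standard = standard_conv_params(3, 32, 64)          # 18432
separable = depthwise_separable_params(3, 32, 64)   # 288 + 2048 = 2336
print(standard, separable)
```

For these sizes the separable version needs roughly one eighth of the weights, which is where much of the "lightness" comes from.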
Starting with TF2, you can use the subclassing API to build a model. Since this lets you build models in a PyTorch-like way, I will introduce it here.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model
class Net(Model):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')  # Conv2D(filters, kernel_size, activation)
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')  # Dense(units, activation)
        self.d2 = Dense(10, activation='softmax')  # Dense(units, activation)

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

# Create an instance of the model
model = Net()
--Define the layers you will use in __init__
--Conv2D() is Conv2D(number of filters, kernel size, activation function)
--Flatten() converts the feature map to one dimension. Example: 28x28 -> 784
--Dense() is a fully connected layer. You specify the number of output dimensions
--Write the layers in the order of data flow in def call(self, x):
--In PyTorch you have to pass both the input and output channel counts to a Conv layer, but in TF2 you only pass the number of output filters; the input channels are inferred from the data.
--Of course, you can also set the kernel and bias initializers, so please refer to the Keras documentation.
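The point about input channels can be checked directly: a Keras layer creates its weights on the first call, once it has seen the input shape. The following is a minimal sketch with a dummy grayscale batch; the input size is an arbitrary assumption.

```python
import tensorflow as tf

# Only the number of output filters and the kernel size are given.
conv = tf.keras.layers.Conv2D(32, 3, activation='relu')

# The kernel is created on the first call, when the input shape becomes known.
x = tf.zeros([1, 28, 28, 1])   # dummy batch: one 28x28 image with 1 channel
y = conv(x)

# Kernel layout: (kernel_height, kernel_width, in_channels, out_channels)
kernel_shape = tuple(conv.kernel.shape)
output_shape = tuple(y.shape)
print(kernel_shape)   # (3, 3, 1, 32)
print(output_shape)   # (1, 26, 26, 32)
```

The in_channels entry of 1 was never written in the code; it was taken from the data.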
For model construction, refer to 4. Build your own model.
# Definition of the loss function
loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
# Optimizer settings
optimizer = tf.keras.optimizers.Adam()
# Used to calculate loss and accuracy
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')
test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')

def train_step(images, labels):
    with tf.GradientTape() as tape:  # Record the operations performed inside this block
        predictions = model(images)  # Pass the images to the model to get predictions
        loss = loss_object(labels, predictions)  # Calculate the loss from the correct labels and the predictions
    gradients = tape.gradient(loss, model.trainable_variables)  # Differentiate the loss with respect to the trainable parameters
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))  # Update the parameters with the gradients
    train_loss(loss)  # Accumulate the loss (this call is missing from the original listing)
    train_accuracy(labels, predictions)
--Differentiation is essential for machine learning
--Use tf.GradientTape around the computations whose history you want to record
--In other words, tf.GradientTape is a class for computing gradients.
import tensorflow as tf
x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)  # x is a constant, so it must be watched explicitly
    y = 3 * x + 2
gradient = tape.gradient(y, x)
print(f'y = {y}')
print(f'x = {x}')
print(f'grad = {gradient}')
y = 11.0
x = 3.0
grad = 3.0
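One detail worth knowing about the example above: tf.GradientTape tracks trainable tf.Variable objects automatically, while a tf.constant is ignored unless it is watched. The sketch below contrasts the two; the variable names are only for illustration.

```python
import tensorflow as tf

w = tf.Variable(3.0)    # trainable variables are watched automatically
c = tf.constant(3.0)    # constants are not watched unless tape.watch(c) is called

# persistent=True allows tape.gradient to be called more than once.
with tf.GradientTape(persistent=True) as tape:
    y_var = w * w       # dy/dw = 2w
    y_const = c * c     # not recorded, because c is not watched

grad_var = tape.gradient(y_var, w)
grad_const = tape.gradient(y_const, c)
del tape                # release the resources held by the persistent tape

print(grad_var)    # a scalar tensor holding 6.0
print(grad_const)  # None
```

This is why model weights, which are variables, need no explicit watch call in a training step.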
Finally, here is the code for my train and val functions. There is plenty to nitpick, but please use it as a reference.
import progressbar

def train(model, train_dataset, loss_object, optimizer, train_loss, train_acc, CONFIG, train_count):
    cnt = 0
    max_value = train_count + CONFIG.batch_size
    with progressbar.ProgressBar(max_value=max_value) as bar:
        for imgs, labels in train_dataset:
            with tf.GradientTape() as tape:
                preds = model(imgs, training=True)
                loss = loss_object(labels, preds)
            gradients = tape.gradient(loss, model.trainable_variables)
            optimizer.apply_gradients(zip(gradients, model.trainable_variables))
            train_loss.update_state(loss)  # restored: the loss metric was never updated in the listing
            train_acc.update_state(labels, preds)
            cnt = cnt + CONFIG.batch_size
            bar.update(cnt)  # restored: advance the progress bar
    loss_t = train_loss.result().numpy()
    acc_t = train_acc.result().numpy()
    return loss_t, acc_t

def val(model, val_dataset, loss_object, optimizer, val_loss, val_acc, CONFIG, val_count):
    cnt = 0
    max_value = val_count + CONFIG.batch_size
    with progressbar.ProgressBar(max_value=max_value) as bar:
        for imgs, labels in val_dataset:
            preds = model(imgs, training=False)
            loss = loss_object(labels, preds)
            val_loss.update_state(loss)  # restored: the loss metric was never updated in the listing
            val_acc.update_state(labels, preds)
            cnt = cnt + CONFIG.batch_size
            bar.update(cnt)  # restored: advance the progress bar
    loss_v = val_loss.result().numpy()
    acc_v = val_acc.result().numpy()
    return loss_v, acc_v
--tqdm did not work well in my environment, so I am using progressbar
--acc and loss are updated with .update_state()
--The acc and loss metrics need to be reset every epoch, so reset them with .reset_states() (named .reset_state() in recent TensorFlow releases)
--You can specify training as in model(imgs, training=False); Dropout is applied during training but not during testing.
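The effect of the training flag described in the last bullet can be observed on a single Dropout layer. This is a minimal sketch; the rate of 0.5 and the all-ones input are arbitrary choices for illustration.

```python
import numpy as np
import tensorflow as tf

dropout = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones([1, 10])

# training=True: entries are randomly zeroed and the survivors are scaled
# by 1 / (1 - rate), so each value here is either 0 or 2.
train_out = dropout(x, training=True).numpy()

# training=False: the layer passes the input through unchanged.
eval_out = dropout(x, training=False).numpy()

print(train_out)
print(eval_out)
```

Forgetting training=False at evaluation time would leave Dropout active and usually lowers the measured accuracy.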
When training on your own dataset, I think there are roughly two approaches.
-(1) Specify a folder and load the images
-(2) Read from the image paths
--Use flow_from_directory()
--You need to set target_size, batch_size, class_mode, etc.
--Use it in combination with Augmentation
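from tensorflow.keras.preprocessing.image import ImageDataGenerator
# NOTE: the arguments were lost from the original listing; the values and the 'data/test' path below are illustrative assumptions.
train_aug = ImageDataGenerator(rescale=1./255, horizontal_flip=True)
test_aug = ImageDataGenerator(rescale=1./255)
train_generator = train_aug.flow_from_directory('data/train', target_size=(224, 224), batch_size=8, class_mode='categorical')
test_generator = test_aug.flow_from_directory('data/test', target_size=(224, 224), batch_size=8, class_mode='categorical')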
This approach is often used when you export the image paths and labels to a CSV file and read them back. Any library will do for loading the images. Below, I get the file paths of the images and create a dataset from them.
import os
import pathlib
import tensorflow as tf
#Data path specification
data_root_dir = 'data/train'
data_root = pathlib.Path(data_root_dir)
#Get image path
all_image_paths = [str(path) for path in list(data_root.glob('*/*'))]
all_image_paths = sorted(all_image_paths)
# Get labels: taken from the directory names
label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
print(f' label : {label_names}')
# Label dictionary creation dict{label:index}
label_to_index = dict((label, index) for index, label in enumerate(label_names))
# Get labels for all images
all_image_labels = [label_to_index[pathlib.Path(image_path).parent.name] for image_path in all_image_paths]

def load_data(all_image_paths, all_image_labels):
    img_list = []
    for filename in all_image_paths:
        # Loading images
        # NOTE: the two call names below were lost from the original listing; JPEG input is assumed.
        img = tf.io.read_file(filename)
        img = tf.image.decode_jpeg(img, channels=3)
        # Without resizing to a common size, an error occurs when creating the dataset.
        img = tf.image.resize(img, [224, 224])
        img_list.append(img)
    images = tf.stack(img_list, axis=0)
    labels = tf.stack(all_image_labels, axis=0)
    return tf.cast(images, tf.float32), tf.cast(labels, tf.int32)

# Get images and labels
imgs, labels = load_data(all_image_paths, all_image_labels)
# Shuffle the data and create batches
dataset = tf.data.Dataset.from_tensor_slices((imgs, labels)).shuffle(len(all_image_labels)).batch(8)
for data1, data2 in dataset.take(10):
    print(data1, data2)
--**pathlib** is convenient, so I recommend using it.
--Create the dataset with tf.data.Dataset.from_tensor_slices()
--Use shuffle() if you want to shuffle the dataset
--Use batch() if you want to group the data into batches
--Be careful: the behavior changes if you reverse the order of shuffle and batch
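The last bullet can be made concrete with a toy dataset of the integers 0 to 7. This is a small sketch; the batch size, buffer sizes, and seed are arbitrary assumptions.

```python
import tensorflow as tf

data = tf.data.Dataset.range(8)

# shuffle -> batch: individual elements are mixed before they are grouped.
shuffle_then_batch = data.shuffle(8, seed=0).batch(2)

# batch -> shuffle: the batches are formed first, so only their order changes.
batch_then_shuffle = data.batch(2).shuffle(4, seed=0)

for batch in shuffle_then_batch:
    print('shuffle->batch', batch.numpy())   # pairs of arbitrary elements

for batch in batch_then_shuffle:
    print('batch->shuffle', batch.numpy())   # always a consecutive pair such as [4 5]
```

For training you normally want shuffle before batch, so that the composition of each batch changes from epoch to epoch.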
[2]. Augmentation
--tf.image provides functions that can be used for Augmentation, so please check the official website.
--The following code is a simple example. Wrap it in a function when you use it.
--I hope it gives you a feel for how it is written.
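# NOTE: parts of this listing were lost; the lines marked "assumed" are reconstructions.
image = tf.image.convert_image_dtype(image, dtype=tf.float32)  # assumed: scale pixel values to [0, 1]
image = (image / 0.5) - 1  # map [0, 1] to [-1, 1]
# If data_aug is True
if data_aug:
    image = tf.image.random_flip_left_right(image=image)
    image = tf.image.resize_with_crop_or_pad(image=image,
                                             target_height=int(CONFIG.img_height * 1.1),  # assumed margin
                                             target_width=int(CONFIG.img_width * 1.1))  # assumed margin
    image = tf.image.random_crop(value=image, size=[CONFIG.img_height, CONFIG.img_width, CONFIG.channels])
else:  # assumed branch
    image = tf.image.resize(image, [CONFIG.img_height, CONFIG.img_width])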
--For Augmentation, albumentations is recommended.
--Its API takes a little getting used to, but it is very convenient. (Cutout and others are included as standard.)
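One common way to apply tf.image augmentation is through Dataset.map, so that each image is transformed when it is read. The following is a minimal sketch on dummy data; the function name augment, the 32x32 image size, and the 36x36 padding are my own assumptions.

```python
import tensorflow as tf

def augment(image, label):
    """Random flip, then pad-and-crop so the image is shifted slightly."""
    image = tf.image.random_flip_left_right(image)
    image = tf.image.resize_with_crop_or_pad(image, 36, 36)   # pad 32x32 up to 36x36
    image = tf.image.random_crop(image, size=[32, 32, 3])     # crop back to 32x32
    return image, label

images = tf.zeros([4, 32, 32, 3])      # four dummy images
labels = tf.constant([0, 1, 0, 1])     # dummy labels

dataset = tf.data.Dataset.from_tensor_slices((images, labels)).map(augment).batch(2)
shapes = [tuple(batch_images.shape) for batch_images, _ in dataset]
print(shapes)   # [(2, 32, 32, 3), (2, 32, 32, 3)]
```

Because the map function runs each time the dataset is iterated, every epoch sees differently augmented images.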
[3]. TensorBoard
--With TensorBoard, you can easily check how the train accuracy and loss change over time.
--Let's look at how to use it with a simple example.
import tensorflow as tf
# Specify where the logs are written
writer = tf.summary.create_file_writer('tmp/mylogs')
with writer.as_default():
    for step in range(100):
        tf.summary.scalar("acc/train", 0.7, step=step)
        tf.summary.scalar("acc/val", 0.5, step=step)
        tf.summary.scalar("loss/train", 0.7, step=step)
        tf.summary.scalar("loss/val", 0.5, step=step)
--Write it in the form tf.summary.scalar(tag, value, step)
--This example uses constants such as 0.5 and 0.7, but you can pass the actual loss or acc.
--Besides scalar values you can also log images, so please try various things.
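As a sketch of passing actual values instead of constants, the following logs the per-epoch mean of a tf.keras.metrics.Mean metric. The losses are made-up numbers and the log directory is a temporary folder; both are assumptions for illustration, and in real training the values would come from your training step.

```python
import glob
import os
import tempfile

import tensorflow as tf

log_dir = tempfile.mkdtemp()                     # throwaway log directory
writer = tf.summary.create_file_writer(log_dir)
train_loss = tf.keras.metrics.Mean(name='train_loss')

fake_losses = [[0.9, 0.7], [0.5, 0.3]]           # made-up per-batch losses for two epochs

with writer.as_default():
    for epoch, batch_losses in enumerate(fake_losses):
        for loss in batch_losses:
            train_loss.update_state(loss)
        # Log the mean loss of this epoch, then clear the metric for the next one.
        tf.summary.scalar('loss/train', train_loss.result(), step=epoch)
        train_loss.reset_state()                 # spelled reset_states() in older TF 2.x releases
writer.flush()

event_files = glob.glob(os.path.join(log_dir, 'events.out.tfevents.*'))
print(event_files)
```

Pointing tensorboard --logdir at the printed directory should then show one loss/train curve with two points.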
To view the results, pass the directory of the written logs as follows and open http://localhost:6006/
$ tensorboard --logdir='./tmp/mylogs'
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.1.1 at http://localhost:6006/ (Press CTRL+C to quit)
--Writing the log to CSV and plotting it with matplotlib is one option, but then you cannot check anything until training has finished. TensorBoard is convenient because you can watch the values in real time as they are updated.
[4]. TFRecord
--TFRecord is TensorFlow's recommended format for storing data in binary form.
--A large amount of data can be serialized and saved in a format that can be read sequentially.
--The data is then read in order and fed to the learner.
--TFRecord stores each record in a unit called Example.
--You can think of an Example as a map with type information.
--The following is an example of saving images and labels with TFRecord.
import pathlib
import tensorflow as tf

def _bytes_feature(value):
    if isinstance(value, type(tf.constant(0.))):
        value = value.numpy()  # BytesList cannot unpack a string from an EagerTensor
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def get_info(data_root_dir):
    data_root = pathlib.Path(data_root_dir)
    all_image_paths = [str(path) for path in list(data_root.glob('*/*'))]
    # Get label
    label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
    # dict {label:index}
    label_to_index = dict((label, index) for index, label in enumerate(label_names))
    # Get all images label
    all_image_labels = [label_to_index[pathlib.Path(image_path).parent.name] for image_path in all_image_paths]
    return all_image_paths, all_image_labels

def dataset_to_tfrecord(dataset_dir, tfrecord_name):
    # Get images and labels for each directory
    image_paths, image_labels = get_info(dataset_dir)
    image_paths_and_labels_dict = {}
    # Convert to dictionary type
    for i in range(len(image_paths)):
        image_paths_and_labels_dict[image_paths[i]] = image_labels[i]
    with tf.io.TFRecordWriter(tfrecord_name) as writer:
        for image_path, label in image_paths_and_labels_dict.items():
            image_string = open(image_path, 'rb').read()
            feature = {
                'label': _int64_feature(label),
                'image': _bytes_feature(image_string)
            }
            tf_example = tf.train.Example(features=tf.train.Features(feature=feature))
            writer.write(tf_example.SerializeToString())  # restored: write the serialized Example to the file
For more information, see the TensorFlow guide "TFRecord and tf.Example".
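The listing above only covers writing. Reading a TFRecord back requires describing the stored features, which the following sketch does with the same 'label' and 'image' keys. The file path, the label value 3, and the byte string standing in for image data are assumptions for illustration; a real image would additionally be decoded after parsing.

```python
import os
import tempfile

import tensorflow as tf

path = os.path.join(tempfile.mkdtemp(), 'sample.tfrecord')   # throwaway file

# Write one Example with the keys 'label' and 'image'.
example = tf.train.Example(features=tf.train.Features(feature={
    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'raw-bytes'])),
}))
with tf.io.TFRecordWriter(path) as writer:
    writer.write(example.SerializeToString())

# Describe the stored features so that they can be parsed back.
feature_description = {
    'label': tf.io.FixedLenFeature([], tf.int64),
    'image': tf.io.FixedLenFeature([], tf.string),
}

def parse_example(serialized):
    """Turn one serialized Example into a dict of tensors."""
    return tf.io.parse_single_example(serialized, feature_description)

dataset = tf.data.TFRecordDataset(path).map(parse_example)
records = [(int(r['label']), r['image'].numpy()) for r in dataset]
print(records)   # [(3, b'raw-bytes')]
```

The resulting dataset can be shuffled and batched like any other tf.data.Dataset.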
We have seen how to classify images with TensorFlow. The Session and placeholder that existed in TF1 are gone, Eager Mode is now the default, and Keras has become TensorFlow's standard high-level API, which I personally find easier to use. It takes some getting used to, but once I did, TensorFlow was easy to write. In the future, I would like to master both PyTorch and TensorFlow. Libraries such as **pathlib** and **albumentations**, introduced along the way, are very useful, so please give them a try if you are not using them yet.