[PYTHON] [Explanation for beginners] Introduction to convolution processing (explained in TensorFlow)

Explaining "convolution" processing for beginners

After running the TensorFlow expert tutorial "Deep MNIST for Experts", the first thing that confused me was the "convolution" process. It is not a familiar term for a liberal arts graduate, and it took me a while, but it is easy once you understand it. Basically, you can follow it if you can do the four arithmetic operations. For the overall processing of a convolutional neural network, please refer to the article "[Explanation for beginners] TensorFlow Tutorial Deep MNIST". Pooling is explained in the article "[Explanation for beginners] Introduction to pooling processing (explained in TensorFlow)".

* Posted with reference to the images output by TensorBoard (2017/7/27)

Overview

As explained in the article "[Explanation for beginners] TensorFlow Tutorial Deep MNIST", the "convolution" process finds the features of an image. It works not only on images but also on sound and other data. Images are more visual and easier to follow, so I will explain using images.

Input / output of convolution processing

The convolution process filters an image for features. It takes an image as input, applies each filter to it, and outputs one image per filter. With MNIST data, for example, one 28 x 28 input image and the 32 filters of the first layer give 32 output images.
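The relationship between the input, the filters, and the output can be sketched as a small shape calculation. This is only an illustration in plain Python: the function name conv_output_shape is hypothetical, and it assumes a stride of 1 and zero padding, as used by the tutorial code shown later in this article.

```python
def conv_output_shape(input_shape, filter_shape):
    """Shape of a stride-1, zero-padded ('SAME') convolution output.

    input_shape:  (batch, height, width, in_channels)
    filter_shape: (filter_height, filter_width, in_channels, out_channels)
    """
    batch, height, width, in_channels = input_shape
    _, _, filter_in_channels, out_channels = filter_shape
    if in_channels != filter_in_channels:
        raise ValueError("input and filter channel counts must match")
    # With stride 1 and zero padding, height and width are unchanged.
    # Only the last dimension changes: one output image per filter.
    return (batch, height, width, out_channels)


# One 28 x 28 x 1 MNIST image and 32 filters of 5 x 5 x 1.
mnist_shape = conv_output_shape((1, 28, 28, 1), (5, 5, 1, 32))
print(mnist_shape)  # (1, 28, 28, 32)
```

The last number of the result is the number of filters, which is why 32 filters produce 32 images.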

When I lined up the actual input and output images of the convolution process that I wrote out to TensorBoard, they looked like this. The outputs are hard for a human to interpret, but I think you can get a feel for them (image processing professionals may understand them ...).

Filter example

Filters here are the same kind of thing used when editing images, so please refer to "Image processing filter list, comparison". It makes it easy to see which filter is applied and how the image changes.

Specific processing

The processing uses only the four arithmetic operations. The following part of the TensorFlow expert tutorial "Deep MNIST for Experts" is the convolution process. The same function is used for the first and second layers, and with the TensorFlow API it takes only this much code.

import tensorflow as tf

# Convolution process: stride 1 in every dimension, zero padding ('SAME')
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

Image example

As an example input, use an image of a diagonal bar that is 4 (vertical) x 4 (horizontal) x 1 (color, black only). The image is represented with 1s and 0s.

Zero padding

The option padding='SAME' performs what is called zero padding: it fills the perimeter of the image with zeros. This is done so that the output image tensor has the same height and width as the input image tensor.
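As a minimal sketch of zero padding, the plain-Python function below surrounds an image with zeros. The function name zero_pad, the 3 x 3 filter size, and the direction of the diagonal bar (top left to bottom right) are assumptions made for illustration. The split of the padding between the two sides follows the usual rule for padding='SAME' with a stride of 1.

```python
def zero_pad(image, filter_size):
    """Surround a 2-D list with zeros so a stride-1 filter keeps its size."""
    total = filter_size - 1   # rows / columns of zeros needed in total
    before = total // 2       # zeros added on the top and on the left
    after = total - before    # zeros added on the bottom and on the right
    width = len(image[0]) + total
    padded = [[0] * width for _ in range(before)]
    for row in image:
        padded.append([0] * before + list(row) + [0] * after)
    padded.extend([0] * width for _ in range(after))
    return padded


# Assumed 4 x 4 diagonal bar image (1 = bar, 0 = background).
image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

padded = zero_pad(image, filter_size=3)
for row in padded:
    print(row)
```

With a 3 x 3 filter, one ring of zeros is added, so the 4 x 4 image becomes 6 x 6. Sliding a 3 x 3 filter over a 6 x 6 image one step at a time gives 4 x 4 positions, which is the original size.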

Use of filters

The figure below shows an example of a diagonal bar filter. It picks out the feature of a diagonal bar at the same angle as the bar in the image.

The following part of the TensorFlow expert tutorial "Deep MNIST for Experts" is the code that generates the filters (here, 5 (vertical) x 5 (horizontal) x 1 (color) x 32 (kinds)).

W_conv1 = weight_variable([5, 5, 1, 32])

The weight_variable function generates the initial values as normally distributed random numbers with a standard deviation of 0.1. From there, training searches for the optimum filter values. For details of tf.truncated_normal, please refer to the linked page.

# Weights
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)    # Normally distributed random numbers with standard deviation 0.1
    return tf.Variable(initial)
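To give a feel for what "truncated" means here, the following is a rough plain-Python sketch, not TensorFlow's implementation. It assumes the documented behavior of tf.truncated_normal, where values more than 2 standard deviations from the mean are discarded and drawn again. The names truncated_normal_value and initial_filter are hypothetical.

```python
import random


def truncated_normal_value(stddev=0.1, mean=0.0):
    """Draw one value, drawing again while it is over 2 stddev from the mean."""
    while True:
        value = random.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            return value


def initial_filter(height, width, stddev=0.1):
    """Build one height x width filter of small random starting values."""
    return [
        [truncated_normal_value(stddev) for _ in range(width)]
        for _ in range(height)
    ]


filter_values = initial_filter(5, 5)
for row in filter_values:
    print(["%+.3f" % v for v in row])
```

Every starting value lies between -0.2 and 0.2, so no filter starts out with an extreme weight. Training then adjusts these values.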

Multiplication and addition

Multiply and add as shown below: each filter value is multiplied by the image value it overlaps, and the products are added together. Then shift the filter by one position and repeat the calculation. Because the image is zero-padded, the result of the convolution is also 4 (vertical) x 4 (horizontal). The picture below is a schematic of the second calculation (second from the left in the top row).

When all the calculations are complete, the result is as shown in the figure below. Because a diagonal bar filter was applied to a diagonal bar, the result has the same diagonal shape.

The value was incorrect, so I corrected it (2018/4/2)
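The whole multiply-and-add procedure can be sketched in plain Python using nothing but multiplication and addition. The concrete values are assumptions for illustration and are not taken from the figures: a 3 x 3 filter with 1s on its diagonal, and a 4 x 4 image whose bar runs from top left to bottom right. The function name convolve_same is hypothetical. As in tf.nn.conv2d, the filter is applied as-is, without flipping it.

```python
def convolve_same(image, kernel):
    """Stride-1 multiply-and-add with zero padding, like padding='SAME'."""
    k = len(kernel)            # assumes a square filter with an odd size
    pad = k // 2
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0
            for di in range(k):
                for dj in range(k):
                    r, c = i + di - pad, j + dj - pad
                    # Positions outside the image are the zero padding,
                    # so they add nothing and can be skipped.
                    if 0 <= r < height and 0 <= c < width:
                        total += image[r][c] * kernel[di][dj]
            output[i][j] = total
    return output


# Assumed 4 x 4 diagonal bar image and 3 x 3 diagonal bar filter.
image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
kernel = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

result = convolve_same(image, kernel)
for row in result:
    print(row)
# [2, 0, 0, 0]
# [0, 3, 0, 0]
# [0, 0, 3, 0]
# [0, 0, 0, 2]
```

In this sketch the large values line up along the same diagonal as the input bar. The corners are smaller (2 instead of 3) because part of the filter overlaps the zero padding there.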

Reference

"Machine learning that even high school graduates can understand (7) Convolutional neural network part 1" and "Mathematical background of TensorFlow Tutorial - Deep MNIST for Experts (part 1)" are very detailed.

The mathematical meaning of convolution

Convolution is an operation from mathematics, where it is also called the composite product. See Wikipedia for more information. I studied with the textbook "Statistics Campus Seminar" (http://amzn.to/2oOPJM2), which has a reputation for building real ability. For more on studying mathematics, please refer to the article "How to study mathematics for working adults to understand statistics and machine learning".
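As a small illustration of the mathematical definition, the discrete one-dimensional convolution of two sequences f and g is result[n] = sum over k of f[k] * g[n - k]. The sketch below is plain Python with a hypothetical function name, convolve_1d. It computes that sum and checks it against multiplying two polynomials, which is the same operation on their coefficients.

```python
def convolve_1d(f, g):
    """Discrete convolution: result[n] = sum over k of f[k] * g[n - k]."""
    result = [0] * (len(f) + len(g) - 1)
    for k, f_value in enumerate(f):
        for m, g_value in enumerate(g):
            # f[k] * g[m] contributes to position n = k + m.
            result[k + m] += f_value * g_value
    return result


# (1 + 2x) * (1 + 3x) = 1 + 5x + 6x^2
print(convolve_1d([1, 2], [1, 3]))  # [1, 5, 6]
```

Strictly speaking, the mathematical definition reverses one of the two sequences (the n - k index), while tf.nn.conv2d slides the filter over the image without reversing it. Because the filter values are learned during training, this difference does not matter in practice for a neural network.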