Aidemy 2020/11/10
Hello, this is Yope! I am a liberal arts student, but I became interested in the possibilities of AI, so I went to the AI-specialized school "Aidemy" to study. I would like to share the knowledge I gained there, so I am summarizing it on Qiita. I am very happy that many people read the previous summary article. Thank you! This is the second post on deep learning and image recognition. Nice to meet you.
What to learn this time
・About convolutional neural networks
・About model implementation
・A __convolutional neural network (CNN)__ is a deep learning method used mainly in image processing.
・The multilayer perceptron (MLP) used in Chapter 1 received its input as one-dimensional data, but a CNN can receive data as it is in __two dimensions__, so two-dimensional data such as images can be learned without losing spatial information.
・A convolutional neural network consists of __"convolutional layers"__ and __"pooling layers"__. The convolutional layer is a layer that focuses on a part of the input data and extracts the features of the image.
・The __pooling layer__ is a layer that reduces the amount of data by shrinking the __output of the convolutional layer__. As reduction methods, there are __"Max pooling"__, which takes the maximum value of each region, and __"Average pooling"__, which takes the average value.
・Relationship diagram of the convolutional layer ![Screenshot 2020-11-09 12.26.54.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/4a939228-1e57-3d36-8fc3-9053d328b145.png)
・Relationship diagram of the pooling layer (Max pooling) ![Screenshot 2020-11-09 12.27.33.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/74a3057a-ac63-e30e-f100-3640c083ed1d.png)
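As an aside, here is a minimal NumPy sketch (not part of the Keras model itself) of what Max pooling does: a 4×4 input is reduced to 2×2 by taking the maximum of each non-overlapping 2×2 region.

```python
import numpy as np

# A small 4x4 single-channel "image"
x = np.array([[1, 3, 2, 0],
              [4, 6, 1, 1],
              [5, 2, 9, 7],
              [0, 8, 3, 4]])

# 2x2 Max pooling with stride 2: take the maximum of each 2x2 block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 2]
#  [8 9]]
```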
・This time, we will implement the model with the Sequential model. One of the layer types is __"Dense"__, the fully connected layer, in which all inputs and all outputs are connected. In the model below, the fully connected layer is used twice.
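As a starting point, here is a minimal sketch of creating the empty Sequential model that the layers in the following sections are added to (assuming the tensorflow.keras API; the Dense layers themselves are added after the flattening layer further below):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D

# Create an empty Sequential model; layers are stacked onto it with model.add()
model = Sequential()
```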
-The "convolutional layer" in the previous section can be implemented with __ "Conv2D ()" __. __ "filters" __ is the number of filters, __ "kernel_size" __ is the size of the filter, __ "strides" __ is the distance to move the filters at once, __ "padding" __ is __'same' If __, the input size and output size are aligned, and if 'valid', do not.
・The pooling layer can be implemented with __"MaxPooling2D()"__ for Max pooling. __"pool_size"__ specifies the size of the pooling region, and __"strides"__ specifies the distance the window moves at one time (strides does not need to be specified).
・For the input layer, specify the size of the input data with __"input_dim" (one-dimensional)__ or __"input_shape" (multidimensional)__. For example, for (28×28) RGB image data, use "input_shape=(28, 28, 3)".
・When connecting a layer that outputs two-dimensional data, such as Conv2D or MaxPooling2D, to a layer that receives one-dimensional input, such as Dense, it is necessary to add a __"flattening layer"__. The flattening layer can be added with __"Flatten()"__, and no arguments are required.
・The following code is an example of adding the layers so far.

```python
model.add(Conv2D(input_shape=(28, 28, 3), filters=32, kernel_size=(2, 2), strides=(1, 1), padding="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(filters=32, kernel_size=(2, 2), strides=(1, 1), padding="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
```
・The activation function is specified with __"model.add(Activation('function name'))"__. There are __"sigmoid"__ for the sigmoid function, __"relu"__ for the ReLU function, and __"softmax"__, which is used in the output layer.
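Continuing the code above, a minimal sketch of flattening the convolutional output and adding the two fully connected layers with their activation functions (the sizes 256 and 10 are assumptions for illustration):

```python
from tensorflow.keras.layers import Activation, Dense, Flatten

model.add(Flatten())             # flatten the 2D feature maps into a 1D vector
model.add(Dense(256))            # hidden fully connected layer (placeholder size)
model.add(Activation('relu'))
model.add(Dense(10))             # output layer, assuming 10 classes
model.add(Activation('softmax'))
```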
・After adding layers to the model, the model is __compiled__. In compilation, you set __how the learning rate is adjusted (the optimizer)__, __the loss function__, and __the accuracy metric__. Strictly speaking, compiling means __"parsing code written in a high-level programming language and converting it in advance into machine language that the computer can execute directly"__.
・Compilation is done with __"model.compile()"__. __"optimizer"__ specifies the optimization method, __"loss"__ specifies the loss function, and __"metrics"__ specifies the accuracy metric.
・Optimizers include __SGD, RMSprop, Adagrad, Adadelta, Adam__ and others, but which one to use is decided __exploratorily__.
・Loss functions include __mean squared error (mean_squared_error)__ and __cross entropy (categorical_crossentropy, binary_crossentropy)__.
・metrics is often specified as __"['accuracy']"__.
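A minimal sketch of compiling the model above; the choice of Adam and categorical cross-entropy is just one common combination, not the only correct one:

```python
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```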
・Learning is done with __"model.fit()"__. In addition to passing the training data and its labels, __"batch_size"__ specifies the number of data samples passed at one time, and __"epochs"__ specifies the number of training epochs. batch_size is specified in order to perform the __"mini-batch learning"__ covered at the end of Chapter 1.
・ Code example
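A minimal sketch, assuming training data X_train and one-hot labels y_train are already prepared (the batch_size and epochs values are placeholders):

```python
# X_train / y_train are assumed to be prepared beforehand
model.fit(X_train, y_train, batch_size=32, epochs=5)
```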
・For the trained model, __test data is used to evaluate its accuracy__. This is done with __"model.evaluate()"__. In addition to passing the input data and labels of the test data, setting verbose=1 displays the computation progress.
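A minimal sketch, assuming test data X_test and y_test are prepared:

```python
# Returns the loss and the metrics specified at compile time
score = model.evaluate(X_test, y_test, verbose=1)
print(score)  # e.g. [loss, accuracy]
```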
・To save the created model, use __"model.save('model.h5')"__. To load this model, use __"load_model('model.h5')"__ (imported from keras.models).
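A minimal sketch of saving and reloading a model, assuming the tensorflow.keras API:

```python
from tensorflow.keras.models import load_model

model.save('model.h5')           # save the architecture and weights to an HDF5 file
model = load_model('model.h5')   # restore the model from the file
```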
・Also, in order to allow people to view and edit the model structure, it needs to be converted to __"json"__ format. This is done with __"model.to_json()"__. To load it back, use __"model_from_json(converted model)"__.
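A minimal sketch; note that to_json() stores only the architecture, not the trained weights (model_from_json is assumed to come from tensorflow.keras.models):

```python
from tensorflow.keras.models import model_from_json

json_string = model.to_json()          # model architecture as a human-readable JSON string
model = model_from_json(json_string)   # rebuild the (untrained) model from the JSON
```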
・In image processing, a "convolutional neural network" that can handle two-dimensional image data is used.
・For model implementation, connect the convolutional layers and the fully connected layers with a flattening layer, and after compiling and training, evaluate the model.
That's all for this time. Thank you for reading to the end.