[PYTHON] Count the number of parameters in the deep learning model

1. Introduction

I wanted to know how the parameters of a deep learning model are counted, so I worked through the calculation by hand to confirm my understanding.

2. Model configuration

Let's configure the model using Keras. The model built here takes 256x256 RGB images as input and classifies them into 9 categories.

Import the required modules.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Flatten, Dropout, Dense

Define the number of classes as a constant.

num_class = 9

Configure the model.

# Create the model
model = Sequential()

model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu', input_shape=(256, 256, 3)))
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))

model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu'))
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu'))
model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())    # Flatten() converts the feature map to a vector
model.add(Dense(512, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(num_class, activation='softmax'))    # Softmax outputs a probability for each of the 9 classes

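Print the model information.

model.summary()    # Display model information

model.summary() prints a table with one row per layer (reproduced in full in section 4). The number of parameters of each layer appears in the rightmost column. In this model, 6,029,097 parameters are adjusted by training. Note that the summary shown in this article also lists Dropout layers that do not appear in the code above. Dropout has no trainable parameters, so the counts are the same either way.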

3. Calculation of the number of parameters

3-1. CNN layer

First, let's look at the first convolutional (Conv2D) layer. It has 32 filters of size 3x3, 3 input channels (RGB), and 32 output channels (one per filter).

model.add(Conv2D(32, kernel_size=3, padding="same", activation='relu', input_shape=(256, 256, 3)))
conv2d_1 (Conv2D)            (None, 256, 256, 32)      896       

The number of parameters can be calculated with the following formula.

Number of parameters = Vertical filter size x Horizontal filter size x Number of input channels x Number of output channels + Bias x Number of output channels
param = 3 x 3 x 3 x 32 + 1 x 32 = 896

Let's calculate the second layer in the same way.

model.add(Conv2D(32, kernel_size=3, padding="valid", activation='relu'))
conv2d_2 (Conv2D)            (None, 254, 254, 32)      9248      

This time, the input to the layer has 32 channels (the output of the first layer), so:

Number of parameters = Vertical filter size x Horizontal filter size x Number of input channels x Number of output channels + Bias x Number of output channels
param = 3 x 3 x 32 x 32 + 1 x 32 = 9248

The 3rd, 4th, 5th and 6th Conv2D layers can be calculated in the same way. Each of them also has 32 input channels and 32 filters of size 3x3, so each has 9248 parameters.
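
The calculation for all six Conv2D layers can be sketched in a few lines of plain Python. The helper name conv2d_params is made up for this illustration and is not part of Keras. It assumes an ordinary convolution with one bias per filter, as in the model above.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter (hypothetical helper, not a Keras API)."""
    weights = kernel_h * kernel_w * in_channels * filters
    biases = filters
    return weights + biases


# Input channels seen by each of the six Conv2D layers:
# the first layer receives RGB (3), the others receive 32 feature maps.
in_channels_per_layer = [3, 32, 32, 32, 32, 32]

conv_params = [conv2d_params(3, 3, c, 32) for c in in_channels_per_layer]
print(conv_params)  # [896, 9248, 9248, 9248, 9248, 9248]
```

Note that the image size (256x256) does not appear anywhere in the formula: the same filters slide over the whole image, so only the filter size and the channel counts matter.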

3-2. Flatten layer

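The Flatten layer turns the feature map into a one-dimensional vector. It only rearranges the data, so it has no parameters to be adjusted by training. Its input is the 19 x 19 x 32 output of the preceding layer: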

dropout_3 (Dropout)          (None, 19, 19, 32)        0         

The dimension of the vector is 19 x 19 x 32 = 11552.
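
To see where 19 x 19 comes from, the spatial size can be traced through the layers. The sketch below assumes stride-1 convolutions and pooling whose stride equals the pool size (the Keras defaults when no strides are given). The helper names conv_out and pool_out are hypothetical.

```python
def conv_out(size, kernel, padding):
    """Output width/height of a stride-1 convolution."""
    if padding == "same":
        return size            # zero padding keeps the size unchanged
    return size - kernel + 1   # "valid": no padding, the border is lost


def pool_out(size, pool):
    """Output width/height of pooling with stride equal to the pool size."""
    return size // pool        # an incomplete window at the edge is dropped


size = 256
sizes = []
for pool in (3, 2, 2):         # pool sizes of the three Conv-Conv-Pool blocks
    size = conv_out(size, 3, "same")
    sizes.append(size)
    size = conv_out(size, 3, "valid")
    sizes.append(size)
    size = pool_out(size, pool)
    sizes.append(size)

print(sizes)     # [256, 254, 84, 84, 82, 41, 41, 39, 19]

flat_dim = size * size * 32    # 32 channels come out of the last Conv2D
print(flat_dim)  # 11552
```

These values match the Output Shape column of the summary in section 4.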

3-3. Dense layer (hidden layer)

The Dense layer that follows the Flatten layer receives the 11552-dimensional vector and has 512 units.

Number of parameters = Input size x Output size + Bias (one per output)
param = 11552 x 512 + 512 = 5915136

The following Dense layers are calculated in the same way.

dense_2 (Dense) param = 512 x 128 + 128 = 65664
dense_3 (Dense) param = 128 x 9 + 9 = 1161
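
As a quick check of the Dense layer arithmetic, here is a minimal sketch in plain Python. The helper name dense_params is made up for this illustration. It assumes fully connected layers with one bias per unit.

```python
def dense_params(in_features, units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return in_features * units + units


# Sizes of the vectors flowing through the Dense part of the model:
# Flatten output -> dense_1 -> dense_2 -> dense_3
layer_sizes = [11552, 512, 128, 9]

dense = [dense_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]
print(dense)  # [5915136, 65664, 1161]
```

The bias term always equals the output size of the layer, not its input size.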

4. Summary

Look again at the summary of the model. Adding up the Param # column on the far right gives 6,029,097. These are the parameters adjusted by training. Once trained, they become part of the model, and all of them are used in the computation at inference time unless the model is made lighter.

Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 256, 256, 32)      896       
conv2d_2 (Conv2D)            (None, 254, 254, 32)      9248      
max_pooling2d_1 (MaxPooling2 (None, 84, 84, 32)        0         
dropout_1 (Dropout)          (None, 84, 84, 32)        0         
conv2d_3 (Conv2D)            (None, 84, 84, 32)        9248      
conv2d_4 (Conv2D)            (None, 82, 82, 32)        9248      
max_pooling2d_2 (MaxPooling2 (None, 41, 41, 32)        0         
dropout_2 (Dropout)          (None, 41, 41, 32)        0         
conv2d_5 (Conv2D)            (None, 41, 41, 32)        9248      
conv2d_6 (Conv2D)            (None, 39, 39, 32)        9248      
max_pooling2d_3 (MaxPooling2 (None, 19, 19, 32)        0         
dropout_3 (Dropout)          (None, 19, 19, 32)        0         
flatten_1 (Flatten)          (None, 11552)             0         
dense_1 (Dense)              (None, 512)               5915136   
dropout_4 (Dropout)          (None, 512)               0         
dense_2 (Dense)              (None, 128)               65664     
dropout_5 (Dropout)          (None, 128)               0         
dense_3 (Dense)              (None, 9)                 1161      
Total params: 6,029,097
Trainable params: 6,029,097
Non-trainable params: 0
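
The total can be reproduced by adding up the per-layer values from the table above. This is only a sketch in plain Python: the dictionary copies the Param # column by hand, and layers with 0 parameters are left out.

```python
# Param # values copied from the summary table (zero-parameter layers omitted).
params_per_layer = {
    "conv2d_1": 896,
    "conv2d_2": 9248,
    "conv2d_3": 9248,
    "conv2d_4": 9248,
    "conv2d_5": 9248,
    "conv2d_6": 9248,
    "dense_1": 5915136,
    "dense_2": 65664,
    "dense_3": 1161,
}

total = sum(params_per_layer.values())
print(total)  # 6029097

# Share of the total held by the first Dense layer.
share = params_per_layer["dense_1"] / total
print(f"{share:.1%}")  # 98.1%
```

Almost all of the parameters sit in the first Dense layer, because it connects every one of the 11552 flattened values to each of its 512 units. If the Keras model object is available, model.count_params() should report the same total.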

