[PyTorch] MaxPool2d ceil_mode

What is ceil_mode

If you try to port a pretrained PyTorch model such as torchvision.models.googlenet to Keras, you will run into a parameter you may not have seen before.

What is the ceil_mode of MaxPool2d?

According to the documentation, when it is True, ceil is used instead of floor to compute the output shape.

torch.nn — PyTorch master documentation

ceil_mode – when True, will use ceil instead of floor to compute the output shape

Below is the first MaxPool2d layer that appears in torchvision.models.googlenet.

# Input is (112, 112, 64)
MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)

Plugging these values into the output size formula gives 55.5:

output\_shape = \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
= \frac{112 + 2 \times 0 - 3}{2} + 1 = 55.5

The actual output size reported by torchsummary is (ch = 64, 56, 56), so the fractional result is indeed rounded up (ceil).

MaxPool2d-4           [-1, 64, 56, 56]
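The rounding rule above can be checked without running the model. The following is a minimal sketch in plain Python. `pool_output_size` is a hypothetical helper name, not a PyTorch API, and the adjustment for a window that would start beyond the input is an assumption based on how the PyTorch documentation describes ceil_mode.

```python
import math


def pool_output_size(input_size, kernel_size, stride, padding=0,
                     dilation=1, ceil_mode=False):
    """Output length along one axis of a pooling layer (hypothetical helper)."""
    # Equals input_size + 2 * padding - kernel_size when dilation == 1
    span = input_size + 2 * padding - dilation * (kernel_size - 1) - 1
    rounder = math.ceil if ceil_mode else math.floor
    out = rounder(span / stride) + 1
    # With ceil_mode, a window that would start past the input plus the
    # left padding is dropped (assumed to mirror PyTorch's behavior)
    if ceil_mode and (out - 1) * stride >= input_size + padding:
        out -= 1
    return out


# First MaxPool2d of GoogLeNet: 112 -> 55.5 before rounding
print(pool_output_size(112, 3, 2, ceil_mode=True))   # 56
print(pool_output_size(112, 3, 2, ceil_mode=False))  # 55
```

The two printed values differ by one, which is exactly the extra row and column that a port to another framework has to account for.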

Let's check the result in PyTorch

Feed the following (10, 10) sample data into MaxPool2d with kernel_size = (3, 3) and stride = (2, 2) and look at the result.

Sample data

>>> import torch
>>> import torch.nn as nn

>>> x = torch.arange(1, 101).view(1, 10, 10).float()
>>> x
tensor([[[  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.],
         [ 11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.],
         [ 21.,  22.,  23.,  24.,  25.,  26.,  27.,  28.,  29.,  30.],
         [ 31.,  32.,  33.,  34.,  35.,  36.,  37.,  38.,  39.,  40.],
         [ 41.,  42.,  43.,  44.,  45.,  46.,  47.,  48.,  49.,  50.],
         [ 51.,  52.,  53.,  54.,  55.,  56.,  57.,  58.,  59.,  60.],
         [ 61.,  62.,  63.,  64.,  65.,  66.,  67.,  68.,  69.,  70.],
         [ 71.,  72.,  73.,  74.,  75.,  76.,  77.,  78.,  79.,  80.],
         [ 81.,  82.,  83.,  84.,  85.,  86.,  87.,  88.,  89.,  90.],
         [ 91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.,  99., 100.]]])
>>> x.shape
torch.Size([1, 10, 10])

ceil_mode = False

padding = 1

>>> nn.MaxPool2d((3,3), stride=2, padding=1, ceil_mode=False)(x)               

# Output size (5, 5)
tensor([[[ 12.,  14.,  16.,  18.,  20.],
         [ 32.,  34.,  36.,  38.,  40.],
         [ 52.,  54.,  56.,  58.,  60.],
         [ 72.,  74.,  76.,  78.,  80.],
         [ 92.,  94.,  96.,  98., 100.]]])

output\_shape = \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
= \frac{10 + 2 \times 1 - 3}{2} + 1 = 5.5

With ceil_mode = False the fractional part is truncated (floor): 5.5 → 5

padding = 0

>>> nn.MaxPool2d((3,3), stride=2, padding=0, ceil_mode=False)(x) 

# Output size (4, 4)
tensor([[[23., 25., 27., 29.],
         [43., 45., 47., 49.],
         [63., 65., 67., 69.],
         [83., 85., 87., 89.]]])

output\_shape = \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
= \frac{10 + 2 \times 0 - 3}{2} + 1 = 4.5

With ceil_mode = False the fractional part is truncated (floor): 4.5 → 4

ceil_mode = True

padding = 1

>>> nn.MaxPool2d((3,3), stride=2, padding=1, ceil_mode=True)(x) 

# Output size (6, 6)
tensor([[[ 12.,  14.,  16.,  18.,  20.,  20.],
         [ 32.,  34.,  36.,  38.,  40.,  40.],
         [ 52.,  54.,  56.,  58.,  60.,  60.],
         [ 72.,  74.,  76.,  78.,  80.,  80.],
         [ 92.,  94.,  96.,  98., 100., 100.],
         [ 92.,  94.,  96.,  98., 100., 100.]]])

output\_shape = \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
= \frac{10 + 2 \times 1 - 3}{2} + 1 = 5.5

With ceil_mode = True the fractional part is rounded up (ceil): 5.5 → 6

padding = 0

>>> nn.MaxPool2d((3,3), stride=2, padding=0, ceil_mode=True)(x)  

# Output size (5, 5)
tensor([[[ 23.,  25.,  27.,  29.,  30.],
         [ 43.,  45.,  47.,  49.,  50.],
         [ 63.,  65.,  67.,  69.,  70.],
         [ 83.,  85.,  87.,  89.,  90.],
         [ 93.,  95.,  97.,  99., 100.]]])

output\_shape = \frac{input\_shape + 2 \times padding - kernel\_size}{stride} + 1 \\
= \frac{10 + 2 \times 0 - 3}{2} + 1 = 4.5

With ceil_mode = True the fractional part is rounded up (ceil): 4.5 → 5
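To see where the extra row and column come from, the four experiments can be reproduced with a small reference implementation. This is only a sketch in plain Python, not PyTorch's actual implementation: `max_pool2d` is a hypothetical function that handles square inputs only and simply skips cells outside the input, on the assumption that this matches how PyTorch treats padded and off-bounds positions.

```python
import math


def max_pool2d(grid, kernel, stride, padding=0, ceil_mode=False):
    """Max-pool a square 2-D list; cells outside the input are ignored."""
    n = len(grid)
    rounder = math.ceil if ceil_mode else math.floor
    out = rounder((n + 2 * padding - kernel) / stride) + 1
    # Drop a window that would start entirely beyond the input
    if ceil_mode and (out - 1) * stride >= n + padding:
        out -= 1
    result = []
    for i in range(out):
        top = i * stride - padding
        row = []
        for j in range(out):
            left = j * stride - padding
            # Collect only the cells of this window that lie inside the input
            window = [
                grid[r][c]
                for r in range(top, top + kernel)
                for c in range(left, left + kernel)
                if 0 <= r < n and 0 <= c < n
            ]
            row.append(max(window))
        result.append(row)
    return result


# Same 10x10 sample data as above, as plain integers 1..100
x = [[10 * r + c + 1 for c in range(10)] for r in range(10)]

pooled = max_pool2d(x, kernel=3, stride=2, padding=0, ceil_mode=True)
print(len(pooled), len(pooled[0]))  # 5 5
print(pooled[0])                    # [23, 25, 27, 29, 30]
print(pooled[-1])                   # [93, 95, 97, 99, 100]
```

With ceil_mode=True the last window along each axis covers only the two remaining rows or columns, which is why the final row and column of the output repeat values from the edge of the input.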

Difference between ceil_mode = True and ceil_mode = False

The following two settings both produce a (5, 5) output, so how do they differ?

padding=1, ceil_mode=False

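State of Max Pooling

One row and one column of padding are added on every side, so the first window starts in the top and left padding and the last window ends exactly at the bottom-right element of the input.

[Figure: pooling windows over the padded input]

output

[Figure: the resulting (5, 5) output]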

padding=0, ceil_mode=True

State of Max Pooling

Since there is no padding, pooling starts at the top-left element. Rounding the output shape up adds one more window along each axis, and that window overhangs the bottom and right edges, so the result is the same as padding only the right and bottom.

[Figure: pooling windows overhanging the bottom and right edges of the input]

output

[Figure: the resulting (5, 5) output]
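The difference between the two (5, 5) cases becomes concrete when the input indices covered by each window are listed. The snippet below is an illustrative sketch; `window_spans` is a hypothetical helper, and valid input indices run from 0 to 9.

```python
def window_spans(out_size, kernel_size, stride, padding):
    """(first, last) input index covered by each window along one axis."""
    return [
        (i * stride - padding, i * stride - padding + kernel_size - 1)
        for i in range(out_size)
    ]


# padding=1, ceil_mode=False: the first window overhangs the top/left by one
print(window_spans(5, 3, 2, padding=1))
# [(-1, 1), (1, 3), (3, 5), (5, 7), (7, 9)]

# padding=0, ceil_mode=True: the last window overhangs the bottom/right by one
print(window_spans(5, 3, 2, padding=0))
# [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
```

Both settings use five windows per axis, but they are shifted by one element relative to each other, so the outputs have the same shape and different values.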

Let's check the result in Keras

Feed the same kind of (10, 10) sample data into MaxPooling2D with pool_size = (3, 3) and strides = (2, 2) and look at the result. Keras' MaxPooling2D doesn't have a ceil_mode parameter.

With the default padding = 'valid', Keras appears to always truncate the fractional part of the computed output shape, which corresponds to ceil_mode = False in PyTorch.

Sample data

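As with PyTorch, generate 10x10 data, this time in Keras' channels-last layout.

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import MaxPooling2D, ZeroPadding2D

# np.float was removed in NumPy 1.24, so use an explicit dtype instead
x = np.arange(1, 101).reshape(1, 10, 10, 1).astype(np.float32)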

padding = 1

Same output as padding = 1, ceil_mode = False in PyTorch.

>>> out = MaxPooling2D((3,3), strides=(2,2))(ZeroPadding2D((1,1))(x))
>>> out = tf.transpose(out, perm=[0,3,1,2])  # NHWC -> NCHW, to match the PyTorch layout
>>> with tf.Session() as sess:  # TensorFlow 1.x graph mode
...     out_value = sess.run(out)
...     print(out_value)

# Output size (5, 5)
[[[[ 12.  14.  16.  18.  20.]
   [ 32.  34.  36.  38.  40.]
   [ 52.  54.  56.  58.  60.]
   [ 72.  74.  76.  78.  80.]
   [ 92.  94.  96.  98. 100.]]]]

padding = 0

Same output as padding = 0, ceil_mode = False in PyTorch.

>>> out = MaxPooling2D((3,3), strides=(2,2))(x)
>>> out = tf.transpose(out, perm=[0,3,1,2])  # NHWC -> NCHW, to match the PyTorch layout
>>> with tf.Session() as sess:  # TensorFlow 1.x graph mode
...     out_value = sess.run(out)
...     print(out_value)

# Output size (4, 4)
[[[[23. 25. 27. 29.]
   [43. 45. 47. 49.]
   [63. 65. 67. 69.]
   [83. 85. 87. 89.]]]]

How to get the same output as ceil_mode = True with Keras?

When ZeroPadding2D is set as follows, zero padding is applied on all four sides (top, bottom, left and right).

ZeroPadding2D((1,1))(x)

[Figure: input zero-padded by one on all four sides]

The padding can also be set separately for top and bottom, left and right, as shown below. (Here zero padding is applied only to the bottom and right.)

ZeroPadding2D(((0,1), (0,1)))(x)

[Figure: input zero-padded by one on the bottom and right only]

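By applying zero padding only to the bottom and right, we get the same output as padding = 0, ceil_mode = True in PyTorch. Note that ZeroPadding2D fills with 0, whereas PyTorch's max pooling pads with negative infinity, so the two match as long as the input is non-negative, as it is here.

>>> out = MaxPooling2D((3,3), strides=(2,2))(ZeroPadding2D(((0,1), (0,1)))(x))
>>> out = tf.transpose(out, perm=[0,3,1,2])  # NHWC -> NCHW, to match the PyTorch layout
>>> with tf.Session() as sess:  # TensorFlow 1.x graph mode
...     out_value = sess.run(out)
...     print(out_value)

# Output size (5, 5)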
[[[[ 23.  25.  27.  29.  30.]
   [ 43.  45.  47.  49.  50.]
   [ 63.  65.  67.  69.  70.]
   [ 83.  85.  87.  89.  90.]
   [ 93.  95.  97.  99. 100.]]]]
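When porting other layers, the amount of bottom/right padding needed is not always 1. The sketch below computes it in plain Python from the same output size formula. `extra_pad_for_ceil_mode` is a hypothetical helper introduced here for illustration, and it assumes a square input with symmetric `padding` on the PyTorch side.

```python
import math


def extra_pad_for_ceil_mode(input_size, kernel_size, stride, padding=0):
    """Extra bottom/right padding so that floor-mode pooling matches ceil_mode=True."""
    span = input_size + 2 * padding - kernel_size
    out = math.ceil(span / stride) + 1
    # Assumed PyTorch rule: drop a window that starts beyond the input
    if (out - 1) * stride >= input_size + padding:
        out -= 1
    # Length the last window needs, minus the length already available
    needed = (out - 1) * stride + kernel_size
    return max(0, needed - (input_size + 2 * padding))


print(extra_pad_for_ceil_mode(10, 3, 2))   # 1  (the sample data above)
print(extra_pad_for_ceil_mode(112, 3, 2))  # 1  (first MaxPool2d of GoogLeNet)
print(extra_pad_for_ceil_mode(11, 3, 2))   # 0  (no rounding needed)
```

With the result `e`, the Keras side would use ZeroPadding2D(((0, e), (0, e))) before MaxPooling2D, as in the example above. A result of 0 means floor and ceil give the same output shape, so no extra padding is required.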
