This article is for anyone who is not yet comfortable with the numpy.pad function, which you run into while studying convolutional neural networks (CNNs) in deep learning.
I break the official documentation down piece by piece and translate it.
What is the pad function?
The pad function that shows up in CNN code behaves in a rather confusing way, doesn't it?
Padding is not the main topic of most books, so they tend to show only a line like this:
pad_example.py
x = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)], "constant")
As long as it works, that is considered good enough. So I will dissect this function thoroughly. The official documentation gives its signature as
numpy.pad(array, pad_width, mode='constant', **kwargs)
That is how the arguments are specified. Let's look at them one at a time.
First, here is what the official documentation says about the first argument.
array : array_like of rank N The array to pad.
Translated, this reads:
array: an array of rank N, or an array-like object. The array to pad.
Rank is a technical term from linear algebra, but here I think it is fine to read it as the number of dimensions. For more on rank, see the references at the end of this article, including [this one](https://deepage.net/features/numpy-rank.html).
That is all you need to know for now: pass the array you want to pad.
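To make the two meanings of rank concrete, here is a small sketch. The 3×3 array of ones is my own example (an assumption, not taken from the documentation): it has two dimensions, yet its rank in the linear-algebra sense is 1.

```python
import numpy as np

# Hypothetical example array: every row is identical.
ones = np.ones((3, 3))

n_dims = ones.ndim                         # number of dimensions ("rank" in the np.pad docs)
matrix_rank = np.linalg.matrix_rank(ones)  # rank in the linear-algebra sense

print(n_dims)       # 2
print(matrix_rank)  # 1
```

For np.pad, only the first number matters: it tells you how many (before, after) pairs pad_width can hold.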
Well, the problem is the second argument.
pad_width : {sequence, array_like, int} Number of values padded to the edges of each axis. ((before_1, after_1), ..., (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
Translated:
pad_width: {sequence, array-like, integer}. The number of values padded at the edges of each axis. ((before_1, after_1), ..., (before_N, after_N)): a separate padding width (before_i, after_i) for each axis. ((before, after),): the same (before, after) padding widths for every axis. (pad,) or an integer: shorthand for before = after = pad on all axes.
That is hard to grasp from the description alone, so let's try it in code as well.
pad_example.py
import numpy as np
x_1d = np.arange(1, 3 + 1)
print(x_1d)
Let's start with a one-dimensional array and try each form the documentation describes. First up is
((before_1, after_1), ..., (before_N, after_N))
Let's try it.
pad_example.py
print(np.pad(x_1d, ((1, 1))))
print(np.pad(x_1d, ((2, 1))))
print(np.pad(x_1d, ((1, 2))))
This one is fairly intuitive. Because the array is one-dimensional, only one `tuple` can be given: the array is padded with $0$ on the left by the count given in `before_1` and on the right by the count given in `after_1`.
By the way, I wrote the argument as if it were a nested tuple, but Python actually treats `((1, 1))` exactly like the plain tuple `(1, 1)`, because parentheses without a trailing comma do not create a tuple.
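Since the output images are not reproduced here, the following minimal sketch shows what the asymmetric case produces and confirms the remark about the extra parentheses. It only uses `x_1d` from the example above.

```python
import numpy as np

x_1d = np.arange(1, 3 + 1)

# Parentheses without a trailing comma do not create a new tuple.
same_tuple = ((2, 1)) == (2, 1)
print(same_tuple)  # True

# Two zeros on the left (before_1 = 2), one zero on the right (after_1 = 1).
padded = np.pad(x_1d, (2, 1))
print(padded.tolist())  # [0, 0, 1, 2, 3, 0]
```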
Next is
((before, after),)
Let's try it.
pad_example.py
print(np.pad(x_1d, ((1, 1),)))
print(np.pad(x_1d, ((2, 1),)))
print(np.pad(x_1d, ((1, 2),)))
Yes, the results are the same. This time the argument really is passed as a nested tuple.
Finally,
(pad,) or an integer
Let's try it.
pad_example.py
print(np.pad(x_1d, (1,)))
print(np.pad(x_1d, (2,)))
print(np.pad(x_1d, 1))
print(np.pad(x_1d, 2))
Both ends are filled with the specified number of $0$s. With this form, the same amount of padding is always added at both ends.
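As a quick check, the sketch below compares the shorthand forms on the same one-dimensional array. The variable names are mine; the calls are the ones used above.

```python
import numpy as np

x_1d = np.arange(1, 3 + 1)

by_int = np.pad(x_1d, 2)           # integer shorthand
by_single = np.pad(x_1d, (2,))     # (pad,) shorthand
by_pair = np.pad(x_1d, ((2, 2),))  # ((before, after),) with before == after

print(by_int.tolist())  # [0, 0, 1, 2, 3, 0, 0]
all_equal = np.array_equal(by_int, by_single) and np.array_equal(by_int, by_pair)
print(all_equal)  # True
```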
Next, let's try a two-dimensional array.
pad_example.py
x_2d = np.arange(1, 3*3 + 1).reshape(3, 3)
print(x_2d)
print(np.pad(x_2d, ((1, 1), (2, 2))))
print(np.pad(x_2d, ((2, 2), (1, 1))))
print(np.pad(x_2d, ((1, 2), (1, 2))))
print(np.pad(x_2d, ((2, 1), (1, 2))))
print(np.pad(x_2d, ((1, 1),)))
print(np.pad(x_2d, ((1, 2),)))
print(np.pad(x_2d, ((2, 1),)))
print(np.pad(x_2d, ((2, 2),)))
print(np.pad(x_2d, (1,)))
print(np.pad(x_2d, (2,)))
print(np.pad(x_2d, 1))
print(np.pad(x_2d, 2))
[Output images omitted. They showed, in order: the results of ((before_i, after_i)), the results of ((before, after),), the result of (pad,), and the result of the integer form.]
In the 2D case, the first tuple pads the first dimension, the rows (top and bottom), and the second tuple pads the second dimension, the columns (left and right). Other than that, it works the same as in one dimension.
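To see the row-then-column order in actual numbers, here is a minimal sketch of the first 2D call from the code above, with its output written out.

```python
import numpy as np

x_2d = np.arange(1, 3*3 + 1).reshape(3, 3)

# 1 row of zeros above and below, 2 columns of zeros on the left and right.
padded = np.pad(x_2d, ((1, 1), (2, 2)))

print(padded.shape)  # (5, 7)
print(padded)
# [[0 0 0 0 0 0 0]
#  [0 0 1 2 3 0 0]
#  [0 0 4 5 6 0 0]
#  [0 0 7 8 9 0 0]
#  [0 0 0 0 0 0 0]]
```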
You probably get the idea by now, so I will skip 3D and experiment in 4D. **I recommend uncommenting the calls one at a time and running them. The output gets very long vertically and is hard to read otherwise.**
pad_example.py
def print_4darray(x):
    first, second, third, fourth = x.shape
    x_str_size = len(str(np.max(x)))
    for i in range(first):
        for k in range(third):
            for j in range(second):
                str_size = len(str(np.max(x[i, j, k, :])))
                if x_str_size != str_size:
                    add_size = "{: " +str(x_str_size - str_size)+ "d}"
                    np.set_printoptions(
                        formatter={'int': add_size.format})
                else:
                    np.set_printoptions()
                print(x[i, j, k, :], end=" ")
            print()
        print()
x_4d = np.arange(1, 3*3*3*3 + 1).reshape(3, 3, 3, 3)
print_4darray(x_4d)
print_4darray(np.pad(x_4d, ((1, 1), (2, 2), (0, 0), (0, 0))))
print_4darray(np.pad(x_4d, ((0, 0), (0, 0), (2, 2), (1, 1))))
print_4darray(np.pad(x_4d, ((1, 1), (0, 0), (2, 2), (0, 0))))
print_4darray(np.pad(x_4d, ((0, 0), (1, 1), (0, 0), (2, 2))))
print_4darray(np.pad(x_4d, ((0, 0), (1, 1), (2, 2), (0, 0))))
print_4darray(np.pad(x_4d, ((1, 1), (0, 0), (0, 0), (2, 2))))
#print_4darray(np.pad(x_4d, ((1, 1),)))
#print_4darray(np.pad(x_4d, ((1, 2),)))
#print_4darray(np.pad(x_4d, ((2, 1),)))
#print_4darray(np.pad(x_4d, ((2, 2),)))
#print_4darray(np.pad(x_4d, (1,)))
#print_4darray(np.pad(x_4d, (2,)))
#print_4darray(np.pad(x_4d, 1))
#print_4darray(np.pad(x_4d, 2))
As for the print_4darray function, it loops over the 1st, 3rd, and 2nd dimensions in that order and prints each slice along the 4th dimension with the print function. Passing `end=" "` makes print emit a half-width space instead of a line break. The extra print() calls add line breaks for layout, and the `np.set_printoptions` function controls the whitespace in the output.
I wrote it because NumPy's standard output is hard to read for 4D arrays.
By the way, when you run the code, some of the output will probably not fit on the screen. The image is several screenshots stacked together (lol). I also widened the cell width of Jupyter Notebook.
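If the full 4D printout is too long to read, checking only the shapes is a lighter way to confirm which axes grew. This is a sketch of my own that reuses `x_4d` and a few of the pad widths from the code above.

```python
import numpy as np

x_4d = np.arange(1, 3*3*3*3 + 1).reshape(3, 3, 3, 3)

pad_widths = [
    ((1, 1), (2, 2), (0, 0), (0, 0)),  # pad only the 1st and 2nd dimensions
    ((0, 0), (0, 0), (2, 2), (1, 1)),  # pad only the 3rd and 4th dimensions
    ((1, 2),),                         # same (before, after) on every axis
    2,                                 # same width on every side of every axis
]
shapes = [np.pad(x_4d, pw).shape for pw in pad_widths]

for pw, shape in zip(pad_widths, shapes):
    print(pw, "->", shape)
# ((1, 1), (2, 2), (0, 0), (0, 0)) -> (5, 7, 3, 3)
# ((0, 0), (0, 0), (2, 2), (1, 1)) -> (3, 3, 7, 5)
# ((1, 2),) -> (6, 6, 6, 6)
# 2 -> (7, 7, 7, 7)
```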
Let's look at the third argument too. It is long, so I will take it one part at a time.
mode : str or function, optional One of the following string values or a user supplied function.
mode: a string or a function, optional. Either one of the string values below or a user-supplied function.
The description of the argument itself poses no particular problem. User-supplied functions are covered later.
‘constant’ (default) Pads with a constant value.
constant (default): pads with a constant value ($0$ unless specified otherwise).
‘edge’ Pads with the edge values of array.
edge: pads with the edge values of the array.
pad_example.py
print(np.pad(x_2d, 1, "edge"))
This is an example of padding a two-dimensional array.
The $3 \times 3$ block in the center is the original array. Starting from each cell of the padding and moving vertically, horizontally, or diagonally toward the center, the first original value encountered is the one that gets copied.
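Because the output image is missing here, this sketch writes out what the `edge` call above returns for the same `x_2d`.

```python
import numpy as np

x_2d = np.arange(1, 3*3 + 1).reshape(3, 3)

padded = np.pad(x_2d, 1, "edge")
print(padded)
# [[1 1 2 3 3]
#  [1 1 2 3 3]
#  [4 4 5 6 6]
#  [7 7 8 9 9]
#  [7 7 8 9 9]]
```

Each corner cell repeats the nearest corner of the original array, which is the diagonal case described above.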
‘linear_ramp’ Pads with the linear ramp between end_value and the array edge value.
linear_ramp: pads with a linear ramp between end_value and the edge value of the array.
pad_example.py
print(np.pad(x_2d, 3, "linear_ramp"))
Even after actually running it, it is hard to understand at first glance (lol). Let's divide the output into blocks.
I cannot quite work out the behavior of the red blocks, but can you see the pattern in the blocks of the other colors?
Take the light blue block as an example.
Looking at the cells in line with $3$ vertically and horizontally, the values run $3, 2, 1, 0$ toward the end value of $0$. The idea is that the interval $0 \le x \le 3$ is sampled at four equally spaced points and the results are truncated to integers. Here the points are exactly $0, 1, 2, 3$, so they appear as they are.
Now look at the green block. The same rule applies here: four equally spaced points on $0 \le x \le 7$ are $0, 2.\dot{3}, 4.\dot{6}, 7$, which appear as $0, 2, 4, 7$.
For the other elements of each block, the values determined above are laid out in diagonal bands. The value at the outer edge is $0$.
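The truncation is easier to see in one dimension. The sketch below is my own reduction of the example: it pads the single edge values $3$ and $7$ on the right by three cells, once as integers and once as floats. It assumes a recent NumPy version, where integer ramps are rounded down.

```python
import numpy as np

# Ramp from the edge value down to the default end value 0 over 3 padded cells.
ramp_3 = np.pad(np.array([3]), (0, 3), "linear_ramp")
ramp_7_int = np.pad(np.array([7]), (0, 3), "linear_ramp")
ramp_7_float = np.pad(np.array([7.0]), (0, 3), "linear_ramp")

print(ramp_3.tolist())      # [3, 2, 1, 0]
print(ramp_7_int.tolist())  # [7, 4, 2, 0]
print(ramp_7_float)         # roughly 7, 4.667, 2.333, 0
```

With a float array the fractional values survive, which suggests the integers in the 2D output are simply the same ramp with the fractional part dropped.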
‘maximum’ Pads with the maximum value of all or part of the vector along each axis.
maximum: pads with the maximum value of all or part of the vector along each axis.
At first it feels a little odd, but once you understand the rule it makes sense.
The purple block is the original array. Padding is added around it.
First, the vertical and horizontal padding cells are filled with the maximum value of each block. Once all of those are done, the corner cells are filled with the maximum value of their block.
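No code for `maximum` appears above, so here is a minimal sketch using the same `x_2d` with a padding width of 1 (the width is my assumption, chosen to keep the output small).

```python
import numpy as np

x_2d = np.arange(1, 3*3 + 1).reshape(3, 3)

padded = np.pad(x_2d, 1, "maximum")
print(padded)
# [[9 7 8 9 9]
#  [3 1 2 3 3]
#  [6 4 5 6 6]
#  [9 7 8 9 9]
#  [9 7 8 9 9]]
```

The top and bottom rows hold the column maxima 7, 8, 9; the left and right columns hold the row maxima 3, 6, 9; and each corner is the maximum of an already padded row, which is why all four corners are 9.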
‘mean’ Pads with the mean value of all or part of the vector along each axis.
mean: pads with the mean value of all or part of the vector along each axis.
pad_example.py
print(np.pad(x_2d, 1, "mean"))
‘median’ Pads with the median value of all or part of the vector along each axis.
median: pads with the median value of all or part of the vector along each axis.
pad_example.py
print(np.pad(x_2d, 1, "median"))
‘minimum’ Pads with the minimum value of all or part of the vector along each axis.
minimum: pads with the minimum value of all or part of the vector along each axis.
pad_example.py
print(np.pad(x_2d, 1, "minimum"))
‘reflect’ Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
reflect: pads with the reflection of the vector, mirrored on the first and last values of the vector along each axis.
pad_example.py
print(np.pad(x_2d, 2, "reflect"))
It is hard to follow... you sort of get it, and yet you don't. Let's divide the output into blocks as before.
How about this? The padding is symmetric with respect to the values that sit between blocks of the same color.
Looking at the light blue block, you can see that $1$ is the center and the $4, 7$ below it are padded in as a "reflected vector".
In the green block the center is $3$, and the $1, 2$ to its left are padded in as a "reflected vector". In the red block, $5, 6, 8, 9$ are padded in as a "reflected vector" centered on $1$.
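Without the colored image, the blocks can still be checked by slicing the result. This sketch is my own; it picks out the column through $1$, the row through $1$ and $3$, and the top-left corner of the `reflect` output.

```python
import numpy as np

x_2d = np.arange(1, 3*3 + 1).reshape(3, 3)
padded = np.pad(x_2d, 2, "reflect")

column_through_1 = padded[:, 2].tolist()  # original first column plus its padding
row_through_1 = padded[2].tolist()        # original first row plus its padding
top_left_corner = padded[:2, :2].tolist()

print(column_through_1)  # [7, 4, 1, 4, 7, 4, 1]
print(row_through_1)     # [3, 2, 1, 2, 3, 2, 1]
print(top_left_corner)   # [[9, 8], [6, 5]]
```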
‘symmetric’ Pads with the reflection of the vector mirrored along the edge of the array.
symmetric: pads with the reflection of the vector mirrored along the edge of the array.
pad_example.py
print(np.pad(x_2d, 2, "symmetric"))
The biggest difference from `reflect` is whether the edge value of the original array is included in the reflection (`symmetric`) or left out of it (`reflect`).
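The difference is easiest to see side by side in one dimension. A minimal sketch with a 1D array of my own choosing:

```python
import numpy as np

x_1d = np.arange(1, 3 + 1)

reflected = np.pad(x_1d, 2, "reflect")    # edge value is the mirror, not repeated
symmetric = np.pad(x_1d, 2, "symmetric")  # edge value is repeated in the padding

print(reflected.tolist())  # [3, 2, 1, 2, 3, 2, 1]
print(symmetric.tolist())  # [2, 1, 1, 2, 3, 3, 2]
```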
‘wrap’ Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.
wrap: pads with the wrap of the vector along the axis. The first values are used to pad the end, and the end values are used to pad the beginning.
pad_example.py
print(np.pad(x_2d, 2, "wrap"))
Seen like this, it is obvious: it is a non-reflected version of `reflect`. I wish the official documentation would just say so...
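In one dimension the wrap-around reads like a repeating tile. A minimal sketch, again with a 1D array of my own choosing:

```python
import numpy as np

x_1d = np.arange(1, 3 + 1)

wrapped = np.pad(x_1d, 2, "wrap")
print(wrapped.tolist())  # [2, 3, 1, 2, 3, 1, 2]
```

The last two values (2, 3) fill the beginning and the first two values (1, 2) fill the end, in their original order.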
‘empty’ Pads with undefined values. New in version 1.17.
empty: pads with undefined values. Added in NumPy version 1.17.
pad_example.py
import numpy as np
print(np.pad(np.arange(1, 3*3+1).reshape(3, 3), 2, "empty"))
print(np.pad(np.arange(1, 3*3+1).reshape(3, 3), 5, "empty"))
I ran this experiment in a newly created notebook (not just a new cell).
In my environment the output looked like the image above. With `empty`, padding widths up to 4 came out as $0$ padding, while at $5$ and above undefined values appeared, probably whatever was left in the allocated memory. By the way, the image for a padding width of $5$ shows only part of the output, since there is little point in capturing the whole thing.
<function> Padding function, see Notes.
Notes New in version 1.7.0. For an array with rank greater than 1, some of the padding of later axes is calculated from padding of previous axes. This is easiest to think about with a rank 2 array where the corners of the padded array are calculated by using padded values from the first axis.
The padding function, if used, should modify a rank 1 array in-place. It has the following signature:
padding_func(vector, iaxis_pad_width, iaxis, kwargs) where
vector: ndarray A rank 1 array already padded with zeros. Padded values are vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
iaxis_pad_width: tuple A 2-tuple of ints, iaxis_pad_width[0] represents the number of values padded at the beginning of vector where iaxis_pad_width[1] represents the number of values padded at the end of vector.
iaxis: int The axis currently being calculated.
kwargs: dict Any keyword arguments the function requires.
<function> Padding function. See Notes.
Notes: Added in NumPy version 1.7.0. For an array of rank greater than 1, some of the padding of later axes is calculated from the padding of earlier axes. This is easiest to see with a rank 2 array, where the corners of the padded array are calculated using the padded values from the first axis.
If used, the padding function should modify a rank 1 array in place. It has the following signature:
padding_func(vector, iaxis_pad_width, iaxis, kwargs), where the arguments are:
vector: ndarray. A rank 1 array already padded with zeros. The padded values are vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
iaxis_pad_width: tuple. A 2-tuple of integers: iaxis_pad_width[0] is the number of values padded at the beginning of vector, and iaxis_pad_width[1] is the number of values padded at the end of vector.
iaxis: int. The axis currently being calculated.
kwargs: dict. Any keyword arguments the function requires.
pad_example.py
def pad_with(vector, pad_width, iaxis, kwargs):
    pad_value = kwargs.get('padder', 10)
    vector[:pad_width[0]] = pad_value
    vector[-pad_width[1]:] = pad_value
print(np.pad(x_2d, 2, pad_with))
print(np.pad(x_2d, 2, pad_with, padder=100))
`vector`, `pad_width`, and `iaxis` are passed automatically. If you want to supply another argument, pass it as a keyword argument to the `numpy.pad` function and retrieve it inside the padding function (`pad_value = kwargs.get('padder', 10)`).
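One thing to watch with this pattern: if the width at the end is 0, the slice `vector[-0:]` selects the whole vector, so the original values would be overwritten. The sketch below is a hypothetical variant (the name `pad_with_safe` is mine, not from the documentation) that skips zero widths.

```python
import numpy as np

def pad_with_safe(vector, pad_width, iaxis, kwargs):
    # Hypothetical variant of pad_with that tolerates a width of 0.
    pad_value = kwargs.get("padder", 10)
    before, after = pad_width
    if before > 0:
        vector[:before] = pad_value
    if after > 0:
        # Guarded because vector[-0:] would cover the entire vector.
        vector[-after:] = pad_value

x_1d = np.arange(1, 3 + 1)
padded = np.pad(x_1d, (1, 0), pad_with_safe, padder=-1)
print(padded.tolist())  # [-1, 1, 2, 3]
```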
stat_length: sequence or int, optional Used in ‘maximum’, ‘mean’, ‘median’, and ‘minimum’. Number of values at edge of each axis used to calculate the statistic value. ((before_1, after_1), … (before_N, after_N)) unique statistic lengths for each axis. ((before, after),) yields same before and after statistic lengths for each axis. (stat_length,) or int is a shortcut for before = after = statistic length for all axes. Default is None, to use the entire axis.
stat_length: sequence or integer, optional. Used with maximum, mean, median, and minimum. The number of values at the edge of each axis used to calculate the statistic. ((before_1, after_1), … (before_N, after_N)) gives a separate statistic length for each axis. ((before, after),) uses the same before and after statistic lengths for every axis. (stat_length,) or an integer is shorthand for before = after = statistic length on all axes. The default is None, which uses the entire axis.
pad_example.py
print(np.pad(x_2d, 1, "maximum", stat_length=2))
As you can see by comparing it with the plain `maximum` output, the maximum is now taken over a vector of length $2$ instead of $3$ (the whole axis).
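With `x_2d` the effect is subtle, so here is a sketch with a 1D array of my own choosing where the whole-axis maximum and the edge-only maximum differ.

```python
import numpy as np

# Hypothetical array whose largest value sits near the left edge.
values = np.array([1, 5, 2, 3])

whole_axis = np.pad(values, 1, "maximum")                # maximum over all 4 values
edge_only = np.pad(values, 1, "maximum", stat_length=2)  # maximum over the 2 nearest values

print(whole_axis.tolist())  # [5, 1, 5, 2, 3, 5]
print(edge_only.tolist())   # [5, 1, 5, 2, 3, 3]
```

On the right side, only the last two values (2, 3) are considered, so the padding becomes 3 instead of 5.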
constant_values: sequence or scalar, optional Used in ‘constant’. The values to set the padded values for each axis. ((before_1, after_1), ... (before_N, after_N)) unique pad constants for each axis. ((before, after),) yields same before and after constants for each axis. (constant,) or constant is a shortcut for before = after = constant for all axes. Default is 0.
constant_values: sequence or scalar, optional. Used with constant. The values to use as padding for each axis. ((before_1, after_1), ... (before_N, after_N)) sets separate padding constants for each axis. ((before, after),) sets the same before and after constants for every axis. (constant,) or a constant is shorthand for before = after = constant on all axes. The default is $0$.
pad_example.py
print(np.pad(x_2d, 1, "constant", constant_values=(-1, -2),))
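To show what the (before, after) pair does without the output image, here is a one-dimensional sketch of my own using the same constants.

```python
import numpy as np

x_1d = np.arange(1, 3 + 1)

# -1 is used before the data, -2 after it.
padded = np.pad(x_1d, 2, "constant", constant_values=(-1, -2))
print(padded.tolist())  # [-1, -1, 1, 2, 3, -2, -2]
```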
end_values: sequence or scalar, optional Used in ‘linear_ramp’. The values used for the ending value of the linear_ramp and that will form the edge of the padded array. ((before_1, after_1), ... (before_N, after_N)) unique end values for each axis. ((before, after),) yields same before and after end values for each axis. (constant,) or constant is a shortcut for before = after = constant for all axes. Default is 0.
end_values: sequence or scalar, optional. Used with linear_ramp. The values used as the ending values of the linear ramp, which form the edge of the padded array. ((before_1, after_1), ... (before_N, after_N)) sets separate end values for each axis. ((before, after),) sets the same before and after end values for every axis. (constant,) or a constant is shorthand for before = after = constant on all axes. The default is $0$.
pad_example.py
print(np.pad(x_2d, 3, "linear_ramp", end_values=((-1, -2), (-3, -4))))
Something like this~
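A one-dimensional sketch makes the role of the end values easier to read. The array and the end values here are my own choices, picked so that the ramp steps are whole numbers.

```python
import numpy as np

# Ramp down to 0 on the left and up to 6 on the right, starting from the edge value 3.
padded = np.pad(np.array([3]), 3, "linear_ramp", end_values=(0, 6))
print(padded.tolist())  # [0, 1, 2, 3, 4, 5, 6]
```

The outermost cells take exactly the specified end values, and the cells in between are spaced evenly toward the edge value.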
reflect_type: {‘even’, ‘odd’}, optional Used in ‘reflect’, and ‘symmetric’. The ‘even’ style is the default with an unaltered reflection around the edge value. For the ‘odd’ style, the extended part of the array is created by subtracting the reflected values from two times the edge value.
reflect_type: 'even' or 'odd', optional. Used with reflect and symmetric. 'even' is the default style, an unaltered reflection around the edge value. With the 'odd' style, the padded part of the array is obtained by subtracting the reflected values from twice the edge value.
pad_example.py
print(np.pad(x_2d, 2, "reflect", reflect_type="odd"))
Comparing this result with the description makes it clear.
Looking at the light blue block, it is the same as `even` (the default) in that it is centered on $1$, but the padding values are completely different.
Let's calculate according to the description.
The description says the values are "obtained by subtracting the reflected values from twice the edge value". Twice the edge value $1$ is $2$, and subtracting the reflected values $7, 4$ from it gives $-5, -2$, which matches the output image!
The same is true for the red block.
Subtracting the reflected values $2, 1$ from $6$, which is twice the edge value $3$, gives $4, 5$.
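The same arithmetic can be reproduced in one dimension. This sketch of mine pads the first column (1, 4, 7) and the first row (1, 2, 3) of `x_2d` separately with the `odd` style.

```python
import numpy as np

first_column = np.array([1, 4, 7])
first_row = np.array([1, 2, 3])

odd_column = np.pad(first_column, 2, "reflect", reflect_type="odd")
odd_row = np.pad(first_row, 2, "reflect", reflect_type="odd")

# Left side of the column: 2*1 - (7, 4) = (-5, -2)
print(odd_column.tolist())  # [-5, -2, 1, 4, 7, 10, 13]
# Right side of the row: 2*3 - (2, 1) = (4, 5)
print(odd_row.tolist())     # [-1, 0, 1, 2, 3, 4, 5]
```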
pad_example.py
print(np.pad(x_2d, 2, "constant").base)
#The output will be None.
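The base attribute is None when the array owns its own memory (nothing is shared); otherwise it refers to the array whose memory is being shared. In other words, the array returned by the pad function is a new array rather than a view of the input.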
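For contrast, the sketch below compares a slice, which is a view, with the padded result. The array `x` and the use of `np.shares_memory` are my own additions for illustration.

```python
import numpy as np

x = np.arange(5)

view = x[1:4]          # slicing returns a view onto x
padded = np.pad(x, 1)  # padding builds a new array

view_shares_x = view.base is x
padded_shares_x = np.shares_memory(padded, x)

print(view_shares_x)        # True
print(padded_shares_x)      # False
print(padded.base is None)  # None is the result the article reports for pad
```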
By the way, my article on the `im2col` function (see the series list at the end) explains that function thoroughly, and its code contains the following part:
im2col.py
pad_zero = (0, 0)
O_h = int(np.ceil((I_h - F_h + 2*pad_ud)/stride_ud) + 1)
O_w = int(np.ceil((I_w - F_w + 2*pad_lr)/stride_lr) + 1)
pad_ud = int(np.ceil(pad_ud))
pad_lr = int(np.ceil(pad_lr))
pad_ud = (pad_ud, pad_ud)
pad_lr = (pad_lr, pad_lr)
images = np.pad(images, [pad_zero, pad_zero, pad_ud, pad_lr], \
"constant")
That is the part in question.
By now you can tell what the pad function is doing here.
Since the 1st and 2nd dimensions use pad_zero, they get no padding, and the 3rd and 4th dimensions are padded by pad_ud and pad_lr, respectively. Note that the outer container does not have to be a tuple; a list works as well.
The 1st and 2nd dimensions are the batch and the channels, and the 3rd and 4th dimensions are the image data, so you can see that only the area around each image is padded.
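As a shape check, here is a minimal sketch with a made-up batch. The sizes and the values of `pad_ud` and `pad_lr` are assumptions for illustration; only the np.pad call mirrors the excerpt.

```python
import numpy as np

# Hypothetical batch: 2 images, 3 channels, 4x5 pixels, laid out as (N, C, H, W).
images = np.ones((2, 3, 4, 5))
pad_ud, pad_lr = 1, 2

padded = np.pad(
    images,
    [(0, 0), (0, 0), (pad_ud, pad_ud), (pad_lr, pad_lr)],
    "constant",
)

print(images.shape)  # (2, 3, 4, 5)
print(padded.shape)  # (2, 3, 6, 9)
```

The batch and channel counts are unchanged, the height grows by 2 * pad_ud, and the width grows by 2 * pad_lr.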
The pad function is deep ...
- [Meaning of rank of matrix (equivalent definition of 8 ways)](https://mathtrain.jp/matrixrank)
- How to use the linalg.matrix_rank function to find the rank with NumPy

- Introduction to Deep Learning ~ Basics ~
- Introduction to Deep Learning ~ Coding Preparation ~
- Introduction to Deep Learning ~ Forward Propagation ~
- Introduction to Deep Learning ~ Backpropagation ~
- List of activation functions (2020)
- Thorough understanding of im2col
- Complete understanding of numpy.pad function