[Python] Why the default dtype is "int32" in NumPy (Windows environment) (as of September 2020)

Something puzzled me while I was studying NumPy. If you create a NumPy array of integers and check its data type with "dtype", "int32" is returned, as shown below.

(Screenshot: creating a NumPy array and checking its dtype, which returns int32.)
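For readers without the screenshot, here is a minimal sketch of the kind of check described above. The array contents are an assumption, since the original array is not shown in the text. The printed dtype depends on the platform and the NumPy version: the result reported in this article is int32 on Windows as of September 2020, while other environments may print int64.

```python
import numpy as np

# Hypothetical sample data; the original array is not shown in the text.
arr = np.array([1, 2, 3])

# dtype reports the element type NumPy chose for these integers.
print(arr.dtype)
```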

This made me wonder:

"Why is the default int32 (32 bit)?"

■ Does it depend on the bit width of the OS? ⇒ NO

At first I wondered whether it depends on the bit width of the OS. However, my OS is 64-bit, as shown below, so the default does not depend on the bit width of the OS.

(Screenshot: system information showing a 64-bit OS.)
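The OS check above was done through a screenshot, but a rough equivalent can be sketched from Python with the standard platform module. This is only an illustration, not the method used in the article. Values such as "AMD64" or "x86_64" for the machine type usually indicate 64-bit hardware.

```python
import platform

# Name of the operating system, e.g. "Windows" or "Linux".
os_name = platform.system()

# Machine type reported by the platform, e.g. "AMD64" or "x86_64".
machine = platform.machine()

print(os_name, machine)
```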

■ Does it depend on the bit width of the Python environment? ⇒ NO

Next, I wondered whether it depends on the bit width of the Python environment. You can check this with maxsize from the sys module, which returns one of the following values:

32-bit: 2147483647
64-bit: 9223372036854775807

(Screenshot: sys.maxsize returning 9223372036854775807.)

As shown above, the result is "9223372036854775807", so my Python environment is 64-bit. The default therefore does not depend on the bit width of the Python environment either.
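The sys.maxsize check described above can be sketched as follows. The cross-check with struct.calcsize is an addition for illustration and is not part of the original article; it measures the size of a C pointer in the running interpreter.

```python
import struct
import sys

# sys.maxsize is 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit build.
print(sys.maxsize)

# Derive the bit width of the Python build from sys.maxsize.
python_bits = 64 if sys.maxsize > 2**32 else 32

# Cross-check: size of a C pointer in bytes, converted to bits.
pointer_bits = struct.calcsize("P") * 8

print(python_bits, pointer_bits)
```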

■ Conclusion ⇒ It comes from the size of the long type in C.

"Why is the default int32 (32 bit)?"

To understand the cause, I had to recall a basic premise of NumPy.

"NumPy's internals are implemented in C (and Fortran), so it runs very fast."

Because NumPy is implemented in C, np.int_ is defined as the C long type.

Reference: "Numpy data type"

According to Microsoft, the long type is 4 bytes (4 * 8 = 32 bits) in its C compiler. Therefore, on Windows, np.int_ appears to be int32 by default regardless of whether the OS, Python, or NumPy is 32-bit or 64-bit.

Reference: "C language basic type size"
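One way to see the relationship between C long and np.int_ on your own machine is to compare their sizes directly. This is a minimal sketch using ctypes from the standard library, which is not used in the original article. On the Windows setup described here, both values would be expected to be 32. Note that this reflects NumPy as of September 2020; later NumPy releases may define np.int_ differently, so the two numbers are not guaranteed to match everywhere.

```python
import ctypes

import numpy as np

# Size of the C long type on this platform, in bits.
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# Size of NumPy's np.int_ type, in bits.
np_int_bits = np.dtype(np.int_).itemsize * 8

print(c_long_bits, np_int_bits)
```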

The conclusion: the default comes from the size of the long type in C.
