[PYTHON] I also wanted to check type hints with numpy

Overview

Type hints were introduced by PEP 484 and have been available since Python 3.5. I wondered whether they could be applied to numpy's ndarray, so I looked into how mypy, a static type checker for type hints, handles it, and how to deal with third-party modules.

Conclusion

To state the conclusion first: type hint checking of numpy.ndarray with mypy is possible by using numpy-stubs. However, as of now (January 2020), mypy cannot check the dtype or shape of an ndarray. On the other hand, if you want to annotate dtype and shape for readability, nptyping seems to be a good option.

Preparation

At a minimum, install the following with pip. (The environment in parentheses is the one I verified with.)

Type checking with mypy

First of all, since I was not sure how type checking actually behaves, I ran a small experiment.

# ex1.py

from typing import List, Tuple

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x, y = 0, 0
    for p in points:
        x += p[0]
        y += p[1]
    return x/n, y/n

points_invalid = [[1, 1], [4, 2], [3, 6], [-1, 3]]

print(calc_center(points_invalid))  # TypeHint Error

The code above runs to completion, of course, but let's check the type hints with mypy. If you installed mypy with pip or similar, you should be able to run the mypy command in your terminal.

>mypy ex1.py
ex1.py:16: error: Argument 1 to "calc_center" has incompatible type "List[List[int]]"; expected "List[Tuple[int, int]]"
Found 1 error in 1 file (checked 1 source file)
>python ex1.py
(1.75, 3.0)

In this way, mypy points out where a type hint is violated. If you modify the code as follows, the mypy check passes.

# ex2.py

from typing import List, Tuple

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x, y = 0, 0
    for p in points:
        x += p[0]
        y += p[1]
    return x/n, y/n

points = [(1, 1), (4, 2), (3, 6), (-1, 3)]

print(calc_center(points))  # Success

>mypy ex2.py
Success: no issues found in 1 source file
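Note that Python itself never enforces these hints at run time, which is why ex1.py ran to completion even though mypy rejected it. The following minimal sketch (the function body is condensed from the example above) shows that a call violating the hint still succeeds, and that the hints are only stored as metadata that `typing.get_type_hints` can read back.

```python
from typing import List, Tuple, get_type_hints

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x = sum(p[0] for p in points)
    y = sum(p[1] for p in points)
    return x / n, y / n

# A list of lists violates the hint, yet the interpreter runs it anyway.
result = calc_center([[1, 1], [4, 2], [3, 6], [-1, 3]])
print(result)  # (1.75, 3.0)

# The hints are plain metadata attached to the function object.
hints = get_type_hints(calc_center)
print(sorted(hints))  # ['points', 'return']
```

Only an external tool such as mypy turns this metadata into an actual check.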

Type checking for third party modules

Since numpy's ndarray is convenient for calculating with coordinate points, I would like to switch to it. However, when I add `import numpy` to the previous code and run mypy, the following error occurs.

# ex3.py

from typing import List, Tuple
import numpy as np

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x, y = 0, 0
    for p in points:
        x += p[0]
        y += p[1]
    return x/n, y/n

points = [(1, 1), (4, 2), (3, 6), (-1, 3)]

print(calc_center(points))  # Success

>mypy ex3.py
ex3.py:4: error: No library stub file for module 'numpy'
ex3.py:4: note: (Stub files are from https://github.com/python/typeshed)
Found 1 error in 1 file (checked 1 source file)

The error occurs because the numpy package itself does not provide type hints. The countermeasures can be divided into the following three methods.

Method 1. Ignore third-party type hints

This seems to be the easiest and most common approach. Create a file named mypy.ini with the following contents and place it in the current directory.

[mypy]

[mypy-numpy]
ignore_missing_imports = True

Lines 3 and 4 tell mypy to ignore the missing type hints for numpy. To apply this to other third-party modules, copy lines 3 and 4 and replace the numpy part with the module name. For other mypy.ini settings, refer to the official mypy documentation.
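As a rough illustration of the per-module layout, the sketch below embeds a mypy.ini that silences two modules and reads it back with Python's standard configparser. The second section, mypy-nptyping, is just an example of a copied block, and the MYPY_INI string and variable names are made up for this illustration; mypy parses the real file on its own.

```python
import configparser

# Example contents of mypy.ini: one [mypy-<module>] section per module.
MYPY_INI = """\
[mypy]

[mypy-numpy]
ignore_missing_imports = True

[mypy-nptyping]
ignore_missing_imports = True
"""

parser = configparser.ConfigParser()
parser.read_string(MYPY_INI)

# Collect the modules whose missing type hints would be ignored.
ignored = [
    section[len("mypy-"):]
    for section in parser.sections()
    if section.startswith("mypy-")
    and parser.getboolean(section, "ignore_missing_imports", fallback=False)
]
print(ignored)  # ['numpy', 'nptyping']
```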

Now the mypy check runs normally. Note, however, that type hint checking of ndarray itself is also skipped (see the last line).

# ex4.py (ignore_missing_imports)

from typing import List, Tuple
import numpy as np

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x, y = 0, 0
    for p in points:
        x += p[0]
        y += p[1]
    return x/n, y/n

def calc_center_np(points: np.ndarray) -> np.ndarray:
    '''Find the center of gravity from the list of points(ndarray version)'''
    return np.average(points, axis=0)

points = [(1, 1), (4, 2), (3, 6), (-1, 3)]

print(calc_center(points))  # Success

np_points = np.array(points, dtype=int)

print(calc_center_np(np_points))  # Success
print(calc_center_np(points))  # Success ?

>mypy ex4.py
Success: no issues found in 1 source file
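The last call passes because, as far as I understand, mypy treats everything imported from a module covered by ignore_missing_imports as `Any`, and `Any` is compatible with every type. The following sketch mimics that situation without numpy: `NdarrayLike` and `calc_center_any` are hypothetical names, with `NdarrayLike` standing in for the unresolved `np.ndarray`.

```python
from typing import Any, List, Tuple

# Hypothetical stand-in: roughly what np.ndarray becomes for mypy
# when the numpy import is ignored.
NdarrayLike = Any

def calc_center_any(points: NdarrayLike) -> NdarrayLike:
    '''Average each of the two columns of a sequence of 2D points.'''
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

points: List[Tuple[int, int]] = [(1, 1), (4, 2), (3, 6), (-1, 3)]

# Any matches every argument type, so a static checker has nothing to reject here.
center = calc_center_any(points)
print(center)  # (1.75, 3.0)
```

In other words, the check "succeeds" only because nothing is being checked.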

Method 2. Create a stub for type hints

Create empty function definitions (stubs) that carry the type hints of the module you want to use, and mypy will look at them instead of the module. Stub files use the .pyi extension.
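To give an idea of what a stub contains, here is a small sketch. The module name "geometry" and the GEOMETRY_STUB string are hypothetical; in a real project the text would live in a file such as geometry.pyi. Because a stub is syntactically ordinary Python with `...` as the function body, the standard ast module can parse it.

```python
import ast

# Hypothetical stub for an imaginary module named "geometry".
GEOMETRY_STUB = '''\
from typing import List, Tuple

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]: ...
'''

# Parse the stub text and list the functions it declares.
tree = ast.parse(GEOMETRY_STUB)
function_names = [
    node.name for node in tree.body if isinstance(node, ast.FunctionDef)
]
print(function_names)  # ['calc_center']
```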

Stubs for numpy are available on GitHub as numpy-stubs.

First, fetch the numpy-stubs repository, for example with git clone https://github.com/numpy/numpy-stubs.git. Then rename the "numpy-stubs" folder inside it to "numpy".

The folder structure is as follows.

numpy-stubs/
└── numpy
    ├── __init__.pyi
    └── core
        ├── numeric.pyi
        ├── numerictypes.pyi
        ├── _internal.pyi
        └── __init__.pyi

In addition, add the path of the root folder that contains the stubs to the MYPYPATH environment variable, then run mypy.

# ex5.py (numpy-stubs)

from typing import List, Tuple
import numpy as np

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x, y = 0, 0
    for p in points:
        x += p[0]
        y += p[1]
    return x/n, y/n

def calc_center_np(points: np.ndarray) -> np.ndarray:
    '''Find the center of gravity from the list of points(ndarray version)'''
    return np.average(points, axis=0)

points = [(1, 1), (4, 2), (3, 6), (-1, 3)]

print(calc_center(points))  # Success

np_points = np.array(points, dtype=int)
np_points_float = np.array(points, dtype=float)

print(calc_center_np(np_points))  # Success
print(calc_center_np(np_points_float))  # Success
print(calc_center_np(points))  # TypeHint Error

>set "MYPYPATH=numpy-stubs"
>mypy ex5.py
ex5.py:28: error: Argument 1 to "calc_center_np" has incompatible type "List[Tuple[int, int]]"; expected "ndarray"
Found 1 error in 1 file (checked 1 source file)

Now type hint checking of ndarray itself works. However, it is still not possible to check dtype and shape, and having to set the environment variable every time is a minor drawback.
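One way to avoid setting the variable by hand is to launch mypy from a small wrapper that sets MYPYPATH only for that one invocation. The following is a minimal sketch under that assumption; `mypy_env` and `run_mypy` are hypothetical helper names, and the default stub folder simply mirrors the example above.

```python
import importlib.util
import os
import subprocess
import sys
from typing import Dict, List

def mypy_env(stub_root: str) -> Dict[str, str]:
    '''Return a copy of the current environment with MYPYPATH set to stub_root.'''
    env = dict(os.environ)
    env["MYPYPATH"] = stub_root  # replaces any existing value in the copy only
    return env

def run_mypy(target: str, stub_root: str = "numpy-stubs") -> int:
    '''Run mypy on target with MYPYPATH pointing at stub_root; return its exit code.'''
    command: List[str] = [sys.executable, "-m", "mypy", target]
    return subprocess.run(command, env=mypy_env(stub_root)).returncode

if __name__ == "__main__":
    # Only run when mypy is installed and the example file is present.
    if importlib.util.find_spec("mypy") and os.path.exists("ex5.py"):
        print(run_mypy("ex5.py"))
```

mypy's configuration file also appears to offer a mypy_path setting for the same purpose, so checking the mypy documentation for it may be worthwhile.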

stubgen

mypy comes with a script called stubgen that automatically generates stub files (.pyi extension) for type hints.

>stubgen -p numpy

-p is the option for recursively generating stubs for a package. When executed, an `out` folder is created in the current directory, and the numpy stub files are placed in it.

However, I got a different error when I ran the mypy check against them, probably because stubgen cannot extract the structure of numpy well. Some stubs are published, like numpy-stubs, so it is safer to use those when they exist.

Method 3. Also use nptyping

If, on top of method 1 or method 2, you also want type hints that include the dtype and shape of an ndarray, nptyping is worth using.

You can install it from PyPI with `pip install nptyping`.

Although nptyping does not support static type checking by mypy, it lets you write type hints that specify the dtype and shape of an ndarray through the alias `Array`.

Below is the official sample. Arrays with mixed column types, like a pandas DataFrame, are also supported.

from nptyping import Array
Array[str, 3, 2]    # 3 rows and 2 columns
Array[str, 3]       # 3 rows and an undefined number of columns
Array[str, 3, ...]  # 3 rows and an undefined number of columns
Array[str, ..., 2]  # an undefined number of rows and 2 columns
Array[int, float, str]       # int, float and str on columns 1, 2 and 3 resp.
Array[int, float, str, ...]  # int, float and str on columns 1, 2 and 3 resp.
Array[int, float, str, 3]    # int, float and str on columns 1, 2 and 3 resp. and with 3 rows

Instance checks using `isinstance` are also possible.

# ex6.py (nptyping)

from typing import List, Tuple
import numpy as np
from nptyping import Array

def calc_center(points: List[Tuple[int, int]]) -> Tuple[float, float]:
    '''Find the center of gravity from the list of points'''
    n = len(points)
    x, y = 0, 0
    for p in points:
        x += p[0]
        y += p[1]
    return x/n, y/n

def calc_center_np(points: Array[int, ..., 2]) -> Array[float, 2]:
    '''Find the center of gravity from the list of points(ndarray version)'''
    print(isinstance(points, Array[int, ..., 2]))
    return np.average(points, axis=0)

points = [(1, 1), (4, 2), (3, 6), (-1, 3)]

np_points = np.array(points, dtype=int)
np_points_float = np.array(points, dtype=float)

print(isinstance(calc_center_np(np_points), Array[float, 2]))  # argument: True, return value: True
print(isinstance(calc_center_np(np_points_float), Array[float, 2]))  # argument: False, return value: True
print(isinstance(calc_center_np(points), Array[float, 2]))  # argument: False, return value: True

Don't forget to set `ignore_missing_imports = True` for nptyping in mypy.ini. The execution result is as follows.

>mypy ex6.py
Success: no issues found in 1 source file
>python ex6.py
True
True
False
True
False
True
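To make explicit what such a shape check amounts to, here is a minimal sketch in plain Python. It is not nptyping's implementation: `shape_matches` and its convention of using `None` for an undefined dimension are assumptions made for this illustration. With numpy you would pass an array's `shape` attribute as the first argument.

```python
from typing import Optional, Sequence, Tuple

def shape_matches(shape: Tuple[int, ...], pattern: Sequence[Optional[int]]) -> bool:
    '''Return True if shape fits pattern; None in pattern means "any size".

    Dimensions beyond the end of the pattern are treated as undefined.
    '''
    if len(pattern) > len(shape):
        return False
    return all(want is None or want == got for want, got in zip(pattern, shape))

# Four 2D points: any number of rows, exactly 2 columns.
print(shape_matches((4, 2), (None, 2)))  # True
# Three columns instead of two.
print(shape_matches((4, 3), (None, 2)))  # False
# A single center point with 2 elements.
print(shape_matches((2,), (2,)))         # True
```

A dtype comparison would be an additional, separate test on top of this shape check.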

Summary

I've summarized the state of type hints around numpy. I think it is common to handle data such as coordinates and tables as an ndarray and to implement geometric and statistical operations on them. When doing so, I often get confused by my own code and wonder "how many dimensions does the ndarray I'm manipulating here have?" I found type hints like those of nptyping useful for readability and maintainability. I think they will become even more useful if type checking by mypy is supported in the future.

Reference article

https://stackoverflow.com/questions/52839427/
https://www.sambaiz.net/article/188/
https://masahito.hatenablog.com/entry/2017/01/08/113343
