[PYTHON] mypy will do it

This is the article on the 21st day of KLab Advent Calendar 2016.

I felt like getting serious about mypy, so this is an account of trying it on our in-house code.

Sequel

I did mypy

What is mypy

Python 3.5 standardized type hints (PEP 484), building on the function annotation syntax. mypy statically checks the types in your code based on these annotations.

Example

def func() -> int:
    return "a"

When I run mypy on code like this

$ mypy test.py
test.py: note: In function "func":
test.py:2: error: Incompatible return value type (got "str", expected "int")

The type is checked based on the annotation, and an error occurs because str is returned even though the return type is int.
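
As a small supplementary sketch (not from the original post), the same function can be run directly to see that the annotation is not enforced by the interpreter. Only a static checker such as mypy reports the mismatch.

```python
def func() -> int:
    return "a"  # mypy reports this line; the interpreter does not check it


result = func()
print(type(result).__name__)   # the runtime value is still a str
print(func.__annotations__)    # the annotation is only stored as metadata
```

This is why the check has to be run as a separate step, for example before the tests.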

Motivation

"Going with mypy" here means adding type annotations that follow PEP 484, running mypy, and getting the static check to pass. What motivated me was reading [Static types in Python, oh my (py)!](http://blog.zulip.org/2016/10/13/static-types-in-python-oh-mypy/) closely; it made me feel I could do it too. The project I was in charge of was large and its existing functions changed frequently, so I thought reliable type annotations would make changes easier. Another reason is that PyCharm now supports PEP 484 and issues warnings, so places with incorrect type hints show up as errors and stand out.

Starting point

The code base was written for Python 3.5 and developed under a loose rule: whoever wants annotations adds them. About 70% of the methods, excluding batch jobs and tests, had annotations. As a first attempt I ran mypy -s --fast-parser --strict-optional --disallow-untyped-defs --disallow-untyped-calls <dir> and got nearly 3000 errors. Fixing them one by one from the top looked painful, so I decided to limit the scope to the part where the business logic is concentrated and to run with only the minimum options.

Working through it

I used mypy 0.4.6 with Python 3.5.2 and fixed errors while repeatedly running mypy -s --fast-parser <dir>. The -s option tells mypy not to check imported modules; without it, mypy goes on to check the third-party libraries you import. Imports of modules inside the directory given as <dir> are still handled properly. When you call a function from an external module, its arguments and return value are typed as Any. The --fast-parser option enables the parser that is due to become the default; the current default parser fails with a parse error on some argument-unpacking patterns. Only --fast-parser supports Python 3.6 syntax.
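
To illustrate what "typed as Any" means in practice, here is a minimal sketch. The function third_party_call is a hypothetical stand-in for something imported from an unchecked module; it is not from the original code base.

```python
from typing import Any


def third_party_call() -> Any:
    """Hypothetical stand-in for a function from a module mypy does not check."""
    return "not a number"


def port_number() -> int:
    # Any is compatible with every type, so mypy accepts this return
    # even though the value is a str at runtime.
    return third_party_call()


print(type(port_number()).__name__)
```

In other words, values that pass through unchecked modules are a blind spot for the checker.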

Where I stumbled

I forgot to annotate __init__

In many classes I had not annotated the __init__ method. In fact, I had never thought about annotating __init__ at all.

Note that the return type of __init__ ought to be annotated with -> None.

That is what PEP484#the-meaning-of-annotations says, so let's set the return type to None. Fortunately this pattern can be handled mechanically, so I wrote a quick script to take care of it.
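
The script itself is not shown in the post. The following is only a hypothetical sketch of how such a mechanical fix could look, assuming single-line __init__ signatures; the names INIT_WITHOUT_RETURN and add_none_return are made up for illustration.

```python
import re

# Single-line "def __init__(...):" with no return annotation yet.
# Multi-line signatures and lines with trailing comments are deliberately skipped.
INIT_WITHOUT_RETURN = re.compile(
    r"^([ \t]*def __init__\(.*\))[ \t]*:[ \t]*$",
    re.MULTILINE,
)


def add_none_return(source: str) -> str:
    """Append "-> None" to every matching __init__ definition in source."""
    return INIT_WITHOUT_RETURN.sub(r"\1 -> None:", source)


before = (
    "class Point:\n"
    "    def __init__(self, x: int, y: int):\n"
    "        self.x = x\n"
    "        self.y = y\n"
)
print(add_none_return(before))
```

Definitions that are already annotated do not match the pattern, so running the function twice leaves the source unchanged.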

imported but unused

We use flake8, a static analysis tool for Python, as a check on our CI server. One of flake8's checks looks for unused import statements, so importing a name and never using it produces an error. This becomes a problem when you want to annotate variables rather than functions. In Python 3.5, variables are annotated with comments. (Python 3.6 adds a dedicated variable annotation syntax, PEP 526.)

from typing import Optional
a = None  # type: Optional[int]

I write variable annotation like this, but since it is a comment, when I execute flake8

$ flake8 ./test.py
./test.py:1:1: F401 'Optional' imported but unused

I get an error. So I added a # noqa comment to the import line of each module where this error occurred to suppress it.

from typing import Optional  # noqa
a = None  # type: Optional[int]
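
As a side note, flake8 also accepts a specific error code after noqa, which keeps the other checks active on that line. A minimal sketch, with variable names chosen only for illustration:

```python
# Suppress only the unused-import check (F401) on this line.
from typing import Optional  # noqa: F401

retry_limit = None  # type: Optional[int]
print(retry_limit)
```

A bare # noqa silences every flake8 check on the line, so the narrower form is a little safer.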

From Python 3.6 you can write

a: Optional[int] = None

so the unused-import problem caused by variable annotations should come up less often.

Circular import

When you annotate module variables and methods, modules sometimes end up importing each other. That produces an import cycle and an error at runtime, so a workaround is needed. Python 3.5.2 added a variable called typing.TYPE_CHECKING, which is True only when a third-party tool such as a type checker reads the code. You can avoid the cycle by importing the target module under this variable, so that it is loaded only during analysis. The official mypy documentation also lists this as a solution: http://mypy.readthedocs.io/en/latest/common_issues.html#import-cycles

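from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by type checkers, so it cannot cause an import cycle at runtime
    from module import MyClass  # noqa


def func() -> "MyClass":
    # The name is not bound at module level at runtime, so import it lazily here
    from module import MyClass
    return MyClass()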
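
The snippet above depends on a module that is not shown. As a self-contained sketch of the same idea, the example below uses decimal from the standard library purely as a stand-in for the module that would otherwise cause the cycle; the function name describe is hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Skipped at runtime, so this import cannot take part in an import cycle.
    from decimal import Decimal


def describe(amount: "Decimal") -> str:
    # The quoted annotation is not evaluated when the function is defined,
    # so the name does not need to exist at runtime.
    return "amount={}".format(amount)


print(TYPE_CHECKING)  # False at runtime
```

The annotation has to stay quoted; an unquoted Decimal would raise NameError when the function is defined.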

Position of # type: ignore

You can suppress a mypy error by adding the comment # type: ignore, but where to put it turned out to be tricky. If the error is in an expression that spans multiple lines, the comment must go on the first line of the expression. A comment placed there also applies to the remaining lines of that expression.

class MyClass:
    def __init__(self, a: int, b: str) -> None:
        self.a = a
        self.b = b

MyClass(1,  # type: ignore
        1)  # To ignore the error on this line, the comment has to go on the line above

This is tracked in https://github.com/python/mypy/issues/1032, so one day it may become possible to ignore errors line by line.

typeshed

mypy can read not only annotations in the source but also stub files that define types, and the standard library and others are covered that way. Those stubs are maintained in typeshed and ship bundled with mypy. As a result, a mypy run also detects invalid calls to built-in functions and the standard library. However, the stubs themselves are generated from Python code, and minor discrepancies remain in places. Some were generated from a specific version of the source and have not kept up with later changes, so I think it's a good idea to send a PR as soon as you find one.
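
For readers who have not seen a stub, the sketch below shows roughly what one looks like. The module and function names are hypothetical and not taken from typeshed; a real stub would live in a .pyi file.

```python
# greeting.pyi (hypothetical stub; names are illustrative, not from typeshed)
from typing import List, Optional


def greet(name: str, times: int = ...) -> str: ...
def split_words(text: str, limit: Optional[int] = ...) -> List[str]: ...
```

A stub carries only signatures. The bodies and default values are replaced by ..., and the type checker reads the stub instead of the implementation.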

Unsolved issues

SQLAlchemy

Code that sets attributes dynamically or leans heavily on metaclasses is hard to annotate. With SQLAlchemy, annotating the source directly is difficult because the __init__ of the base class for table definitions accepts keyword arguments and applies them with setattr. You could define __init__ in each derived class, but I was reluctant to write an __init__ for every table just for type annotations. The code that used it was organized into its own subpackage, so I excluded that subpackage from checking entirely.
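
The difficulty can be sketched without SQLAlchemy. The Base class below is a plain-Python stand-in written for illustration, not SQLAlchemy's actual implementation, and the table classes are hypothetical.

```python
class Base:
    """Hypothetical stand-in for a declarative base; not SQLAlchemy itself."""

    def __init__(self, **kwargs: object) -> None:
        # Every keyword becomes an attribute, so the signature says nothing
        # about which names or types a table actually expects.
        for key, value in kwargs.items():
            setattr(self, key, value)


class User(Base):
    """Inherits the generic constructor: any keyword of any type is accepted."""


class TypedUser(Base):
    """Explicit __init__ that a type checker can verify, at the cost of boilerplate."""

    def __init__(self, name: str, age: int) -> None:
        super().__init__(name=name, age=age)


user = User(name="alice", age="not a number")  # no static error is possible here
typed = TypedUser(name="bob", age=30)
```

Writing a constructor like TypedUser.__init__ for every table is the boilerplate the post hesitates to add.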

After trying it

I've reached the point where mypy -s --fast-parser <dir> reports no errors. There were few patterns that simply could not be handled, and most fixes were straightforward. I did hit typeshed bugs, but adding ignore was enough, so I never got stuck on an error that would not go away.

I was also able to find some bugs. We had tests, but the bugs were in places the tests did not cover, such as error-handling paths. The fixing itself was fairly hard because I could not tell which type was being returned without reading the code properly, so I worked through it a little every day. There is no way around doing it steadily.

mypy itself is still under development, but I think it is already usable enough to adopt. It clearly helps when reading code, especially annotations on variables whose contents are not obvious, such as a = []. It's also nice to catch obviously wrong calls before running the tests. I'm not sure whether it helps with refactoring, but it makes functions that return overly complex types easier to notice, so I expect to be more careful when writing them.
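
As an illustration of the a = [] case, here is a minimal sketch assuming Python 3.6 or later for the variable annotation syntax. The variable names are made up.

```python
from typing import Dict, List

# Without an annotation, an empty literal gives neither the reader nor the
# type checker any hint about what the container is meant to hold.
scores: List[int] = []
files_by_owner: Dict[str, List[str]] = {}

scores.append(90)
files_by_owner.setdefault("alice", []).append("report.txt")
print(scores, files_by_owner)
```

On Python 3.5 the same information would go into a # type: comment instead.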

What to do in the future

- Put it into CI: I haven't run the check on the CI server yet, so I'll let the team members know first.
- Enable --strict-optional: check Optional types strictly. [experimental-strict-optional-type-and-none-checking](http://mypy.readthedocs.io/en/latest/kinds_of_types.html?highlight=strict#experimental-strict-optional-type-and-none-checking)
- Enable --disallow-untyped-defs: make functions without type annotations an error.
- Enable --disallow-untyped-calls: make calls to functions without type annotations an error.
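
To give a feel for what strict Optional checking asks for, here is a small sketch with hypothetical function names. With --strict-optional, an Optional value has to be checked for None before it is used as the underlying type.

```python
from typing import Optional


def find_level(name: str) -> Optional[int]:
    levels = {"alice": 3}
    return levels.get(name)  # None when the name is unknown


def next_level(name: str) -> int:
    level = find_level(name)
    if level is None:
        # Without this branch, strict Optional checking rejects "level + 1".
        return 1
    return level + 1


print(next_level("alice"), next_level("bob"))
```

The explicit None branch is exactly the kind of error-path handling that tests tend to miss.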

In that spirit, I'd like to tighten the options gradually while getting the check into CI.

Things to read as you go

**mypy will do it!!**
