Type annotations for Python2 in stub files!

def main(args: Any) -> None:

The appeal of Python is that you can write it without specifying types. Thanks to that, even a small script written in a few minutes can get a lot done. But that very dynamism can backfire as a program grows. There is a limit to how much information the human brain can hold at once, so bugs caused by type mismatches creep in somewhere. Professional programmers who were fed up with this thought: Python should be type-aware too.

As a result, from Python 3.5, which introduced type hints (PEP 484), you can specify types for a function's arguments and return value. It is strictly optional: use it if you want to.

def fizzbuzz(num: int) -> None:
    for x in range(num):
        if x % 15 == 0:
            print("fizzbuzz")
        elif x % 3 == 0:  # multiples of 3 are "fizz"
            print("fizz")
        elif x % 5 == 0:  # multiples of 5 are "buzz"
            print("buzz")
        else:
            print(x)
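
Annotations are recorded on the function but the interpreter does not enforce them, which is what makes them optional. A minimal sketch, using a hypothetical `double` function that is not part of the article's code:

```python
def double(num: int) -> int:
    """Return num doubled (hypothetical helper for illustration)."""
    return num * 2


# The annotations are stored on the function object...
annotations = double.__annotations__

# ...but nothing checks them at run time: a str is accepted
# and is simply repeated by the * operator.
unchecked = double("ab")

print(annotations)
print(unchecked)
```

Catching the second call is the job of an external checker, not of Python itself.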

Python 3.6 is due out by the end of this year (2016). It takes this a step further and lets you specify types for variables as well.

from typing import List, Dict
primes: List[int] = []

captain: str  # Note: no initial value!

class Starship:
    stats: Dict[str, int] = {}

From [What's New In Python 3.6](https://docs.python.org/3.6/whatsnew/3.6.html "What's New In Python 3.6 - Python 3.6.0b4 documentation")

But of course, once you write types into the code itself, that code no longer runs on Python 2. Fortunately, there is a good alternative that works with both 3 and 2. It differs from putting types in docstrings or comments: only the type information is cut out and moved to a separate file.

The stub is born

In other words, you create a file that contains only the type information and type-check the code above against it. This is called a stub file, and it was proposed in PEP 484. Inside is a skeleton of the Python code you want to check. Pass it to a tool with a type-checking function and the tool will point out where the defects are. Stub files can be used with mypy (how to write them for mypy: [Creating Stubs For Python Modules](https://github.com/python/mypy/wiki/Creating-Stubs-For-Python-Modules "Creating Stubs For Python Modules")) and PyCharm. The way the file itself is written is the same for both.

Who this is for

For those who say, "I know the structure of my own modules, but I want smarter code completion when using libraries", go to [typeshed: Collection of library stubs for Python, with static types](https://github.com/python/typeshed "python/typeshed: Collection of library stubs for Python, with static types"). Volunteers have created stub files there for standard modules and external libraries. The point of applying stubs to scripts that are not libraries is that the scripts stay executable regardless of the Python version. I also think it is an advantage that development gets easier because IDE completion becomes more accurate.

What it looks like

Let's actually make one. The extension of a stub file is .pyi. Place it in the same directory as the code you want to inspect. Once the type information has been moved to the stub, you can delete any that was attached to the code itself. By the way, PyCharm's list of new file types has no entry for .pyi, but if you create the file manually it is recognized automatically and consulted from then on. The type inference priority seems to be docstring < annotations written directly in the code < stub.
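
The layout can be sketched as follows. The module name `greet` and its contents are hypothetical, and the files are written to a temporary directory purely so that the example is self-contained; in practice you would create the two files by hand next to each other.

```python
import ast
import os
import tempfile

# Hypothetical module body: no annotations, so it also runs on Python 2.
MODULE_SOURCE = '''\
def greet(name):
    return "Hello, " + name
'''

# Matching stub: signatures only, with "..." in place of the body.
STUB_SOURCE = '''\
from typing import Text

def greet(name: Text) -> Text: ...
'''

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "greet.py")
stub_path = os.path.join(workdir, "greet.pyi")  # same directory, .pyi extension

for path, source in ((module_path, MODULE_SOURCE), (stub_path, STUB_SOURCE)):
    with open(path, "w") as handle:
        handle.write(source)

# The stub is syntactically valid Python, so the standard parser accepts it.
stub_tree = ast.parse(STUB_SOURCE)
stub_functions = [node.name for node in stub_tree.body
                  if isinstance(node, ast.FunctionDef)]

print(sorted(os.listdir(workdir)))
print(stub_functions)
```

A tool that understands stubs can then take the types for `greet` from greet.pyi while the interpreter keeps running the unannotated greet.py.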


Before

Suppose you have code like this. It may look unnatural because I packed in as many elements as possible for the sake of explanation, but it does work.

First, ordinary code of the kind you see everywhere, without any type information.

import json
import logging
import requests
import sys


class Resources:
    POST_URL = "https://httpbin.org/post"

    def __init__(self, auth, logger):
        self.auth = auth
        self.logger = logger
        self.session = self.get_session()
        self.res = self.get_resources()

    def get_session(self):
        return requests.session()

    def get_resources(self):
        return json.loads(self.session.post(
            self.POST_URL, params=self.auth).text)

    def get_infos(self, queue):
        if isinstance(queue, str):
            return str(self.res.get(queue, ""))
        else:
            return {key: self.res.get(key, "") for key in queue}

class FooLogger(logging.Logger):
    def __init__(self):
        super(FooLogger, self).__init__("foobar", logging.INFO)
        self.logger = logging.getLogger()

        log_stdout = logging.StreamHandler(sys.stdout)
        self.addHandler(log_stdout)


r = Resources({"name": "watashi", u"String": u"Mojimoji"}, FooLogger())
print(r.get_infos(["args", "origin"]))
print(r.get_infos("origin"))

Adding type information to it gives the following. At this point the code runs on Python 3 only.

from typing import List, TypeVar, Union, Dict, Text
import json
import logging
import requests
import sys

DatabaseType = Dict[Text, Union[int, Text, Dict[Text, Text], None]]
LoggerType = TypeVar("LoggerType", bound=logging.Logger)


class Resources:
    POST_URL = "https://httpbin.org/post"

    def __init__(self, auth: Dict[Text, Text], logger: LoggerType) -> None:
        self.auth = auth
        self.logger = logger
        self.session = self.get_session()
        self.res = self.get_resources()

    def get_session(self) -> requests.Session:
        return requests.session()

    def get_resources(self) -> Dict:
        return json.loads(self.session.post(
            self.POST_URL, params=self.auth).text)

    def get_infos(self, queue: Union[List[Text], Text]) ->\
            Union[DatabaseType, Text]:
        if isinstance(queue, Text):
            return str(self.res.get(queue, ""))
        else:
            return {key: self.res.get(key, "") for key in queue}

class FooLogger(logging.Logger):
    def __init__(self) -> None:
        super().__init__("foobar", logging.INFO)
        self.logger = logging.getLogger()

        log_stdout = logging.StreamHandler(sys.stdout)
        self.addHandler(log_stdout)


r = Resources({"name": "watashi", "String": "Mojimoji"}, FooLogger())
print(r.get_infos(["args", "origin"]))
print(r.get_infos("origin"))

After

The contents of the stub file for this code are as follows. Variable types are specified with the `# type:` comment notation.

from typing import List, TypeVar, Union, Dict, Text, overload
import logging
import requests


# alias
DatabaseType = Dict[Text, Union[int, Text, Dict[Text, Text], None]]

# Generics
LoggerType = TypeVar("LoggerType", bound=logging.Logger)


class Resources:
    POST_URL = ...  # type: Text

    def __init__(self, auth: Dict[Text, Text], logger: LoggerType) -> None:
        self.auth = ...  # type: Dict[Text, Text]
        self.logger = ...  # type: LoggerType
        self.session = ...  # type: requests.Session
        self.res = ...  # type: Dict

    def get_session(self) -> requests.Session: ...

    def get_resources(self) -> Dict: ...

    @overload
    def get_infos(self, queue: Text) -> Text: ...
    @overload
    def get_infos(self, queue: List[Text]) -> DatabaseType: ...


class FooLogger(logging.Logger):
    def __init__(self) -> None:
        # logging.getLogger() returns a plain Logger, so use the concrete type
        self.logger = ...  # type: logging.Logger

Description

First of all, a stub only gives an outline of the code, so you do not write any implementation. The body of each function is replaced with `...`. Arguments and constants that have initial values also get `...` in place of the value. So a stub merely looks like Python; it is not meant to be executed.

@overload

Only one element is actually unique to stubs; everything else is used exactly as in the typing module itself. That one element is `@overload`. In the code above, `get_infos()` returns a dictionary when given a list and a string when given a string. If you write it as in the annotated code, `def get_infos(self, queue: Union[List[str], str]) -> Union[DatabaseType, str]:`, you cannot tell list → dictionary apart from list → string. This is where overloading comes into play: it makes the combinations of argument and return type explicit.
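
In current versions of the typing module, `@overload` can also be used in an ordinary module as long as one undecorated implementation follows the overloads. A minimal runnable sketch, with a module-level `get_infos` and a made-up `DATA` dictionary standing in for the `Resources` class:

```python
from typing import Dict, List, Text, Union, overload

# Hypothetical stand-in for the data held in Resources.res.
DATA = {"origin": "127.0.0.1", "args": ""}  # type: Dict[Text, Text]


@overload
def get_infos(queue: Text) -> Text: ...
@overload
def get_infos(queue: List[Text]) -> Dict[Text, Text]: ...


def get_infos(queue: Union[Text, List[Text]]) -> Union[Text, Dict[Text, Text]]:
    """Single run-time implementation shared by both overloads."""
    if isinstance(queue, Text):
        return DATA.get(queue, "")
    return {key: DATA.get(key, "") for key in queue}


single = get_infos("origin")             # a checker sees Text here
several = get_infos(["args", "origin"])  # a checker sees Dict[Text, Text] here
print(single)
print(several)
```

In a stub file only the two decorated signatures would appear; the implementation stays in the .py file.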

String

With Python 2 compatibility in mind, using `str` as the string type is awkward. `Text` behaves as "a string": `str` on Python 3 and `unicode` on Python 2. If you also want to include `bytes`, `AnyStr` is available.
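
The difference between the two can be tried out on Python 3 with a small sketch; the functions `shout` and `join_twice` are hypothetical names chosen for illustration.

```python
from typing import AnyStr, Text


def shout(message: Text) -> Text:
    """Accept text only (str on Python 3, unicode on Python 2)."""
    return message.upper()


def join_twice(part: AnyStr) -> AnyStr:
    """Accept str or bytes; the result has the same type as the argument."""
    return part + part


print(Text is str)  # on Python 3, Text is an alias of str
print(shout("fizz"))
print(join_twice("ab"))
print(join_twice(b"ab"))
```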

Numbers

There seems to be no ready-made generic type that covers both `int` and `float`. So is `T = TypeVar('T', int, float)` the safe way to say it?
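
The constrained TypeVar suggested above can be sketched like this; the function `twice` is a hypothetical example, and whether this is the best choice for a given API is left open.

```python
from typing import TypeVar

# T may be int or float, and nothing else (a constrained TypeVar).
T = TypeVar("T", int, float)


def twice(value: T) -> T:
    """Return value * 2; an int stays an int and a float stays a float."""
    return value * 2


print(T.__constraints__)
print(twice(3))
print(twice(1.5))
```

PEP 484 also states that an argument annotated as `float` accepts an `int`, so a plain `float` annotation may be enough when the return type does not need to follow the argument type.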

List, Tuple, Dict, etc.

These correspond to the usual list, tuple, and dict. I import them from the typing module, but using the lowercase built-ins instead is also fine, because the implementation looks like this:

class List(list, MutableSequence[T], extra=list):

    def __new__(cls, *args, **kwds):
        if _geqv(cls, List):
            raise TypeError("Type List cannot be instantiated; "
                            "use list() instead")
        return list.__new__(cls, *args, **kwds)

Given that, it seems not much different from a plain `list()`. By the way, `Dict[str, str]` means "the key type is str and the value type is str". `Dict[str]` is invalid.
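
Both points can be checked at run time with a short sketch; `count_words` is a hypothetical helper, annotated with the lowercase `list` for its argument and `Dict[str, int]` for its result.

```python
from typing import Dict


def count_words(words: list) -> Dict[str, int]:
    """Count occurrences of each word (hypothetical helper)."""
    counts = {}  # type: Dict[str, int]
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Dict takes exactly two parameters: key type, then value type.
print(Dict[str, int].__args__)

# A single parameter is rejected as soon as the expression is evaluated.
try:
    Dict[str]
    single_parameter_ok = True
except TypeError:
    single_parameter_ok = False

print(single_parameter_ok)
print(count_words(["a", "b", "a"]))
```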

Union and Optional

Union literally describes a combination of several types. `Union[int, str]` means that either int or str can be received (or returned). Optional is a Union in which one of the members is None. In other words, `Optional[str] == Union[str, None]`. This (maybe) gives you null safety, which is a hot topic these days.
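
A small sketch of how an Optional return value forces the None case into the open; the `CAPTAINS` table and both functions are hypothetical.

```python
from typing import Dict, Optional, Union

# Hypothetical lookup table for illustration.
CAPTAINS = {"Enterprise": "Kirk"}  # type: Dict[str, str]


def find_captain(ship: str) -> Optional[str]:
    """Return the captain's name, or None if the ship is unknown."""
    return CAPTAINS.get(ship)


def describe(ship: str) -> str:
    captain = find_captain(ship)
    if captain is None:  # the None case has to be handled explicitly
        return ship + ": unknown"
    return ship + ": " + captain


print(Optional[str] == Union[str, None])
print(describe("Enterprise"))
print(describe("Voyager"))
```

If the `is None` branch were left out, a type checker could report that `captain` may be None where a str is required.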

alias

`get_infos()` returns a dictionary. Just writing Dict would do, of course, but what if you define it in detail? Writing something as long as `Dict[str, Dict[str, Union[int, str, List[str]]]]` over and over is a hassle, and copy-pasting is a source of bugs. So bind it to a variable. This is called an alias.
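
As a sketch, the long type from the paragraph above can be named once and reused in several signatures; `InfoType`, `make_info`, and `merge` are hypothetical names for illustration.

```python
from typing import Dict, List, Union

# Alias: name the long type once, then reuse the name everywhere.
InfoType = Dict[str, Dict[str, Union[int, str, List[str]]]]


def make_info(name: str, tags: List[str]) -> InfoType:
    """Build a nested record (hypothetical example data)."""
    return {name: {"length": len(name), "label": name.upper(), "tags": tags}}


def merge(first: InfoType, second: InfoType) -> InfoType:
    """Combine two records; the alias keeps the signature short."""
    merged = dict(first)
    merged.update(second)
    return merged


combined = merge(make_info("foo", ["a"]), make_info("bar", []))
print(sorted(combined))
print(combined["foo"])
```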

Generics

FooLogger is a class for logging, but the main processing does not care what its specific name is; all that matters is that it inherits from logging.Logger. Generics are for times like this. Python-style generics are written as `T = TypeVar("T", bound=logging.Logger)`. In this case T is limited to logging.Logger and its subclasses. In general, it is more common to write `T = TypeVar("T")` without specifying bound.

By the way, `LoggerType = TypeVar("T")` is no good. The variable name and the type name in the string must match.
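
A minimal sketch of a bound TypeVar in use; `QuietLogger` and `set_warning_level` are hypothetical and only serve to show that the concrete subclass type is preserved.

```python
import logging
from typing import TypeVar

# The string repeats the variable name; bound limits T to Logger subclasses.
LoggerType = TypeVar("LoggerType", bound=logging.Logger)


class QuietLogger(logging.Logger):
    """Hypothetical Logger subclass used only for this example."""


def set_warning_level(logger: LoggerType) -> LoggerType:
    """Accept any Logger (sub)class instance and return the same type."""
    logger.setLevel(logging.WARNING)
    return logger


quiet = set_warning_level(QuietLogger("quiet"))
print(type(quiet).__name__)
print(LoggerType.__name__, LoggerType.__bound__)
```

Because the return type is the TypeVar rather than logging.Logger, a checker keeps treating `quiet` as a `QuietLogger` after the call.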
