Modern Python for intermediate users

Introduction

Who this article is for

- Those who want to learn more about Python development environments and tools
- Those who want a more modern Python environment

Who this article is not for

- Those who are new to Python itself
- Advanced Python users

What is and is not explained

Explained

- A rough description of each tool
- Why you would use the tool and what you gain from it
- Reference documents / URLs

Not explained

- Specific commands
- Detailed syntax

Modern Python

Since I started doing research in graduate school, I've been writing Python for quite some time. Python has many libraries that are easy to use for research, and it works very well for projects like research that need fast iteration.

However, precisely because you can try code quickly and change its behavior easily, Python is hard to operate stably. C++, for example, has to be compiled, so you need to think carefully about your design before implementing it. Python is a scripting language, so even with the most careless design you can somehow get things working. Such unthinking code, however, becomes a big debt later. In fact, the code I dashed off for a paper with a looming deadline now stands in my way, and I have to redesign and reimplement it from scratch.

Python, for better or for worse, has the following characteristics:

- No need to declare types
  - In exchange, you spend more time thinking about variable names
- Many modules
- Versions drift apart
- Abundant package managers
  - It's hard to know which one to use
- Loose design is tolerated
  - The code becomes hard to read

This time, I'll summarize the tools that may solve these problems. Actual usage and details are omitted, so please refer to other articles for those.

I want to get comfortable with Python and become an intermediate user.

Package manager

There are many Python package managers. I previously explained how to build a Python environment in the article "I can't seem to put an end to the python environment construction war with illustrations". Since then my understanding has progressed somewhat, and I feel reasonably settled on how to build a current Python environment, so I'd like to summarize it here.

Pattern 1: pyenv + pipenv

pyenv can **manage multiple Python versions**. Specifically, you can install Python 3.7, 3.8, and 3.9 side by side and switch between them as you like. Switching versions this way lets you support multiple projects. For example, the code for an old project may only work with Python 2.7. In cases like that, it is handy to be able to switch versions with pyenv.

However, although pyenv can switch versions, it cannot create virtual environments. In this article, a virtual environment means an isolated environment used for a single project.

Imagine what happens if you can't isolate environments. Suppose you developed a machine learning project and were then assigned to another project built on Django. The machine learning library PyTorch isn't needed in the Django project. As libraries pile up, things get slower and the required versions start to conflict. That is why we need virtual environments, where libraries are installed per project.

To solve this problem, pipenv can create multiple virtual environments on top of a given Python version. I recommend it because it wraps virtual environment tooling such as venv and adds many useful features. It also makes library version management smarter, though I won't go into details here.
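Tools like pipenv automate the creation of per-project environments. The standard library's venv module offers the same basic isolation, so as a rough sketch of what "one environment per project" means underneath, the following creates a bare environment in a temporary directory. The directory name project_env is a placeholder, and in day-to-day use the package manager handles this step for you.

```python
import tempfile
import venv
from pathlib import Path

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "project_env"  # placeholder name

    # with_pip=False keeps the example fast; real projects usually want pip.
    venv.create(env_dir, with_pip=False)

    # Every environment records its base interpreter in pyvenv.cfg.
    config_exists = (env_dir / "pyvenv.cfg").is_file()
    print("environment created:", config_exists)
```

Each project would get its own directory like this, so libraries installed for one project never leak into another.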

In summary, pyenv manages the version of Python itself, and pipenv manages the virtual environments for a particular Python version. On a Mac, you can install both pipenv and pyenv with brew.

Pattern 2: pyenv + poetry

pyenv was covered above. The new piece here is poetry.

poetry appears to be a package manager that handles libraries much as Rust's tooling does. I haven't used it yet, so I'm not familiar with it, but apparently a single toml file manages everything.

It looks like a fairly new manager, so I'd love to try it next.

Reference URL

Introducing types

Python is a scripting language that works without type declarations. As a result, you can develop quite fast in the early stages, but later on you spend more time thinking about variables and often suffer from run-time errors.

To code more safely, you can use typing and related features, introduced in Python 3.5. Note, however, that **no error is raised when the program runs, even if a variable holds a value of a different type**.

Execution does not stop even when the types differ at runtime. Still, annotations make it easier to benefit from your IDE and for others to understand the code, so write them liberally.
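To make the point about runtime behavior concrete, here is a small sketch (the function double is a made-up example). The annotation says int, but Python happily accepts a string and simply applies the operator it finds.

```python
def double(value: int) -> int:
    """Annotated as int -> int, but nothing enforces this at runtime."""
    return value * 2


# Passing a str violates the annotation, yet no error is raised:
# str * 2 is valid Python, so the call just repeats the string.
result = double("ab")
print(result)  # abab

# The annotations are only stored as metadata on the function.
print(double.__annotations__)
```

A static checker or an IDE would flag the call with "ab"; the interpreter itself does not.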

typing

def greeting(name: str) -> str:
    return 'Hello ' + name

Since Python 3.5 you can annotate function arguments and return values as shown above (variable annotations followed in Python 3.6). Specifying types makes it easier to benefit from your IDE, and in the long run it leads to more efficient coding.

There is also Final (provided by typing since Python 3.8), which lets you declare constants more safely.

Please see the reference URL for details.
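As a minimal sketch of Final, assuming Python 3.8 or later (the constant MAX_RETRIES and the helper attempts_left are hypothetical names for illustration):

```python
from typing import Final

# Final tells type checkers that this name must not be reassigned.
MAX_RETRIES: Final[int] = 3


def attempts_left(used: int) -> int:
    """Return how many retries remain."""
    return MAX_RETRIES - used


print(attempts_left(1))  # 2

# MAX_RETRIES = 5
# A type checker such as mypy should reject the reassignment above,
# although Python itself would run it without complaint.
```

Like other annotations, Final is a promise checked by tools, not by the interpreter.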

data classes

The dataclasses module provides a decorator and helper functions for easily creating classes that store data.

from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
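The following sketch repeats the class so that it runs on its own and shows what the decorator generates for you: an initializer, a readable repr, and value-based equality. The item values are made up.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


# __init__ is generated from the annotated fields.
item = InventoryItem("widget", 2.5, 4)

# __repr__ is generated too.
print(item)  # InventoryItem(name='widget', unit_price=2.5, quantity_on_hand=4)

print(item.total_cost())  # 10.0

# __eq__ compares field by field.
print(item == InventoryItem("widget", 2.5, 4))  # True
```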

namedtuple

You can declare a named tuple. Because it guarantees that the data cannot be rewritten, it is convenient for managing things that do not change.

Where possible, it is easier to declare the tuple as a class that inherits from typing.NamedTuple.

from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    id: int
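A short usage sketch (the employee values are invented) shows both sides of a named tuple: it still behaves like a tuple, and attempts to change a field are rejected.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    name: str
    id: int


employee = Employee(name="Guido", id=1)

# Fields are readable by name...
print(employee.name)  # Guido

# ...and the object still unpacks like a plain tuple.
name, employee_id = employee

# Assigning to a field raises AttributeError, so the data cannot be rewritten.
rejected = False
try:
    employee.id = 2
except AttributeError:
    rejected = True
print("assignment rejected:", rejected)  # assignment rejected: True
```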

Generics
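Generics let a function or class keep the relationship between its input and output types. A minimal sketch using typing.TypeVar follows; the function first_item is a hypothetical example, not something taken from a particular library.

```python
from typing import Sequence, TypeVar

# T stands for "whatever element type the caller passes in".
T = TypeVar("T")


def first_item(items: Sequence[T]) -> T:
    """Return the first element; the return type follows the element type."""
    return items[0]


number = first_item([1, 2, 3])   # a type checker infers int here
word = first_item(["a", "b"])    # and str here
print(number, word)  # 1 a
```

Without the type variable, the return type would have to be declared as something vague like Any, and the checker could no longer follow it.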

pydantic

This library is used by FastAPI. It validates data against type annotations at runtime.

from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = 'John Doe'  # annotated so it also works on pydantic v2
    signup_ts: Optional[datetime] = None
    friends: List[int] = []
    
external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
user = User(**external_data)

If the data cannot be converted to the declared types, it throws an exception **even at runtime**. (typing alone, by contrast, raises no run-time error, so pydantic makes it easier to see which type is wrong and is more robust.)

Reference URL

docstring

A docstring is a string that documents a function, class, or module.

It describes, as a string, what arguments or attributes the function or class has and how it behaves.

def add(x: int, y: int) -> int:
    """add function.

    Calculate the sum of x and y.

    Parameters
    ----------
    x : int
        Number to be added
    y : int
        Number to add

    Returns
    -------
    int
        The sum of x and y

    Notes
    -----
    I usually don't write this much for a function like this (because it is obvious)
    """
    return x + y

Benefits of docstring

Writing a docstring has the following advantages.

- You can convey the behavior of functions and classes to other team members
- The documentation stays close to the code
- IDEs can show more detailed type and supplementary information
- As described later, docstrings can be used with Sphinx, an automatic documentation tool
- Above all, your future self will benefit
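Part of the reason tools can make use of docstrings is that they are stored on the object itself. A small sketch with the standard inspect module (the add function here is a shortened stand-in):

```python
import inspect


def add(x: int, y: int) -> int:
    """Return the sum of x and y.

    Parameters
    ----------
    x : int
        First number.
    y : int
        Second number.
    """
    return x + y


# inspect.getdoc returns the docstring with its indentation cleaned up,
# which is roughly what help() and IDE tooltips build on.
summary = inspect.getdoc(add).splitlines()[0]
print(summary)  # Return the sum of x and y.
```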

Docstring styles

There are several ways to write a docstring. The main ones are the reStructuredText, NumPy, and Google styles.

Each has its own conventions, so pick the one you like. If your team has already decided how to write docstrings, follow that style.

There are few references on how to write docstrings, so it helps to look at how an existing library does it.

Reference URL

- NumPy style Python Docstrings example

Style guide

When multiple people write code, each has their own habits: whether to use " or ', how long lines may be, how to name variables, and so on.

Code checkers, linters, formatters, and the like exist to unify differing code styles. Using them lets you write more consistent Python code.

In this section I only touch on the ones I use. There are other linters besides these, so please look into them as well.

PEP

PEP stands for Python Enhancement Proposal: a document proposing an improvement to Python. The most famous is PEP 8, the coding convention.

PEP 8 is the coding standard followed by the standard library, among others, and most Python code is based on it.

flake8

flake8 is a code style checker that can be installed with pip. It checks whether you follow the coding standards.

flake8 is a wrapper around three libraries: pyflakes, pycodestyle, and mccabe.

With flake8 you can set detailed rules, such as the maximum number of characters per line. Also, if you install flake8 plugins with pip, they run automatically whenever you execute the flake8 command.

black

black is a code formatter. flake8 only tells you where a convention was violated, whereas black actually reformats the code.

black is relatively new and has very few settings that can be changed. As a result, using black forces a similar format across many projects.

black is very easy to use, so I definitely recommend it.

mypy

mypy statically analyzes the type annotations in your code and reports type mismatches. Thanks to mypy, all you have to do is fix the types it points out.

However, it may report errors for libraries you use; in that case you need to generate stubs or install already-distributed stubs with pip.
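As an illustration of the kind of mistake a static checker is meant to catch, consider a lookup that may return None. The names USERS, find_user, and shout_user are hypothetical, and the exact wording of the checker's report is not reproduced here.

```python
from typing import Dict, Optional

USERS: Dict[int, str] = {1: "alice"}


def find_user(user_id: int) -> Optional[str]:
    """Return the user name, or None when the id is unknown."""
    return USERS.get(user_id)


# find_user(2).upper()
# The line above is the typical bug: the result may be None. A checker such
# as mypy should flag it before the program runs; at runtime it would raise
# AttributeError.


def shout_user(user_id: int) -> str:
    """Handle the None case explicitly so the types line up."""
    name = find_user(user_id)
    if name is None:
        return "UNKNOWN"
    return name.upper()


print(shout_user(1))  # ALICE
print(shout_user(2))  # UNKNOWN
```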

isort

isort fixes the order of Python imports. flake8 has an isort plugin, so a good workflow is to run isort whenever you are warned that imports are out of order.

Reference URL

Configuration files

When you create a Python project, several configuration files appear. This section describes them.

setup.py

setup.py is a file used to distribute the project to third parties. It uses a module called setuptools to build a package that lets others install your project with pip. You describe the package information, installation requirements, URL, and so on.

# https://packaging.python.org/tutorials/packaging-projects/?highlight=setup.py#creating-setup-py
import setuptools

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name="example-pkg-YOUR-USERNAME-HERE", # Replace with your own username
    version="0.0.1",
    author="Example Author",
    author_email="author@example.com",
    description="A small example package",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/pypa/sampleproject",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires='>=3.6',
)

The command pip install numpy, which we usually type without a second thought, fetches a package built with setup.py from PyPI and puts it in a directory called site-packages.
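If you are curious where site-packages lives for the interpreter you are running, the standard library can tell you. This is only a sketch, assuming Python 3.8 or later for importlib.metadata; pip may or may not be installed in a given environment, so that lookup is guarded.

```python
import sysconfig
from importlib import metadata

# "purelib" is the install location for pure-Python packages,
# i.e. the site-packages directory of the current interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)

# Installed packages carry metadata that can be queried by name.
try:
    pip_version = metadata.version("pip")
except metadata.PackageNotFoundError:
    pip_version = None
print("pip version:", pip_version)
```

Inside a virtual environment the printed path points into that environment, which is exactly how libraries stay separated per project.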

If you want to publish your project online as a package, write setup.py.

MANIFEST.in

When building a package from a project with setup.py, you sometimes want to include non-Python files, such as image or audio files.

In that case, creating a file called MANIFEST.in lets you include such files in the package when you build.

setup.cfg

setup.py is required to publish and install a package. However, if you hard-code values such as the author or the files to include directly in it, they become hard to change later.

By creating an additional configuration file, setup.cfg, you can manage the information used by the package separately. When you run setup.py and a setup.cfg is present, its contents are read, the corresponding values are overridden, and then the package is built.

[metadata]
name = my_package
version = attr: src.VERSION
description = My package description
long_description = file: README.rst, CHANGELOG.rst, LICENSE.rst
keywords = one, two
license = BSD 3-Clause License
classifiers =
    Framework :: Django
    License :: OSI Approved :: BSD License
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.5

[options]
zip_safe = False
include_package_data = True
packages = find:
scripts =
    bin/first.py
    bin/second.py
install_requires =
    requests
    importlib; python_version == "2.6"

requirements.txt

A file that lists the packages to install with pip. It doesn't have to be named requirements.txt, but that is the customary name.

If you create this file and include it in your GitHub repository, others can easily install the packages with pip install -r requirements.txt.

However, requirements.txt has problems: it is poorly suited to dependency resolution, and updating library versions is difficult. For that reason, pipenv uses a different package management file called Pipfile, and poetry uses pyproject.toml.

Pipfile/Pipfile.lock

Pipfile and Pipfile.lock are pipenv's management files, which solve the problems of requirements.txt.

The Pipfile lists the libraries you **depend on directly**. For example, for a project that fetches URLs using requests, you would create the following Pipfile.

[[source]]
url = "https://pypi.python.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "*"

Here, requests itself depends on other libraries. For example, it internally uses a library called chardet, which detects character encodings.

The version of chardet is not written in the Pipfile. Pipfile.lock, on the other hand, lists the versions of all libraries.

The top-level libraries you use directly are managed in the Pipfile, which makes it easy to consider upgrades. The versions of all dependent libraries are pinned in Pipfile.lock, which makes it easy for others to reproduce the same environment. Separating the files in this way solves the dependency problem.
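Pipfile.lock is a JSON file, so the split between direct and transitive dependencies can be inspected with a few lines of Python. The excerpt below is illustrative only: a real lock file also contains a _meta section and hashes, and the version numbers here are made up for the example.

```python
import json

# Illustrative excerpt of a Pipfile.lock (versions are invented).
LOCK_TEXT = """
{
  "default": {
    "requests": {"version": "==2.25.1"},
    "chardet": {"version": "==4.0.0"}
  },
  "develop": {}
}
"""

# What the Pipfile's [packages] section lists directly.
DIRECT = {"requests"}

lock = json.loads(LOCK_TEXT)

# Every package is pinned in the lock file, whether or not
# it appears in the Pipfile.
pinned = {name: info["version"] for name, info in lock["default"].items()}

for name in sorted(pinned):
    kind = "direct" if name in DIRECT else "transitive"
    print(f"{name}{pinned[name]} ({kind})")
```

The Pipfile stays short and readable, while the lock file carries the exact versions needed to rebuild the same environment.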

pyproject.toml

pyproject.toml is a configuration file, defined in PEP 518, that manages package settings. The package manager poetry uses this file, but it is not limited to poetry: any package manager that supports it can use pyproject.toml.

Previously, publishing a package required many files, such as requirements.txt, setup.py, setup.cfg, and MANIFEST.in. pyproject.toml can replace all of them with a single file.

I'm currently using pipenv, but I'm interested in poetry and pyproject.toml, so I plan to try them.

tox.ini

tox is a library that automates Python testing. Running the tox command automatically executes all the tests described in tox.ini.

If you only run pytest, typing that one command each time is enough. But depending on what you distribute, you may want to test against multiple versions such as Python 2.7, 3.8, and 3.9. You may also want to check compliance with coding standards using flake8.

With tox and its configuration file tox.ini, all of these can be run with a single command. What's more, tox creates a separate Python environment for each entry inside a special directory it manages. As a result, every test runs in its own virtual environment, with no dependencies between tests.

[tox]
# Environments to run
# flake8-py38 matches a section name, so [testenv:flake8-py38] is run
# py38 has no [testenv:py38] section, so [testenv] is run
envlist =
    py38
    flake8-py38
    mypy-py38

[testenv]
deps = pipenv
# Commands to run in the test
# This only runs pipenv install, so of course the test passes
commands =
    pipenv install

[testenv:flake8-py38]
basepython = python3.8
description = 'check flake8-style is ok?'
commands =
    pipenv install
    pipenv run flake8 gym_md

# flake8 settings (flake8 reads them from the [flake8] section)
# https://flake8.pycqa.org/en/latest/user/configuration.html#configuration-locations
[flake8]
max-line-length = 88


[testenv:mypy-py38]
basepython = python3.8
description = 'check my-py is ok?'
commands =
    pipenv install
    pipenv run mypy gym_md

Reference URL

PyPI

PyPI is a site where Python libraries are uploaded. When you run pip install, packages are downloaded from here.

Testing

Python has several testing tools.

unittest

The standard test library is unittest. It is part of the standard library, so you can write tests without installing anything.

Subclass TestCase and define methods whose names start with test.

import unittest

class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # check that s.split fails when the separator is not a string
        with self.assertRaises(TypeError):
            s.split(2)

if __name__ == '__main__':
    unittest.main()

pytest

pytest is a third-party test library. Tests are written as plain functions, and pytest reports failures in more detail than unittest.

For example, when a value is wrong, as below, pytest prints the failing location and the actual value. I use pytest because it is easy to use and its output is easy to read.

# content of test_sample.py
def inc(x):
    return x + 1


def test_answer():
    assert inc(3) == 5

$ pytest
=========================== test session starts ============================
platform linux -- Python 3.x.y, pytest-6.x.y, py-1.x.y, pluggy-0.x.y
cachedir: $PYTHON_PREFIX/.pytest_cache
rootdir: $REGENDOC_TMPDIR
collected 1 item

test_sample.py F                                                     [100%]

================================= FAILURES =================================
_______________________________ test_answer ________________________________

    def test_answer():
>       assert inc(3) == 5
E       assert 4 == 5
E        +  where 4 = inc(3)

test_sample.py:6: AssertionError
========================= short test summary info ==========================
FAILED test_sample.py::test_answer - assert 4 == 5
============================ 1 failed in 0.12s =============================

doctest

doctest lets you run tests written in the docstrings described earlier. If a docstring contains usage examples marked with >>>, doctest checks that they actually produce the output shown.

You can't write tests as complex as with pytest. However, because they live in the docstring, they double as tests and as usage examples for others. This makes the code's behavior easier to understand and the code easier to modify.

def square(x):
    """Return the square of x.

    >>> square(2)
    4
    >>> square(-2)
    4
    """

    return x * x

if __name__ == '__main__':
    import doctest
    doctest.testmod()

tox

As mentioned earlier, you can automate multiple test commands by describing them in tox.ini.

Reference URL

Documentation

Sphinx is a tool that makes it easy to create beautiful documentation. Many Python library references are written with Sphinx.

Sphinx uses a markup language called reStructuredText. It ships with sphinx-apidoc, a command that generates documentation automatically: if you write docstrings properly in your Python code, you can produce documentation with a single command. So by writing docstrings you record types and other information for the future, and they become the reference as-is. This makes it less likely that the documentation falls behind the code and turns into a hollow shell.

However, the rst files generated by sphinx-apidoc are only defaults, so you will have to edit them yourself to polish the result.
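Sphinx is configured through a Python file named conf.py. A minimal sketch of the part relevant to docstrings might look like this; the project name is a placeholder, and a real conf.py generated by Sphinx contains more settings.

```python
# conf.py (sketch; the project name is a placeholder)
project = "my_package"

extensions = [
    # Pulls documentation out of the docstrings in your code.
    "sphinx.ext.autodoc",
    # Lets Sphinx understand NumPy-style and Google-style docstrings.
    "sphinx.ext.napoleon",
]
```

Both extensions ship with Sphinx itself, so enabling them requires no additional installation.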

Reference URL

cookiecutter

cookiecutter is a tool that makes it easy to create well-organized Python projects. Available templates (https://github.com/topics/cookiecutter-template) are published on GitHub, making it easy to create well-designed projects locally. Think of it as a tool that generates a project locally from boilerplate while filling in your own information.

For example, you can easily create a project with the following features by specifying cookiecutter-pypackage.

- Sophisticated project layout
- Test automation with Travis CI
- Documentation with Sphinx
- Testing in multiple environments with tox
- Automatic release to PyPI
- CLI interface (click)

It's nice that it sets up everything I have explained so far.

Reference URL

Conclusion

I've summarized Python development tools. While writing this, I realized there is still a lot I'm not familiar with and haven't mastered.

I want to practice every day so that I can write pythonic code.
