What is Python's __init__.py?

When you start using Python, you want to manage your files in a directory hierarchy. That's where the __init__.py file comes up.

So what on earth is this file? There is plenty of information out there, but I could not find an explanation that made sense to me. Even in the Python documentation, it is hard to tell where to look for the right answer [^ Note 1].

[^ Note 1]: Reading "Python Tutorial - Modules" (in the documentation that matches the version you are using) appears to be the right answer. It is a tutorial, though, and when I first read it I could not understand it at all, so it slipped from my memory.

So I have summarized __init__.py here. (It is a little long.) It is written to be read from start to finish, so if you only want the conclusion, scroll down to the section "The role of __init__.py". The Python code examples were mainly run on 3.6 / 3.5 [^ Note 2].

[^ Note 2]: To reduce noise, the shebang (#!/usr/bin/env python) and the character encoding declaration (# -*- coding: utf-8 -*-) are omitted, so only the core part is shown.

  1. "Module", "Package" and "Namespace"
  2. Modules and hierarchy
  3. Single file module
  4. Hierarchical structure and namespace by directory
  5. Directory-namespace mapping
  6. The role of __init__.py
  7. Marker for module search
  8. Namespace initialization
  9. Definition of wildcard import target (definition of __all__)
  10. Definition of namespaces for other modules in the same directory
  11. Summary
  12. Notes on unittest (added following a comment from @methane)

"Module", "Package" and "Namespace"

Before getting into the main subject, here is a brief word about "modules", "packages", and "namespaces".

[^ Note 3]: "General Python terminology" describes a module as "the basic unit of code reusability in Python: a block of code imported by some other code."

Glossary: module An object that serves as an organizational unit of Python code. Modules have a namespace containing arbitrary Python objects. Modules are loaded into Python by the process of importing.

See also: "Python docs-Tutorial- Modules"

[^ Note 4]: "Python documentation - Tutorial - Packages" describes a package as "a module that contains other modules".

Glossary: package A Python module which can contain submodules or recursively, subpackages. Technically, a package is a Python module with an __path__ attribute.

Glossary: namespace The place where a variable is stored. Namespaces are implemented as dictionaries. There are the local, global and built-in namespaces as well as nested namespaces in objects (in methods). Namespaces support modularity by preventing naming conflicts. For instance, the functions builtins.open and os.open() are distinguished by their namespaces. Namespaces also aid readability and maintainability by making it clear which module implements a function. For instance, writing random.seed() or itertools.islice() makes it clear that those functions are implemented by the random and itertools modules, respectively.

Namespace example 1


import alpha.bravo.charlie

alpha.bravo.charlie.delta()

In this example (Example 1), the namespace is structured hierarchically as alpha → bravo → charlie, and we call the function delta() implemented in charlie.

You can intuitively understand that if the upper namespace is different, even if the lower namespace has the same name, it is a different item.

Namespace example 2


import os.path

os.path.join('/home', 'user')

Example 2 is the commonly used idiom for joining paths: it calls join() in the os.path namespace.
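The point that the same name in different namespaces refers to different things can be checked with two standard-library modules. This is only a small sketch: math and cmath are chosen here for illustration, they do not appear in the examples above.

```python
import cmath
import math

# Both modules define a function named sqrt, but each lives in its own namespace.
real_root = math.sqrt(4)        # real-valued square root
complex_root = cmath.sqrt(-4)   # complex-valued square root

print(math.sqrt is cmath.sqrt)  # False: same name, different objects
print(real_root, complex_root)
```

Because the namespace is always written out (math.sqrt, cmath.sqrt), there is no ambiguity about which function is being called.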

Modules and hierarchy

Single file module

Prepare two files in the same directory.

file organization


./
├─ module01.py .....module
└─ sample0010.py ...Executable file (= main module)

module01.py


def hello():
    print( "Hello, world!" )

sample0010.py


import module01
module01.hello()

Running sample0010.py


$ python3 sample0010.py
Hello, world!
$ python2 sample0010.py
Hello, world!

A single-file module can be imported just by putting it in the same directory. If you only want to split your code into files, you do not need to create a directory.

At this time, __init__.py is unnecessary. The file name is module01.py, but the module name is module01 (file name minus .py). The namespace is module01.
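The relationship between file name and module name can be checked with a self-contained sketch. It writes a module into a temporary directory instead of the current one, and the name demo_single_mod is hypothetical, used only so that nothing on your machine is overwritten.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a single-file module in a fresh temporary directory.
workdir = Path(tempfile.mkdtemp())
(workdir / "demo_single_mod.py").write_text(
    'def hello():\n    return "Hello, world!"\n'
)

sys.path.insert(0, str(workdir))   # make the directory importable
importlib.invalidate_caches()

import demo_single_mod

print(demo_single_mod.__name__)    # module name = file name minus ".py"
print(demo_single_mod.hello())
```

No __init__.py is created anywhere in this sketch; the directory only has to be on sys.path.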

Hierarchical structure and namespace by directory

Consider the following directory and file structure.

Sample 2 Hierarchical structure


./
├─ sample0020.py .....Executable file
└─ dir/
    └─ module02.py ...module

dir/module02.py


def hello():
    print( "Hello, world! from module02" )

To call module02 from sample0020.py, you need to write the following.

sample0020.py


import dir.module02
dir.module02.hello()

Execution result


$ python3 sample0020.py
Hello, world! from module02

$ python2 sample0020.py
Traceback (most recent call last):
  File "sample0020.py", line 1, in <module>
    import dir.module02
ImportError: No module named dir.module02

It worked as expected in python3, but python2 raised an error. This is because python2 requires __init__.py under dir/.

Since v3.3, a module can be imported even if its directory has no __init__.py, but a "regular package" is still expected to contain __init__.py.

What’s New In Python 3.3: PEP 420: Implicit Namespace Packages Native support for package directories that don’t require __init__.py marker files and can automatically span multiple path segments (inspired by various third party approaches to namespace packages, as described in PEP 420)

This behavior comes from the namespace packages newly added in v3.3; regular packages still require __init__.py.

Regular packages A regular package is typically implemented as a directory containing an __init__.py file.

Unless you are deliberately creating a namespace package, put __init__.py in every directory that holds your modules.
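The difference between the two kinds of package can be observed directly on Python 3.3 or later. The following is a minimal sketch under that assumption; the directory names demo_ns_pkg and demo_reg_pkg are hypothetical and are created in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Directory without __init__.py: imported as a namespace package.
(root / "demo_ns_pkg").mkdir()
(root / "demo_ns_pkg" / "mod.py").write_text("VALUE = 1\n")

# Directory with __init__.py: imported as a regular package.
(root / "demo_reg_pkg").mkdir()
(root / "demo_reg_pkg" / "__init__.py").write_text("")
(root / "demo_reg_pkg" / "mod.py").write_text("VALUE = 2\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_ns_pkg.mod
import demo_reg_pkg.mod

# A namespace package has no __init__.py behind it, so it has no file of its own.
ns_file = getattr(demo_ns_pkg, "__file__", None)
reg_file = demo_reg_pkg.__file__

print(ns_file)
print(Path(reg_file).name)
```

Both imports succeed on Python 3, but only the regular package is backed by a file, which is the __init__.py discussed in the rest of this post.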

Now let's take a look at the meaning of __init__.py.

Directory-namespace mapping

In sample 2, module02 had to be referenced as dir.module02 because it sits in the directory dir/. Here dir gets in the way: it must be spelled out as a level of the namespace hierarchy even though it has no entity of its own.

That is where __init__.py comes in. __init__.py lets you keep the directory hierarchy and still call the module directly by the name module02.

You could rename the directory from dir/ to module02/, but the call would still end up as module02.<something>.hello(). So the file __init__.py is given a special meaning: it is treated as the module whose namespace has the same name as the directory.

In other words, by writing the program in module02/__init__.py instead of dir/module02.py, you can call it as module02.

Sample 3: Hierarchical structure including __init__.py


./
├─ sample0030.py .....Executable file
└─ module02/
    └─ __init__.py ... Entity of "module02"

module02/__init__.py


def hello():
    print( "Hello, world! from __init__.py" )

sample0030.py


import module02
module02.hello()

Execution result


$ python2 sample0030.py
Hello, world! from __init__.py

$ python3 sample0030.py
Hello, world! from __init__.py

I do not know the historical background (I have not investigated it), but I suspect this was the original purpose of __init__.py. (Author's speculation.) A module (namespace) that exists as a file has a place to write its initialization, the equivalent of __init__(), but a namespace that exists as a directory has none. It seems that the file called __init__.py was introduced to fill that gap.

And since __init__.py represents the existence of the namespace, it would also have come to serve as a marker for modules. Hence, I imagine, the implementation required that any directory containing files to be loaded as modules must have __init__.py (= a directory containing __init__.py is part of an explicit namespace).

If you place __init__.py in a directory, it is treated as the entity of a module whose namespace has the same name as the directory. In other words, __init__.py is the file that maps a directory name to a module name (or an explicit "namespace"). (It plays the role of a constructor for the namespace named after the directory.)
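The mapping from directory name to module can be inspected on the imported object itself. This is a small sketch with a hypothetical package name, demo_dir_pkg, built in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_dir_pkg"
pkg_dir.mkdir()

# The only file in the directory is __init__.py, which defines hello().
(pkg_dir / "__init__.py").write_text(
    'def hello():\n    return "Hello from __init__.py"\n'
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_dir_pkg

print(demo_dir_pkg.__name__)             # the directory name
print(Path(demo_dir_pkg.__file__).name)  # the file behind the module
print(demo_dir_pkg.hello())
```

The module is named after the directory, while its code comes from __init__.py inside that directory.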

If you can understand this, you may find that __init__.py, which you didn't understand until now, becomes a little more familiar.

The role of __init__.py

With that in mind, if you read "Modules" in the Python tutorial, I think the role of __init__.py becomes easy to understand.

  1. __init__.py is a marker for module search.
  2. __init__.py initializes the namespace named after the directory in which it resides.
  3. __init__.py also defines the target of wildcard import in the namespace (definition of __all__).
  4. __init__.py defines namespaces for other modules in the same directory.

Items 2 to 4 could be grouped together as "initialization of the module or package", but I have separated them here.

1. Marker for module search

__init__.py is used as a marker when searching for modules in a directory hierarchy. For modules organized in directories to be loaded, __init__.py must exist. (Namespace packages, which do not require __init__.py, are not covered here.)

Regular packages Python defines two types of packages, regular packages and namespace packages. Regular packages are traditional packages as they existed in Python 3.2 and earlier. A regular package is typically implemented as a directory containing an __init__.py file.

2. Namespace initialization

As we have already seen, when a directory name is treated as a namespace (module), __init__.py holds the code that is executed first. Even when you import only a lower-level module, that module is executed after the upper namespace has been initialized. Note that any initialization required before the lower module is executed (loaded) must therefore be done in __init__.py before the lower module is imported.
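The execution order can be confirmed with a short sketch. The package name demo_order_pkg and its child module are hypothetical; printed output is captured so that the order can be examined afterwards.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_order_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text('print("parent package initialized")\n')
(pkg_dir / "child.py").write_text('print("child module loaded")\n')

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Capture what the import prints so the order can be inspected.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import demo_order_pkg.child   # only the child module is requested

order = buffer.getvalue().splitlines()
print(order)
```

Even though only the child module is requested, the parent's __init__.py runs first.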

3. Definition of wildcard import target (definition of __all__)

As described in "Python Tutorial - Importing * From a Package", the __all__ list defines what is imported when you write from my_module import *.

This is not specific to __init__.py; __all__ can be defined in any module.

See sample 4 below. Place two Python script files in the same directory.

Sample 4


./
├─ sample0040.py ...Executable file
└─ module04.py .....module

sample0040.py


from module04 import *

hello1()
hello2()
hello3()

sample0040.py imports with *, as in from module04 import *. It is a simple program that calls hello1(), hello2(), and hello3() in order after the import.

module04.py


__all__ = ['hello1', 'hello2']

def hello1():
    print( "Hello, this is hello1" )

def hello2():
    print( "Hello, this is hello2" )

def hello3():
    print( "Hello, this is hello3" )

module04.py defines hello1(), hello2(), and hello3(). The __all__ list contains only 'hello1' and 'hello2', not 'hello3'.

The execution result is as follows.

Execution result


$ python sample0040.py
Hello, this is hello1
Hello, this is hello2
Traceback (most recent call last):
  File "sample0040.py", line 5, in <module>
    hello3()
NameError: name 'hello3' is not defined

The call to hello3() failed with the error "NameError: name 'hello3' is not defined", because hello3 is not in the __all__ list. This does not mean that hello3() is hidden; it is only the behavior of import *. To check, import module04 without * and call hello3() explicitly through module04.

sample0041.py


import module04

module04.hello1()
module04.hello2()
module04.hello3()

Execution result


$ python sample0041.py
Hello, this is hello1
Hello, this is hello2
Hello, this is hello3

Defining __all__ in __init__.py simply defines the objects that can be referenced when the module whose namespace is the directory name is imported with *.
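The effect of __all__ can also be checked programmatically by running the wildcard import into an empty dictionary and listing what arrived. This is only a sketch; the module name demo_all_mod is hypothetical and mirrors module04.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "demo_all_mod.py").write_text(
    "__all__ = ['hello1', 'hello2']\n"
    "def hello1(): return 'hello1'\n"
    "def hello2(): return 'hello2'\n"
    "def hello3(): return 'hello3'\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the wildcard import inside a fresh namespace dictionary.
namespace = {}
exec("from demo_all_mod import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)

# hello3 is still reachable through the module itself.
import demo_all_mod
print(demo_all_mod.hello3())
```

Only the names listed in __all__ arrive through the wildcard import, while hello3 remains available as demo_all_mod.hello3.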

4. Definition of namespaces for other modules in the same directory

I wrote above that by defining functions and so on in __init__.py, you can call them as a module with the same name as the directory. As __init__.py grows, you will want to keep only the initialization in __init__.py and move the rest out into other files.

Try it with the following directory and file structure.

Sample 5


./
├─ sample0050.py ......Executable file
└─ module05
    ├─ __init__.py .... Initialization file of "module05"
    ├─ _module05.py ... Entity of "module05"
    └─ module06.py .... Additional module of "module05"

Assume that module05/_module05.py was either split out because __init__.py had grown too large, or developed from the beginning to be provided as module05. Its file name is the directory name with a leading underscore (_), to show that it is the entity of the module.

./module05/_module05.py


print( "in _module05.py" )

def hello(caller=""):
    print( "Hello, world! in _module05 called by {}".format(caller) )

Assume that module05/module06.py was developed outside __init__.py from the beginning.

./module05/module06.py


print( "in module06.py" )

def hello(caller=""):
    print( "Hello, world! in module06 called by {}".format(caller) )

Both hello() in _module05.py and hello() in module06.py take the caller as an argument so that you can tell who called them.

In __init__.py, because we load modules from the same directory, we put a dot (.) in front of _module05 and module06 to indicate the current package (the same namespace). Also, since the two hello() names conflict, we rename them with as.

./module05/__init__.py


print( "in __init__.py" )

# import _module05.hello() as hello05() in the same directory
from ._module05 import hello as hello05
# import module06.hello() as hello06() in the same directory
from .module06 import hello as hello06

__all__ = ['hello05', 'hello06']

# Do some initialization below
hello05("__init__.py")
hello06("__init__.py")

__all__ defines the objects that can be imported with *, and the calls under "# Do some initialization below" stand in for whatever initialization is needed.

The caller, sample0050.py, is as follows. It uses from module05 import *, so it imports only module05.

./sample0050.py


print( "in {} 1".format( __file__ ) )

from module05 import *

print( "in {} 2".format( __file__ ) )
hello05(__file__)
hello06(__file__)

The execution result is as follows.

Execution result


$ python3 sample0050.py
in sample0050.py 1
in __init__.py
in _module05.py
in module06.py
Hello, world! in _module05 called by __init__.py
Hello, world! in module06 called by __init__.py
in sample0050.py 2
Hello, world! in _module05 called by sample0050.py
Hello, world! in module06 called by sample0050.py

You can see that, through __init__.py, the functions in module05/_module05.py and module05/module06.py are called as part of module05.
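The same structure can be reproduced in a self-contained sketch. All names here (demo_facade_pkg, _impl, extra, hello_impl, hello_extra) are hypothetical stand-ins for module05, _module05, module06, hello05, and hello06.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_facade_pkg"
pkg_dir.mkdir()

(pkg_dir / "_impl.py").write_text('def hello():\n    return "hello from _impl"\n')
(pkg_dir / "extra.py").write_text('def hello():\n    return "hello from extra"\n')

# __init__.py re-exports both functions under distinct names via relative imports.
(pkg_dir / "__init__.py").write_text(
    "from ._impl import hello as hello_impl\n"
    "from .extra import hello as hello_extra\n"
    "__all__ = ['hello_impl', 'hello_extra']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_facade_pkg

print(demo_facade_pkg.hello_impl())
print(demo_facade_pkg.hello_extra())
print(demo_facade_pkg.hello_impl.__module__)   # where the function really lives
```

Callers only see the package namespace, while the __module__ attribute shows that the function still lives in the submodule.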

By the way, module05 / module06.py is not hidden, so you can call it directly.

Direct call to module05/module06.py


$ python3 -c "import module05.module06; module05.module06.hello('shell')"
in __init__.py
in _module05.py
in module06.py
Hello, world! in _module05 called by __init__.py
Hello, world! in module06 called by __init__.py
Hello, world! in module06 called by shell

Once you understand this, I think you will be able to develop modules that can be reused as "packages."

Summary

I examined the role of __init__.py while verifying it with actual code.

  1. You need __init__.py to import modules organized in a directory hierarchy. (The "Implicit Namespace Packages" added in v3.3, which work without __init__.py, are not covered here.)
  2. The module initialization process is described in __init__.py.

These two roles are what is usually described, but here I split the second role into three parts.

  2-1. Namespace initialization
  2-2. Definition of the wildcard import target (definition of __all__)
  2-3. Definition of namespaces for other modules in the same directory

For 2-3, you can also import and expose modules that are not in the same directory, but the first candidates are modules in the same directory.

As for the executable statements written in a module, the tutorial says they "are executed only the first time the module name is encountered in an import statement" [^ Note 5], so no matter how many times you repeat the import, they run only once. (If you call importlib.reload(), they are explicitly executed again.)
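The "executed only once" behavior, and the effect of importlib.reload(), can be observed by counting how often a module body prints. This is a minimal sketch; the module name demo_once_mod is hypothetical.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "demo_once_mod.py").write_text('print("module body executed")\n')

sys.path.insert(0, str(root))
importlib.invalidate_caches()

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import demo_once_mod            # first import: the body runs
    import demo_once_mod            # already in sys.modules: the body does not run
    runs_after_imports = buffer.getvalue().count("module body executed")
    importlib.reload(demo_once_mod)  # explicit reload: the body runs again

runs_after_reload = buffer.getvalue().count("module body executed")
print(runs_after_imports, runs_after_reload)
```

The second import statement is served from sys.modules, which is why any initialization written in __init__.py is not repeated on later imports.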

After reading this post, I think you will understand "Python Tutorial - Modules" better if you read it again.

I hope it helps you even a little.

Notes on unittest

Added following a comment from @methane (2020/01/20)

__init__.py also acts as a marker in a familiar place: unittest uses it when searching for test modules in the directories below the start directory.

An example is shown below.

Unittest test environment tree


./
├─ my_module
│   ├─ __init__.py ............. For mapping my_module
│   └─ _my_module.py ........... The entity of my_module
│
├─ tests-without__init__/ ...... Test directory without __init__.py
│   └─ unit/ ................... Hierarchy for unit testing
│       └─ test_my_module.py ... Test program
│
└─ tests-with__init__/ ......... Test directory with __init__.py
    └─ unit/ ................... Hierarchy for unit testing
        ├─ __init__.py ......... Just place the file; its contents are empty.
        └─ test_my_module.py ... Symbolic link to tests-without__init__/unit/test_my_module.py

Create my_module/__init__.py and my_module/_my_module.py to prepare my_module.

my_module/__init__.py - Load _my_module.py as my_module


__all__ = [
    'Sample',
    ]

from ._my_module import Sample

my_module/_my_module.py - Define the entity of my_module


#!/usr/bin/env python
# -*- coding: utf-8 -*-

import six

class Sample:
    def __init__(self):
        self.message = "__file__: " + __file__ + ", __name__:" + __name__
        self.saved_message = self.message

    def get(self):  # The method under test
        return self.message

if __name__ == "__main__":
    sample = Sample()
    six.print_( sample.get() )

Prepare test_my_module.py, which runs the test with unittest, under the test hierarchy. Do not put __init__.py under tests-without__init__/unit/.

tests-without__init__/unit/test_my_module.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys, os

import unittest

# Read the environment variable MY_MODULE_PATH and append it to sys.path
module_dir = os.getenv( 'MY_MODULE_PATH', default=os.getcwd() )
sys.path.append( module_dir )

from my_module import Sample

class myClass(unittest.TestCase):
    global module_dir

    def setUp(self):
        self.moduledir = os.path.join( module_dir, "my_module" )
        self.modulefilepath = os.path.join( self.moduledir, "_my_module.py" )
        self.modulename = "my_module._my_module"
        self.sample = Sample()

    def tearDown(self):
        del self.sample

    # Test of Sample.get()
    def test_get(self):
        self.assertEqual( self.sample.get(), "__file__: " + self.modulefilepath + ", __name__:" + self.modulename )

if __name__ == "__main__":
    unittest.main()

Likewise, prepare test_my_module.py under the second test hierarchy. (It is actually a symbolic link, so its contents are the same as tests-without__init__/unit/test_my_module.py.) Place an empty __init__.py under tests-with__init__/unit/.

tests-with__init__/unit/__init__.py


# nothing here

tests-with__init__/unit/test_my_module.py


Same content as tests-without__init__/unit/test_my_module.py (symbolic link)

Run the test. First, the directory that has __init__.py.

Execution result with __init__.py


$ python3 -m unittest discover tests-with__init__/ -v
test_get (unit.test_my_module.myClass) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
$

It says Ran 1 test: unittest found the test program under unit/ and ran it.

On the other hand, if __init__.py does not exist ...

Execution result without __init__.py


$ python3 -m unittest discover tests-without__init__/ -v

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK
$

Without __init__.py, unittest does not find the test program under unit/, and the test is not run.
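The same comparison can be sketched through the unittest API instead of the command line. The directory names demo_unit_with and demo_unit_without are hypothetical, and the result assumes that discovery starts from a plain directory, as in the runs above.

```python
import tempfile
import unittest
from pathlib import Path

TEST_SOURCE = (
    "import unittest\n"
    "class DemoTest(unittest.TestCase):\n"
    "    def test_ok(self):\n"
    "        self.assertTrue(True)\n"
)

def count_discovered(with_init):
    """Build <tmp>/<subdir>/test_demo.py and count the tests that discovery finds."""
    top = Path(tempfile.mkdtemp())
    # Distinct directory names so that the two runs cannot collide in sys.modules.
    sub = top / ("demo_unit_with" if with_init else "demo_unit_without")
    sub.mkdir()
    (sub / "test_demo.py").write_text(TEST_SOURCE)
    if with_init:
        (sub / "__init__.py").write_text("")
    suite = unittest.TestLoader().discover(str(top))
    return suite.countTestCases()

found_with = count_discovered(with_init=True)
found_without = count_discovered(with_init=False)
print(found_with, found_without)
```

Discovery descends only into subdirectories that contain __init__.py, so the test in the second layout is never collected.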
