I did a lot of research on how Python is executed

Until recently I had only used Python as a step up from shell scripts, but lately I've had more and more chances to use it for serious work. As a low-level enthusiast, I started wondering how Python code actually gets executed, so I dug into it.

Python implementation

This is a story about how a language specification and its implementation are two different things.

There is only one Python language, but there is more than one way to implement it. CPython and PyPy are names of implementations of the Python language. It is like C, where you can choose GCC or Clang as the compiler. For a list of Python implementations, see the Python article on Wikipedia. There are quite a lot.

Among these implementations, CPython is the reference implementation. It was written by Python's original author, and most Python environments in the world run on it, so it really is the original.
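To check which implementation a given interpreter actually is, the standard library can report it. The following is a minimal sketch; the helper name `describe_runtime` is made up for this example.

```python
import platform
import sys


def describe_runtime():
    """Return a few facts about the interpreter running this code."""
    return {
        # e.g. "CPython" or "PyPy"
        "implementation": platform.python_implementation(),
        # version of the Python language this interpreter implements
        "language_version": platform.python_version(),
        # tag used in cached bytecode file names (may be None)
        "cache_tag": sys.implementation.cache_tag,
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Running the same script under CPython and under PyPy should report a different implementation name while the language itself stays the same.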

As the names suggest, CPython is implemented in C and PyPy is implemented in Python. And yet PyPy is usually faster than CPython. Hmm? C is supposed to be faster than Python, so how can PyPy be faster than CPython? I'd like to explain that a little.

How CPython works

According to the Wikipedia article on CPython:

CPython is a bytecode interpreter.

A bytecode interpreter. What is that?

Bytecode is an intermediate representation. In other words, in CPython, Python source code is first converted to bytecode, and that bytecode is then executed by a virtual machine (VM). It is called a bytecode interpreter because it interprets and executes the bytecode one instruction at a time.
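The bytecode can be inspected with the standard `dis` module. This is a small sketch; the functions `add` and `opnames` are just examples for illustration, and the exact instruction names differ between Python versions.

```python
import dis


def add(a, b):
    return a + b


def opnames(func):
    """List the names of the bytecode instructions compiled for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Human-readable listing of the instructions the VM will execute.
    dis.dis(add)
    # The raw bytecode is just a bytes object attached to the function.
    print(add.__code__.co_code)
```

The listing shows a handful of stack-machine instructions (load the two arguments, perform a binary operation, return the value) rather than the original source text.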

The reason for doing this is apparently speed, but I couldn't find a convincing explanation of why going through bytecode makes total execution time decisively faster for an interpreted language. At the very least, though, it keeps the interpreter implementation cleaner, and if the bytecode is kept as a cache, most of the work such as parsing can be skipped from the second run onward, so it clearly makes sense. When you run Python code that imports modules, .pyc files and __pycache__ directories are created, and these appear to be where the bytecode is stored. It also seems possible to take just this bytecode to another environment and execute it there.
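Here is a rough sketch of that idea: compile a source file to a .pyc, delete the source, and run only the cached bytecode. It assumes CPython 3.7 or later, where a .pyc starts with a 16-byte header, and the file name `hello_mod.py` and the helper `run_from_pyc` are invented for this example. Note that a .pyc is tied to the interpreter version that wrote it, so "another environment" needs a matching Python.

```python
import marshal
import py_compile
import tempfile
from pathlib import Path


def run_from_pyc(source):
    """Compile source to a .pyc, drop the .py, and execute the bytecode."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "hello_mod.py"  # hypothetical module name
        src.write_text(source)
        # By default this writes into a __pycache__ directory next to the source.
        pyc_path = Path(py_compile.compile(str(src), doraise=True))
        src.unlink()  # only the bytecode file is left
        data = pyc_path.read_bytes()
    # Skip the 16-byte header (assumption: CPython 3.7+ layout),
    # then rebuild the code object from the marshalled bytes.
    code = marshal.loads(data[16:])
    namespace = {}
    exec(code, namespace)
    return pyc_path.name, namespace


if __name__ == "__main__":
    name, ns = run_from_pyc("answer = 6 * 7\n")
    print(name, ns["answer"])
```

In everyday use you never need to do this by hand; the import system reads and writes these files automatically.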

Wait a moment. There is a famous language that works the same way: Java. Any introduction to Java begins by explaining that Java code is converted to bytecode and the Java VM executes it. In both Python and Java, the source code is converted (compiled) into bytecode and then executed by a VM (interpreter). Java is regarded as a compiled language and Python as an interpreted language, but in reality both compile; the only difference is whether the compilation is explicit or implicit.

Why PyPy is so fast

Why is PyPy, which is written in Python, faster than CPython? Because it uses JIT (just-in-time) compilation.

What is JIT compilation? Roughly speaking, the code is compiled into machine code at run time and then executed, which makes it faster. For example, consider a loop, or a function that is called many times. A simple interpreter interprets the syntax every time that code is reached and then carries out the corresponding processing. Since the interpreter itself is ultimately machine code, this amounts to performing a code-to-machine-code conversion every time before executing. If code that is called repeatedly is instead converted into machine code once, and that machine code is executed directly on later calls, the conversion time is saved. In addition, an interpreter that converts code line by line cannot optimize based on the overall flow of the program, but if the code is read and converted in larger chunks, some optimization may become possible.
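The "convert once, then reuse" idea can be imitated in a few lines. This is only a toy analogy, not how PyPy or any real JIT works: the "compiled form" here is a Python code object rather than machine code, and the names `interpret`, `ToyJit`, and the call threshold are all invented for this sketch.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def interpret(node, env):
    """Slow path: walk the syntax tree again on every call."""
    if isinstance(node, ast.Expression):
        return interpret(node.body, env)
    if isinstance(node, ast.BinOp):
        left = interpret(node.left, env)
        right = interpret(node.right, env)
        return _OPS[type(node.op)](left, right)
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    raise NotImplementedError(type(node).__name__)


class ToyJit:
    """Interpret an expression until it gets 'hot', then compile it once."""

    def __init__(self, source, threshold=3):
        self.tree = ast.parse(source, mode="eval")
        self.threshold = threshold
        self.calls = 0
        self.compiled = None  # filled in once the expression is hot

    def run(self, **env):
        self.calls += 1
        if self.compiled is None and self.calls > self.threshold:
            # One-time translation; later calls reuse the result.
            self.compiled = compile(self.tree, "<toy-jit>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {"__builtins__": {}}, env)
        return interpret(self.tree, env)


if __name__ == "__main__":
    jit = ToyJit("x * 2 + y")
    print([jit.run(x=i, y=1) for i in range(5)])
```

The first three calls take the tree-walking path; from the fourth call on, the cached translation is executed directly, which is the part of the story that a real JIT does with machine code.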

However, there are many techniques for speeding things up with JIT compilation, and I honestly don't understand which one is the key. On top of that, PyPy apparently takes an unusual approach to JIT compilation, and its internals look hard to understand.

By the way, the PyPy download page states that the JIT compiler only works on x86 (Intel-compatible) CPUs:

These binaries include a Just-in-Time compiler. They only work on x86 CPUs that have the SSE2 instruction set (most of them do, nowadays), or on x86-64 CPUs. They also contain stackless extensions, like greenlets.

This is a guess, since I haven't checked the source, but JIT compilation means the language runtime contains code that generates machine code for a specific CPU architecture. Supporting the many CPU architectures that exist would be a daunting task, so it may have been implemented only for x86, which has the most users.

When you use PyPy, PyPy reads and executes your Python code. So what runs PyPy itself, given that it is written in Python? Apparently PyPy's Python source is translated to C and compiled, and that binary is what actually runs.

Numba JIT compilation support

So PyPy has a built-in JIT compiler and is fast. That reminded me that Python also has a library called Numba that does JIT compilation. According to the Numba guide, it supports a fair number of CPU architectures.

Architecture: x86, x86_64, ppc64le. Experimental on armv7l, armv8l (aarch64).

Is the Numba team implementing support for each architecture by hand?

After a little research, it turns out Numba uses LLVM. With LLVM, once the Python code has been converted to LLVM IR (LLVM's intermediate representation), LLVM takes care of each CPU architecture, so Numba does not need to support them itself.
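For reference, using Numba looks roughly like this. It is a minimal sketch assuming Numba is installed (it is a third-party package); the function `sum_of_squares` is just an example, and the `except` branch is a stand-in added here so the script still runs, uncompiled, without Numba.

```python
try:
    from numba import njit  # third-party: pip install numba
except ImportError:
    def njit(func):
        """Fallback for this sketch only: no JIT, return the function as-is."""
        return func


@njit
def sum_of_squares(n):
    # A plain numeric loop, the kind of code Numba is meant to compile.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # With Numba, the first call triggers compilation for the argument types;
    # later calls with the same types reuse the compiled machine code.
    print(sum_of_squares(10))
```

The decorator is the only change to the source; the architecture-specific work happens behind it.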

Summary

I looked into how Python code is executed, following my own curiosity. My impression is that there is almost no boundary between interpreted and compiled languages. Interpreted languages compile for speed, and some compiled languages behave like interpreters for convenience. I thought I understood JIT compilation, but it turns out I didn't know the details at all.

Reference

How is the Python implementation implemented and how does it work? Is Python interpreted one by one or compiled?
