[Python] Speeding up numerical calculation using NumPy / SciPy: Gleanings

Target

This article is written for beginners with Python and NumPy. It may be especially useful for people doing numerical physics who fall into the category of "I can use C, but I only recently started Python" or "I don't know how to write Pythonic code". Maybe.

Also, there may be incorrect statements due to my own lack of study. Please forgive me.

Summary

Speeding up numerical calculation using NumPy: Basics
Speeding up numerical calculation using NumPy / SciPy: Application 1
Speeding up numerical calculation using NumPy / SciPy: Application 2

This article is a bonus installment, mainly a collection of small tips.

numpy and math functions

  1. numpy's universal functions take an ndarray as an argument and return an ndarray. The math functions can only take an `int` or `float`.

  2. The math functions cannot take complex numbers. If you want to pass a complex value, use cmath. numpy functions can also take complex.

  3. A numpy function issues a RuntimeWarning when the argument is outside the domain or otherwise terrible, but it does not raise an Exception. The math functions raise an exception:

import numpy as np
import math as m

m.log(0) # ValueError: math domain error
np.log(0) # RuntimeWarning: divide by zero encountered in log
>>> -inf

np.log(-1) # RuntimeWarning: invalid value encountered in log
>>> nan

It is nice that np.log(0) returns -inf.
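Incidentally, if you do want numpy to raise an exception like math does, you can promote the warning to an error with np.seterr. A minimal sketch:

import numpy as np

np.seterr(divide='raise', invalid='raise') # promote these warnings to exceptions (changes global state)
np.log(0.0) # now raises FloatingPointError: divide by zero encountered in log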

As you can see, numpy's universal functions are versatile, but the math functions tend to be faster on scalars. Unless you need to pass an ndarray, you should use the math functions.
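A rough micro-benchmark sketch of that scalar overhead (the exact numbers depend on your machine and NumPy build, but math.sqrt is typically several times faster than np.sqrt on a single float):

import timeit

# one million calls each, on a plain Python float
t_math = timeit.timeit("m.sqrt(2.0)", setup="import math as m", number=1000000)
t_numpy = timeit.timeit("np.sqrt(2.0)", setup="import numpy as np", number=1000000)
print(t_math, t_numpy) # math.sqrt should come out well ahead for scalar input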

Access to ndarray by list

As mentioned earlier, you can index an ndarray with a list (so-called fancy indexing):

a = [1, 3, 5]
b = [7, 5, 4, 6, 2, 3, 1, 9, 8]
b[a] # TypeError: list indices must be integers or slices, not list

b_np = np.array(b)
b_np[a]
>>> array([5, 6, 3])

There are situations where this is useful to know.
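For example, fancy indexing also works on the left-hand side of an assignment, and it combines naturally with boolean masks. A small sketch, reusing the array from above:

import numpy as np

b_np = np.array([7, 5, 4, 6, 2, 3, 1, 9, 8])

b_np[[0, 2, 4]] = 0 # assignment through a list of indices
print(b_np) # [0 5 0 6 0 3 1 9 8]

mask = b_np > 4 # a boolean mask is another kind of "index list"
print(b_np[mask]) # [5 6 9 8]
print(np.where(mask)[0]) # [1 3 7 8] -- the integer indices behind the mask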

Toward a better BLAS

Since numpy's basic matrix operations depend on BLAS (Basic Linear Algebra Subprograms), the speed varies greatly depending on which BLAS is used. For example, the "reference BLAS" installed as standard on Ubuntu systems is quite a slow implementation, and you have to use a proper BLAS to make numpy itself faster.

"Open BLAS [^ 1]" is one of the first candidates. For details on the introduction method and effects, see here. I was indebted to you.

In fact, however, MKL (Math Kernel Library)[^3] comes standard with Anaconda[^2], and numpy is very fast as long as you use it on top of that. Those who use NumPy / SciPy should use Anaconda; there no longer seems to be any reason not to.

However, if you install Anaconda carelessly, the system Python may be replaced by Anaconda's version, which can cause problems around the system. It is therefore better to install it through pyenv or the like. "About building a Python environment with Anaconda" covers this in detail.
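To check which BLAS / LAPACK your NumPy build is actually linked against, and to get a rough feel for the speed, something like the following works (the exact output format depends on the NumPy version, and the timing is only a ballpark figure):

import time
import numpy as np

np.show_config() # prints the BLAS / LAPACK libraries this build links against

a = np.random.rand(2000, 2000)
t0 = time.time()
a @ a # a large matrix product exercises the underlying BLAS (dgemm)
print(time.time() - t0, "seconds") # noticeably slower with reference BLAS than with OpenBLAS / MKL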

Impressions of some Python acceleration frameworks

There are a number of frameworks for accelerating Python, and I've tried using some of them.

Cython

It could be described as a new language that makes Python's C extensions easier to write. The basic grammar is Python, but the code is compiled down to C. To get a proper speedup, you have to declare the types of variables and the sizes of arrays. If you write it properly, the execution speed really is C-level.

When the bottleneck of the code is obvious, it is probably best to pull that part out into a function and rewrite it in Cython. However, once variables are typed and array sizes are fixed, it is no longer really Python. While writing it I kept thinking, "If I have to write it like this, I might as well have written it in C from the beginning...". I don't think it ends up as Pythonic code.
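For illustration only, a minimal Cython sketch of what "typing the variables" means (hypothetical file name fast_sum.pyx; it would be compiled with cythonize rather than run directly):

# fast_sum.pyx -- hypothetical example; compile with cythonize before importing
def sum_of_squares(double[:] x):
    # the typed memoryview argument and cdef locals let the loop run at C speed
    cdef Py_ssize_t i
    cdef Py_ssize_t n = x.shape[0]
    cdef double total = 0.0
    for i in range(n):
        total += x[i] * x[i]
    return total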

Boost/Python

A Boost library that lets you use C++ classes and functions from Python as they are. With Cython, even the part you want to speed up can be written in Python-like code, while here that part is written in plain C++; the troublesome binding work, though, can be left to Boost.

However, the joint between the two languages still has a painful type problem: it seems you cannot pass an ndarray to the C++ side as-is. There are frameworks such as Boost/NumPy and PyUblas that can pass an ndarray, but I could not get them to build with Python 3.

Numba

A module that speeds up Python with a JIT compiler. It works with a single decorator, and unlike the two above it does not straddle two languages. I feel it has a lot of promise, but the know-how around it does not seem mature yet. Sometimes it gets faster, sometimes it doesn't...
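A minimal sketch of the "single decorator" usage, assuming Numba is installed (the function is just a made-up loop-heavy example):

import numpy as np
from numba import njit

@njit # compiled to machine code by the JIT on the first call
def sum_of_abs_diffs(x):
    # a naive double loop: slow in pure Python, fast once JIT-compiled
    n = x.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += abs(x[i] - x[j])
    return total

x = np.random.rand(1000)
print(sum_of_abs_diffs(x))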

Conclusion

It turns out that NumPy and SciPy are sufficient for scientific calculations.

In closing

I may add to this article whenever something new comes to mind. Thank you for reading.

[^1]: A fork of "GotoBLAS", which was written by a certain Mr. Goto. Mr. Goto now seems to be working on MKL, and development of GotoBLAS appears to have stopped.

[^2]: A convenient all-in-one distribution of scientific and technical computing modules.

[^3]: A library of various math routines developed by Intel. It contains very fast implementations of BLAS, LAPACK, and so on.
