[PYTHON] Speeding up numerical calculations with NumPy: Basics

Target

It is written for beginners of Python and NumPy. It may be especially useful for those who aim at numerical calculation of physics that corresponds to "I can use C language but recently started Python" or "I do not understand how to write like Python". Maybe.

Also, there may be incorrect statements due to my lack of study. Please forgive me.

Summary

Python has a very rich library of numerical calculations and is easy to use. Many of them are heritage wrappers written in Fortran, but they are much easier than calling them from C / C ++ etc. For example, if you try to calculate a eigenvalue problem by calling LAPACK [^ 1] from C / C ++

info = LAPACKE_zheevr(LAPACK_COL_MAJOR, 'V', 'A', 'L', H, (lapack_complex_double*)a, lda, vl, vu, il, iu, abstol, &mm, w, (lapack_complex_double*)z, ldz, isuppz);

There are too many arguments and it is complicated. Although the more arguments increase the degree of freedom of calculation, it does not seem to be very for beginners. Also, C / C ++ does not natively support complex. Since, types such as lapack_complex_double are defined inside the wrapper, which can be confusing. On the other hand, it is very concise in Python:

w, v = numpy.linalg.eig(H)

Arguments like those found in LAPACK are also provided properly [^ 2], but the default values are set, so if you just want to solve it without thinking about the details, you can just do this. It's simple and good. By the way, various solvers seem to have higher performance in SciPy than in NumPy.

This makes me want to leave various numerical calculations to Python, but this child also has a drawback. ** For loops are deadly slow. ** In so-called matrix operations, triple loops frequently occur. However, numerical calculation is basically "finely differentiated and looped", so it is quite fatal [^ 3]. Although it may not be a compile language, it seems to be slower than other LL languages. For matrix operations

for i in range(N):
	for j in range(N):
		for k in range(N):
			c[i][j] = a[i][k] * b[k][j]

I wrote it and it's over.

NumPy built-in functions and operations

So what to do is take advantage of NumPy's universal function. ** NumPy's built-in function works on each element of the NumPy array ndarray and returns a new ndarray. . ** This is very important and has the potential to drive out for loops in tensor operations. For example, math.sin (cmath.sin) can only pass scalars, but numpy.sin

>>> theta = numpy.array([numpy.pi/6, numpy.pi/4, numpy.pi])
>>> numpy.sin(theta) 
array([  5.00000000e-01,   7.07106781e-01,   1.22464680e-16])

And throwing a NumPy array will cause numpy.sin to work on each element. And often this kind of processing is as fast as C / C ++. The main implementation of NumPy is C. And Fortran, and BLAS [^ 4] / LAPACK is doing its best in linear operations.

Also, the four arithmetic operations defined by ndarray are similarly fast and diverse.

Part 1: Vector

Let's write a linear operation related to a vector without a for loop. Since it is expected that functions such as inner product and outer product are properly prepared, how are the sum, product, quotient, etc. for ndarray mainly described below? Let's see if it is defined. ** The operation on the ndarray is completely different from the operation on the Python built-in list ** After that, alias is set like ʻimport numpy as np`.

Vector and scalar

A one-dimensional ndarray can be considered a kind of vector. Scalar (complex, float, int) works on all elements of ndarray:

>>> a = np.array([1, 2, 3])
>>> a + 2
array([3, 4, 5])
>>> a * (3 + 4j)
array([ 3. +4.j,  6. +8.j,  9.+12.j])

Vectors

ndarray works with elements with matching indexes:

>>> a = np.array([1, 2, 3])
>>> b = -np.array([1, 2, 3])
>>> a + b
array([0, 0, 0])
>>> a * b
array([-1, -4, -9])
>>> a / b
array([-1., -1., -1.])

It is interesting that it is neither an inner product nor an outer product. It is possible to define division. It is often used when performing a differential operation on a certain differentiated function. The sum / product of ndarrays with different dimensions is not defined.

Reshaped ndarrays

ndarray has a method called reshape, for example, you can rearrange a 3x1 vector into a 1x3. It's like a horizontal vector to a vertical vector, but a product (*). There is no simple correspondence because the definition of is different from that of vector space. The product with this reshaped vector is very interesting:

>>> c = a.reshape(3, 1)
>>> c
array([[1],
       [2],
       [3]])
>>> c + b
array([[ 0, -1, -2],
       [ 1,  0, -1],
       [ 2,  1,  0]])
>>> c * b
array([[-1, -2, -3],
       [-2, -4, -6],
       [-3, -6, -9]])

While having rules like (3 × 1) + or * (1 × 3) = (3 × 3), but like the previous operation, ** this operation is commutative **. It seems strange at first glance. It may be, but if you take a look, you can see that it can be explained using the rules in the previous section. With this, there is a possibility that even the initialization of the matrix does not need to use the for loop. Masu **. It is very useful for tasks such as initializing a huge matrix every time the parameters are changed and proceeding with the calculation.

I personally like the very flexible specifications that are not tied to these types [^ 5]. I think it's more intuitive than other languages.

Vector initialization

It's not an operation, but there's no point in using a for loop to prepare a vector in the first place. There are many functions that create ndarray, but here are some of the ones I use most often:

>>> L = 1
>>> N = 10
>>> np.linspace(-L/2, L/2, N)
array([-0.5       , -0.38888889, -0.27777778, -0.16666667, -0.05555556,
        0.05555556,  0.16666667,  0.27777778,  0.38888889,  0.5       ])

>>> dL = 0.2
>>> np.arange(-L/2, L/2, dL)
array([-0.5, -0.3, -0.1,  0.1,  0.3])

>>> np.logspace(0, 1 ,10, base=np.e)
array([ 1.        ,  1.11751907,  1.24884887,  1.39561243,  1.5596235 ,
        1.742909  ,  1.94773404,  2.17662993,  2.43242545,  2.71828183])

If you combine operations with scalars and vectors, it seems quite so even without a for loop.

Part 2: Matrix

The flow of the story is almost the same as that of the vector. Since the matrix product has a dedicated function, the following is also the story of the operation. I think that the operation result can be predicted.

Matrix and scalar

>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> a + (2 - 3j)
array([[  3.-3.j,   4.-3.j,   5.-3.j],
       [  6.-3.j,   7.-3.j,   8.-3.j],
       [  9.-3.j,  10.-3.j,  11.-3.j]])
>>> a * 0.2
array([[ 0.2,  0.4,  0.6],
       [ 0.8,  1. ,  1.2],
       [ 1.4,  1.6,  1.8]])

Matrix and vector

>>> b = np.array([1, 2, 3])
>>> a + b
array([[ 2,  4,  6],
       [ 5,  7,  9],
       [ 8, 10, 12]])
>>> a * b
array([[ 1,  4,  9],
       [ 4, 10, 18],
       [ 7, 16, 27]])
>>> a / b
array([[ 1. ,  1. ,  1. ],
       [ 4. ,  2.5,  2. ],
       [ 7. ,  4. ,  3. ]])

Matrix and matrix

>>> b = np.array([[-1, 2, -3], [4, -5, 6], [-7, 8, -9]])
>>> a + b
array([[ 0,  4,  0],
       [ 8,  0, 12],
       [ 0, 16,  0]])
>>> a * b
array([[ -1,   4,  -9],
       [ 16, -25,  36],
       [-49,  64, -81]])

Matrix initialization

There are also useful functions for matrices. Below are just a few:

>>> np.identity(4)
array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])

>>> np.eye(4, 3)
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.],
       [ 0.,  0.,  0.]])

>>> c = np.array([1, 2, 3, 4])
>>> np.diag(c)
array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

in conclusion

I think other people have written better articles about NumPy's useful functions and detailed arithmetic rules. If you look at this article and write it like NumPy, you can do various calculations without using for loops. I hope you can feel that it is possible. If you use these, the numerical calculation algorithms that science college students learn in programming classes will run at a speed comparable to C / C ++. Many hand algorithms are already prepared in SciPy etc.

It's been a little long, so I'd like to summarize the specific application to numerical calculation in another article. Thank you.

** Addendum (2016/11/25): Continued below ** Speeding up numerical calculation using NumPy / SciPy: Application 1 Speeding up numerical calculation using NumPy / SciPy: Application 2

[^ 1]: LAPACK (Linear Algebra PACKage) is a library of linear algebra written in Fortran 90. Some wrappers for C / C ++ are called Lapacke. Because it has an MKL compatible interface, The MKL manual will be helpful. [^ 2]: For more information, see NumPy and [SciPy](https://docs.scipy.org/doc/scipy-0.18. Please refer to the 1 / reference /) manual. [^ 3]: Besides that, the overhead of function call is also big. In the case of Procon, it often happens that "It passes when written by dynamic programming, but TLE in memoized recursion". [^ 4]: A library that contains a more basic set of linear operations than LAPACK. Inside LAPACK we call BLAS. [^ 5]: Some people dislike "no type". Unlike programs designed to meet "certain specifications", numerical calculations in physics make it so easy to check if the output is correct. And it's even harder to debug if "variable type changes implicitly" is allowed during program execution. Since the observable amount is fixed to float, static typing creates bugs. I understand the claim that it's difficult, but I like Python.

Recommended Posts

Speeding up numerical calculations with NumPy: Basics
Speeding up numerical calculation using NumPy / SciPy: Application 2
Speeding up numerical calculation using NumPy / SciPy: Application 1
Speeding up numerical calculation using NumPy / SciPy: Picking up fallen ears
See the power of speeding up with NumPy and SciPy
NumPy basics
#Python basics (#Numpy 1/2)
#Python basics (#Numpy 2/2)
Python #Numpy basics
Findings when accelerating numerical calculations with Python and Numba
A note on speeding up Python code with Numba
Reading, displaying and speeding up gifs with python [OpenCV]
Python basics 8 numpy test
Moving average with numpy
Getting Started with Numpy
Learn with Cheminformatics NumPy
Matrix concatenation with Numpy
Hamming code with numpy
Regression analysis with NumPy
Extend NumPy with Rust
Numerical calculation with Python