Comparing computation speed across implementations with Python's mpmath (arbitrary-precision arithmetic) (Note)

Background

It has been about two years since my last post. In the previous article I implemented the same computation in several languages and compared their speeds; the motivation here is a similar comparison, this time for arbitrary-precision arithmetic. Arbitrary-precision arithmetic is mainly provided by GNU MP (GMP) and MPFR (both C/C++ libraries), and wrappers and extensions of these packages exist in several languages. In Python it is available through mpmath (I have not looked into Decimal in the standard library), while Julia and Mathematica (the Wolfram Language) support multiple-precision arithmetic out of the box. All of these appear to use GMP and MPFR under the hood, so the ultimate goal is to find out how much overhead each implementation adds and how easy each is to code (not reached this time).

Incidentally, quadruple-precision arithmetic has been supported since gcc 4.6, so quadruple-precision computation should be possible in C, C++, and Fortran, but it is not covered here. (In C, the quadruple-precision floating-point type used to require a declaration such as __float128; I am not sure about the current situation.)
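As a side note, from Python one can at least probe what extended precision NumPy offers on a given platform. A minimal sketch (note the caveat: `np.longdouble` maps to the platform's C `long double`, which is often 80-bit extended precision on x86 Linux rather than true IEEE quadruple precision):

```python
import numpy as np

# Machine epsilon shrinks as precision grows. np.longdouble is the
# platform's "long double" (often 80-bit extended on x86, sometimes
# just an alias of double on Windows) -- not guaranteed true quad.
print(np.finfo(np.float64).eps)     # ~2.22e-16 (double precision)
print(np.finfo(np.longdouble).eps)  # no larger, and smaller where extended precision exists
```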

Examples that require arbitrary precision

Consider the (one-dimensional) [baker's map](https://ja.wikipedia.org/wiki/%E3%83%91%E3%82%A4%E3%81%93%E3%81%AD%E5%A4%89%E6%8F%9B) $B(x) = \begin{cases} 2x, & 0 \le x \le 1/2 \\\\ 2x-1, & 1/2 < x \le 1 \end{cases}$ Taking $x \in [0,1]$ as the initial condition, the orbit $\{x, B(x), B^{(2)}(x) = B \circ B(x), \ldots, B^{(n)}(x)\}$ obtained by iterated composition of $B$ behaves chaotically (the dough is stretched and folded evenly by the action of $B$).

For now, let's (roughly) write code that draws the iterates up to the third step for a suitable initial condition.

import numpy as np
import matplotlib.pyplot as plt
import time

def B(x):
    return 2*x if x <= 1/2 else (2*x - 1)

fig, ax = plt.subplots(1, 1, figsize=(5,5))
ax.set_xlabel("x")
ax.set_ylabel("B(x)")

x = np.linspace(0, 1, 300)
ax.set_ylim(-0.02,1.02)
ax.set_xlim(-0.02,1.02)
ax.plot(x,2*x,"-y")
ax.plot(x,2*x-1,"-y")
ax.plot(x,x,"--k")

x = (np.sqrt(5) - 1)/2 

web = [np.array([x]),
        np.array([0])]
traj = np.array([x])

nmax = 3

bboxprop = dict(facecolor='w', alpha=0.7)
ax.text(x+0.05,0, r"$x=%f$" % x, bbox=bboxprop)
a = time.time()
for n in range(1,nmax+1):
    y = B(x)
    web[0] = np.append(web[0], x)
    web[1] = np.append(web[1], y)
    traj = np.append(traj, y)
    ax.text(x+0.02, y, r"$f^{(%d)}(x)$" % n, bbox=bboxprop)

    x = y
    if n != nmax:
        ax.text(y+0.02, y, r"$x=f^{(%d)}(x)$" % n, bbox=bboxprop)
        
        web[0] = np.append(web[0], x)
        web[1] = np.append(web[1], y)        

print(time.time() - a, "[sec.]")
ax.plot(web[0][:], web[1][:], "-o")
ax.plot(web[0][0], web[1][0], "or")
ax.plot(web[0][-1], web[1][-1], "or")
ax.grid()
print(traj)
plt.savefig("web.png")
plt.show()

Running this yields the orbit $\{x, B(x), B \circ B(x), B \circ B \circ B(x)\}$:

$ python baker.py 
0.0013127326965332031 [sec.]
[0.61803399 0.23606798 0.47213595 0.94427191]

The figure (a web diagram, or cobweb plot) should also make the meaning of iterated composition visually clear (the start and end points are circled in red).

web.png

Now, what if we set the number of iterations to 50?

nmax = 3

to

nmax = 50

and run:

$ python baker-2.py
0.0012233257293701172 [sec.]
[0.61803399 0.23606798 0.47213595 0.94427191 0.88854382 0.77708764
 0.55417528 0.10835056 0.21670112 0.43340224 0.86680448 0.73360896
 0.46721792 0.93443584 0.86887168 0.73774336 0.47548671 0.95097343
 0.90194685 0.8038937  0.60778741 0.21557482 0.43114964 0.86229928
 0.72459856 0.44919711 0.89839423 0.79678845 0.59357691 0.18715382
 0.37430763 0.74861526 0.49723053 0.99446106 0.98892212 0.97784424
 0.95568848 0.91137695 0.82275391 0.64550781 0.29101562 0.58203125
 0.1640625  0.328125   0.65625    0.3125     0.625      0.25
 0.5        1.         1.        ]

web-2.png

Since $B(1) = 1$ by definition, this numerical result suggests that a sufficiently long iterated composition $B^{(n)}(x)$ with initial condition $x = (\sqrt{5} - 1)/2$ converges to 1 (i.e. the dough does not mix evenly...).

However, this prediction is an artifact of insufficient numerical precision. The system behaves chaotically for almost every initial point, apart from special initial conditions (such as $x = 0, 1/2, 1$), and the orbit densely fills $x \in [0,1]$.
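The collapse can be seen directly in double precision: $B$ doubles $x$ (modulo 1), i.e. it shifts the binary expansion of $x$ left by one digit per step, so a 53-bit float must reach an exact dyadic rational within roughly 53 steps and then stick at the fixed point. A minimal sketch with the same $B$ as above:

```python
# The same map as in the script above, evaluated in plain double precision.
def B(x):
    return 2*x if x <= 1/2 else 2*x - 1

x = (5 ** 0.5 - 1) / 2   # same initial condition, (sqrt(5)-1)/2
for n in range(60):      # 60 steps: more than the 53-bit mantissa can sustain
    x = B(x)
print(x)                 # the double-precision orbit has collapsed onto 1.0
```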

To confirm this, let's repeat the calculation using mpmath, a Python arbitrary-precision package.

import numpy as np
import matplotlib.pyplot as plt
import mpmath
import time
mpmath.mp.dps = 100

def B(x):
    return 2*x if x <= 1/2 else (2*x - 1)

fig, ax = plt.subplots(1, 1, figsize=(5,5))
ax.set_xlabel("x")
ax.set_ylabel("B(x)")

x = np.linspace(0, 1, 300)
ax.set_ylim(-0.02,1.02)
ax.set_xlim(-0.02,1.02)
ax.plot(x,2*x,"-y")
ax.plot(x,2*x-1,"-y")
ax.plot(x,x,"--k")

x = (mpmath.sqrt(5) - 1)/2

web = [np.array([x]),
       np.array([0])]
traj = np.array([x])

nmax = 50
a = time.time()
for n in range(1,nmax+1):
    y = B(x)
    web[0] = np.append(web[0], x) 
    web[1] = np.append(web[1], y)
    traj = np.append(traj, y)    
    x = y
    if n!=nmax:
        web[0] = np.append(web[0], x)
        web[1] = np.append(web[1], y)    
print(time.time() - a, "[sec.]")

ax.plot(web[0], web[1], "-o")
ax.plot(web[0][0], web[1][0], "or")
ax.plot(web[0][-1], web[1][-1], "or")
ax.grid()

mpmath.nprint(traj.tolist())
plt.savefig("web-mp1.png")
plt.show()

The essential changes are

import mpmath
mpmath.mp.dps = 100

and the initial condition

x = (mpmath.sqrt(5) - 1)/2 

That is all. mpmath.mp.dps = 100 specifies that evaluation is carried out with 100 significant decimal digits in the mantissa (the default corresponds to double precision, i.e. 15 digits). Since print(traj) would output all 100 digits as-is, convert the mpmath traj to a list and use nprint to limit the number of displayed digits. The output is:

$ python baker_mp.py 
0.0019168853759765625 [sec.]
[0.618034, 0.236068, 0.472136, 0.944272, 0.888544, 0.777088, 0.554175, 0.108351, 0.216701, 0.433402, 0.866804, 0.733609, 0.467218, 0.934436, 0.868872, 0.737743, 0.475487, 0.950973, 0.901947, 0.803894, 0.607787, 0.215575, 0.43115, 0.862299, 0.724599, 0.449197, 0.898394, 0.796788, 0.593577, 0.187154, 0.374308, 0.748615, 0.49723, 0.994461, 0.988921, 0.977842, 0.955685, 0.911369, 0.822739, 0.645478, 0.290956, 0.581912, 0.163824, 0.327647, 0.655294, 0.310589, 0.621177, 0.242355, 0.48471, 0.96942, 0.93884]
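As a quick standalone check of what mp.dps and nprint do (a toy example, not part of the script above):

```python
import mpmath

mpmath.mp.dps = 30       # work with 30 significant decimal digits
x = mpmath.mpf(1) / 3
print(x)                 # prints roughly 30 digits of 0.333...
mpmath.nprint(x, 6)      # nprint limits the displayed digits: 0.333333
```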

web-mp1.png

Unlike the double-precision result, the orbit does not converge to 1 within 50 steps. The computation is slightly slower, but well within tolerable limits. Let's evolve for longer, with nmax = 350. Since the output grows with the number of steps, display only the last 30 entries:

mpmath.nprint(traj.tolist()[-30:])
$ python baker_mp.py 
0.013526201248168945 [sec.]
[0.0256348, 0.0512695, 0.102539, 0.205078, 0.410156, 0.820313, 0.640625, 0.28125, 0.5625, 0.125, 0.25, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

Even with 100 mantissa digits, the orbit again appears to approach 1 once the number of steps is long enough. This is to be expected: each application of $B$ effectively consumes one binary digit of $x$, so 100 decimal digits (roughly 330 bits) are used up well before step 350. What happens if the precision is raised further?

mpmath.mp.dps = 200

and rerun:

$ python baker_mp.py 
0.013622760772705078 [sec.]
[0.0256048, 0.0512095, 0.102419, 0.204838, 0.409676, 0.819353, 0.638705, 0.27741, 0.55482, 0.109641, 0.219282, 0.438564, 0.877127, 0.754255, 0.50851, 0.017019, 0.0340381, 0.0680761, 0.136152, 0.272304, 0.544609, 0.0892179, 0.178436, 0.356872, 0.713743, 0.427487, 0.854974, 0.709947, 0.419894, 0.839788]

This time the orbit no longer appears to approach 1. To obtain the long-time evolution accurately, the precision must be correspondingly high.

Comparing computation speed across several implementations

The implementation above uses np.append, which makes it extremely slow. In my environment, setting

mpmath.mp.dps = 100000
nmax = 10000

and executing gives

$ python baker_mp.py 
1.895829439163208 [sec.]
[0.170384, 0.340767, 0.681535, 0.36307, 0.72614, 0.452279, 0.904559, 0.809118, 0.618235, 0.236471, 0.472941, 0.945882, 0.891765, 0.78353, 0.56706, 0.13412, 0.26824, 0.536479, 0.0729585, 0.145917, 0.291834, 0.583668, 0.167335, 0.334671, 0.669341, 0.338683, 0.677365, 0.35473, 0.70946, 0.418921]

It takes a little under 2 seconds. append is convenient, but it is slow because every call reacquires the array (repeated malloc and copy). If we allocate the necessary arrays up front and fill them with plain assignments, it should be faster.
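The cost difference can be illustrated in isolation (a rough sketch with an arbitrary size; the timings themselves will vary by machine): np.append reallocates and copies the whole array on every call, so a loop of n appends does O(n^2) work in total, while writing into a preallocated array does O(n):

```python
import numpy as np
import time

n = 20000  # arbitrary size for illustration

a = time.time()
arr = np.array([])
for i in range(n):
    arr = np.append(arr, i)   # reallocation + full copy at every step
t_append = time.time() - a

a = time.time()
arr2 = np.zeros(n)
for i in range(n):
    arr2[i] = i               # plain assignment into preallocated storage
t_assign = time.time() - a

print(t_append, t_assign)     # append is typically far slower
```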

mpmath provides a matrix class, so let's use it. For the one-dimensional array, however, the natural counterpart is a Python list, so we also compare a list against a numpy.array. The following code shows only the changed parts.


nmax = 10000
web = mpmath.zeros(2*nmax+1,2)
web[0,0] = x
web[0,1] = 0
traj = [mpmath.mpf("0")]*(nmax+1)
traj[0] = x

a = time.time()

for n in range(1,nmax+1):
    y = B(x)
    web[2*n - 1, 0] = x
    web[2*n - 1, 1] = y
    traj[n] = y
    x = y            
    if n!=nmax:
        web[2*n, 0] = x
        web[2*n, 1] = y

print(time.time()-a, "[sec.]")
mpmath.nprint(traj[-30:])
$ python baker_mp1.py 
0.355421781539917 [sec.]
[0.170384, 0.340767, 0.681535, 0.36307, 0.72614, 0.452279, 0.904559, 0.809118, 0.618235, 0.236471, 0.472941, 0.945882, 0.891765, 0.78353, 0.56706, 0.13412, 0.26824, 0.536479, 0.0729585, 0.145917, 0.291834, 0.583668, 0.167335, 0.334671, 0.669341, 0.338683, 0.677365, 0.35473, 0.70946, 0.418921]

Next, let's replace traj with a numpy.array:

traj = np.array([mpmath.mpf("0")]*(nmax+1))

The dtype becomes object. Unless the output is first converted back to a list, all digits are printed:

mpmath.nprint(traj[-30:].tolist())
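A minimal standalone illustration of this point (a hypothetical three-element array):

```python
import numpy as np
import mpmath

mpmath.mp.dps = 50
traj = np.array([mpmath.mpf(1) / 3] * 3)
print(traj.dtype)                # object: numpy stores references to mpf instances
mpmath.nprint(traj.tolist(), 6)  # convert to a list before nprint to shorten output
```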

The execution time is almost the same:

$ python baker_mp2.py 
0.36023831367492676 [sec.]
[0.170384, 0.340767, 0.681535, 0.36307, 0.72614, 0.452279, 0.904559, 0.809118, 0.618235, 0.236471, 0.472941, 0.945882, 0.891765, 0.78353, 0.56706, 0.13412, 0.26824, 0.536479, 0.0729585, 0.145917, 0.291834, 0.583668, 0.167335, 0.334671, 0.669341, 0.338683, 0.677365, 0.35473, 0.70946, 0.418921]

Next, a slightly convoluted variant: put numpy.arrays inside a list.

web = [np.array([mpmath.mpf(0)]*2*(nmax+1)),
        np.array([mpmath.mpf(0)]*2*(nmax+1))]
traj = np.array([mpmath.mpf(0)]*(nmax+1))
traj[0] = x
        
a = time.time()

for n in range(1,nmax+1):
    y = B(x)
    web[0][2*n - 1] = x
    web[1][2*n - 1] = y
    traj[n]  = y
    x = y
    if n != nmax:
        web[0][2*n] = x
        web[1][2*n] = y    

print(time.time()-a, "[sec.]")
mpmath.nprint(traj[-30:].tolist())

This runs a little faster:

$ python baker_mp3.py 
0.2923922538757324 [sec.]
[0.170384, 0.340767, 0.681535, 0.36307, 0.72614, 0.452279, 0.904559, 0.809118, 0.618235, 0.236471, 0.472941, 0.945882, 0.891765, 0.78353, 0.56706, 0.13412, 0.26824, 0.536479, 0.0729585, 0.145917, 0.291834, 0.583668, 0.167335, 0.334671, 0.669341, 0.338683, 0.677365, 0.35473, 0.70946, 0.418921]

Next, with a plain numpy.array of dtype="object":

web = np.zeros((2*nmax+1,2),dtype="object")
web[0,0] = x
web[0,1] = 0
traj = np.zeros(nmax+1, dtype="object")
traj[0] = x

a = time.time()
for n in range(1,nmax+1):
    y = B(x)
    web[2*n - 1, 0] = x
    web[2*n - 1, 1] = y
    traj[n] = y
    x = y            
    if n!=nmax:
        web[2*n, 0] = x
        web[2*n, 1] = y


print(time.time()-a, "[sec.]")
mpmath.nprint(traj[-30:].tolist())

The timing hardly changes:

$ python baker_mp4.py 
0.2934293746948242 [sec.]
[0.170384, 0.340767, 0.681535, 0.36307, 0.72614, 0.452279, 0.904559, 0.809118, 0.618235, 0.236471, 0.472941, 0.945882, 0.891765, 0.78353, 0.56706, 0.13412, 0.26824, 0.536479, 0.0729585, 0.145917, 0.291834, 0.583668, 0.167335, 0.334671, 0.669341, 0.338683, 0.677365, 0.35473, 0.70946, 0.418921]

Finally, store the data in double precision (the iteration itself still runs in mpmath):

web = np.zeros((2*nmax+1,2),dtype=float)
web[0,0] = float(x)
web[0,1] = 0
traj = np.zeros(nmax+1, dtype=float)
traj[0] = float(x)

a = time.time()
for n in range(1,nmax+1):
    y = B(x)
    web[2*n - 1, 0] = float(x)
    web[2*n - 1, 1] = float(y)
    traj[n] = float(y)
    x = y            
    if n!=nmax:
        web[2*n, 0] = float(x)
        web[2*n, 1] = float(y)

print(time.time()-a, "[sec.]")
print(traj[-30:])

There is almost no change in speed:

$ python baker_mp5.py 
0.3071897029876709 [sec.]
[0.17038373 0.34076746 0.68153493 0.36306985 0.72613971 0.45227941
 0.90455883 0.80911766 0.61823531 0.23647062 0.47294124 0.94588249
 0.89176498 0.78352995 0.5670599  0.13411981 0.26823961 0.53647923
 0.07295846 0.14591691 0.29183383 0.58366766 0.16733532 0.33467064
 0.66934128 0.33868256 0.67736512 0.35473024 0.70946048 0.41892095]

Conclusion

I tried several implementation styles, and the conclusion is that in Python the speed barely changes as long as np.append is avoided.

This time only a single initial condition was considered, but computing statistics requires sampling over initial conditions. A comparison in that setting remains for another time; this is as far as I go today.
