[Python] Using numpy.ndarray for multiple-precision arithmetic

Preface

In Python 3, integers have arbitrary precision out of the box, and with the mpmath package, multiple-precision arithmetic on real (and complex) numbers is possible as well. I often use mpmath in combination with numpy.array, which requires some care, so I am recording the pitfalls here as a personal memorandum.

Demo of multiple-precision evaluation

>>> import math
>>> math.factorial(100)                                                                                     
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
>>> import mpmath
>>> mpmath.mp.dps = 100
>>> print(mpmath.pi)                                                                                            
3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117068

Checking the precision

You can check the precision of Python's standard float with the following command:

>>> import sys 
>>> sys.float_info                                                          
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
			   min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, 
			   dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

epsilon is the [machine epsilon](https://ja.wikipedia.org/wiki/%E8%A8%88%E7%AE%97%E6%A9%9F%E3%82%A4%E3%83%97%E3%82%B7%E3%83%AD%E3%83%B3): the gap between 1 and the smallest representable float greater than 1. Since numpy's array (ndarray) is implemented in C, its elements are constrained to C-language types; this is why a dtype is specified when creating a numpy.array.
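The defining property of machine epsilon can be checked directly; a minimal sketch:

```python
import sys

eps = sys.float_info.epsilon  # 2.220446049250313e-16 for IEEE 754 double

# 1.0 + eps is the smallest representable float strictly greater than 1.0
print(1.0 + eps > 1.0)       # True
# anything well below eps rounds back down to exactly 1.0
print(1.0 + eps / 4 == 1.0)  # True
```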

In the case of numpy, you can check the limits with np.finfo (floating point) and np.iinfo (integer): [numpy: overflow-errors](https://docs.scipy.org/doc/numpy/user/basics.types.html#overflow-errors)

>>> import numpy as np                                                          
>>> np.finfo(np.float)                                                             
finfo(resolution=1e-15, min=-1.7976931348623157e+308, max=1.7976931348623157e+308, dtype=float64)
>>> np.finfo(np.float128)                                                                                                                                                             
finfo(resolution=1e-18, min=-1.189731495357231765e+4932, max=1.189731495357231765e+4932, dtype=float128)
>>> np.iinfo(np.int)                                                                                                
iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)

Note that np.float128 is a C long double, not a true quadruple-precision float128 (4x float). The overflow behavior:

>>> np.array([np.finfo(np.float).max], dtype=np.float) * 2  
array([inf])
>>> np.array([np.iinfo(np.int).max], dtype=np.int) + 1                                                              
array([-9223372036854775808])

As shown, float overflows to inf, while int silently wraps around. Since Python floats are limited to double precision anyway, float overflow seems unavoidable. However, because Python 3 ints have arbitrary precision, it is easy to inadvertently trigger an overflow or an unintended loss of precision when handing them to numpy. If you want a numpy.array to hold integers that exceed the range of np.int, either leave dtype unspecified or specify dtype=object.
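If silent float overflow is a concern, numpy can be told to raise an exception instead of returning inf; a minimal sketch (note that np.seterr governs floating-point operations only, not the C-integer wraparound shown above):

```python
import numpy as np

np.seterr(over="raise")  # turn float overflow into FloatingPointError
try:
    np.array([np.finfo(np.float64).max]) * 2
except FloatingPointError as e:
    print("caught:", e)
```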

In other words

>>> import math
>>> np.array([math.factorial(25)], dtype=np.int)                                                                    
Traceback (most recent call last):
  File "<ipython-input-46-0451d8765c1b>", line 1, in <module>
    np.array([math.factorial(25)], dtype=np.int)
OverflowError: Python int too large to convert to C long

But

>>> np.array([math.factorial(25)])                                                                                  
array([15511210043330985984000000], dtype=object)

As shown above. If dtype is object (or left unspecified so that numpy falls back to object), multiple-precision floats such as mpmath's can be handled as well.

Using numpy.array with multiple-precision floats (mpmath)

As in the int case, leave dtype unspecified (or set dtype=object).

>>> import mpmath                                                                                                   
>>> mpmath.mp.dps = 50                                                                                             
>>> np.array([mpmath.mpf("1")])/3                                                                                   
array([mpf('0.33333333333333333333333333333333333333333333333333311')], dtype=object)

In older versions, changing the order of operations could produce errors about undefined operations, but nowadays the four arithmetic operations work the same without worrying about whether dtype is object, np.float, or np.int (as long as the array elements support those operations). Of course, if you perform an operation with an array whose dtype is np.float (or np.int), precision is lost.

>>> np.array(mpmath.linspace(0, 1, 10)) + np.linspace(0, 10, 10)                                                    
array([mpf('0.0'),
       mpf('1.2222222222222222715654677611180684632725185818142358'),
       mpf('2.4444444444444445431309355222361369265450371636284716'),
       mpf('3.6666666666666668146964032833542053898175557454427101'),
       mpf('4.8888888888888890862618710444722738530900743272569433'),
       mpf('6.1111111111111109137381289555277261469099256727430567'),
       mpf('7.3333333333333336293928065667084107796351114908854202'),
       mpf('8.555555555555556345047484177889095412360297309027773'),
       mpf('9.7777777777777781725237420889445477061801486545138865'),
       mpf('11.0')], dtype=object)
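A related caveat (my addition, not in the original): precision is also lost at construction time if an mpf is initialized from a Python float, because the float has already been rounded to 53 bits. Passing a string avoids this:

```python
import mpmath

mpmath.mp.dps = 50

# 0.1 as a Python float already carries binary rounding error,
# so the mpf inherits it and prints a value slightly above 0.1
print(mpmath.mpf(0.1))
# a string is parsed directly at the current working precision
print(mpmath.mpf("0.1"))  # 0.1
```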

Comparison of calculation speed

With numpy.array, dtype=np.int and dtype=np.float give the best computation speed. Let's check how much slower dtype=object is.

import numpy as np
import mpmath
import time

def addsub(x, imax):
    x0 = np.copy(x)
    time0 = time.time()
    for n in range(imax):
        xx = x + x
        x = xx - x
    print(time.time()-time0, "[sec]", sum(x-x0), x.dtype)

def powpow(x, imax):
    x0 = np.copy(x)
    time0 = time.time()
    for n in range(imax):
        xx = x**(3)
        x = xx**(1/3)
    print(time.time()-time0, "[sec]", float(sum(x-x0)), x.dtype)

imax = 10000
sample = 10000

x = np.arange(0, sample, 1, dtype=np.int)
addsub(x, imax)

x = np.arange(0, sample, 1, dtype=object)
addsub(x, imax)


x = np.arange(0, sample, 1, dtype=np.float)
powpow(x, imax)

x = np.arange(0, sample, 1, dtype=object)
powpow(x, imax)

The execution results:

0.06765484809875488 [sec] 0 int64
3.387320041656494 [sec] 0 object
8.931994915008545 [sec] -0.00024138918300131706 float64
16.408015489578247 [sec] -0.00024138918300131706 object

When the array elements are ints, specifying object is overwhelmingly slower, while for floats the factor of roughly two is within tolerable range. Now consider mpmath. Since mpmath's default precision is 15 significant digits, let's compare the speed at 15 digits and at 100 digits. With the settings above this would take several hours, so imax is reduced by two orders of magnitude; multiplying the measured times by 100 makes them roughly comparable to the results above.

imax = 100

mpmath.mp.dps=100
x = np.array(mpmath.arange(0, sample, 1), dtype=object)
powpow(x, imax)

The execution result:

15.098075151443481 [sec] -2.4139151442170714e-06 object

Even though the workload is reduced to 1/100, the mpmath run takes time comparable to the float64 run above, so mpmath is roughly 100 times slower.

Incidentally, powpow as defined above does not actually benefit from the extra precision: in $x^{1/3}$ the exponent is evaluated as a Python float, so digits beyond about the 15th are meaningless. So I modified powpow slightly:

def papawpowpow(x, imax):
    x0 = np.copy(x)
    time0 = time.time()
    for n in range(imax):
        xx = x**(3)
        x = xx**(mpmath.mpf("1/3"))
    print(time.time()-time0, "[sec]", float(sum(x-x0)), x.dtype)

mpmath.mp.dps=100
x = np.array(mpmath.arange(0, sample, 1), dtype=object)
papawpowpow(x, imax)

When executed, the result is:

21.777454376220703 [sec] 1.5530282978129947e-91 object
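The effect of the exponent's precision can be seen in isolation; a minimal check (my example):

```python
import mpmath

mpmath.mp.dps = 50

third_float = 1 / 3            # exponent rounded to a 53-bit double
third_mpf = mpmath.mpf(1) / 3  # exponent held at 50 digits

a = mpmath.mpf(8) ** third_float  # correct only to ~15 digits
b = mpmath.mpf(8) ** third_mpf    # agrees with 2 to the working precision
print(abs(a - 2), abs(b - 2))
```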

By the way, if you evaluate element-wise without using an array:

def elewisepapawpowpow(x, imax):
    x0 = x[:]
    time0 = time.time()        
    for n in range(imax):
        xx = [x1**(3) for x1 in x]
        x =  [x1**(mpmath.mpf("1/3")) for x1 in xx]
    diff = [x[i] - x0[i] for i in range(len(x0))]
    print(time.time()-time0, "[sec]", float(sum(diff)), type(x))            

The result:

28.9152569770813 [sec] 1.5530282978129947e-91 <class 'list'>

Using numpy.array is a little under 30% faster than the element-wise list version, and the numpy.array code is cleaner and more familiar to write, so numpy.array is the way to go.
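Another middle ground (not used in the original): np.frompyfunc wraps a Python-level function into a ufunc that broadcasts over object arrays, which keeps the array-style code while still calling mpmath element-wise under the hood:

```python
import numpy as np
import mpmath

mpmath.mp.dps = 50

# wrap mpmath.sqrt as a ufunc: 1 input, 1 output, result dtype is object
mp_sqrt = np.frompyfunc(mpmath.sqrt, 1, 1)

x = np.array(mpmath.arange(1, 5, 1), dtype=object)
print(mp_sqrt(x))  # array of 50-digit mpf values, dtype=object
```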

Execution environment

$ lscpu 
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               17
Model name:          AMD Ryzen 3 2200G with Radeon Vega Graphics
Stepping:            0
CPU MHz:             1479.915
CPU max MHz:         3500.0000
CPU min MHz:         1600.0000
BogoMIPS:            7000.24
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           64K
L2 cache:            512K
L3 cache:            4096K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
$ cat /proc/cpuinfo | grep MHz
cpu MHz		: 1624.805
cpu MHz		: 1500.378
cpu MHz		: 3700.129
cpu MHz		: 2006.406
