Introduction

Memoization was considered using previous Fibonacci series as an example. This time, we consider the implementation of Memoization and the execution speed by language using the Hermite polynomial as an example. The languages handled are python, wolfram language (Mathematica) and julia (in preparation).

First, I would like to point out that the evaluation of super-basic orthogonal polynomial systems in computer science such as Hermite polynomials is provided as a package without being implemented by yourself, so it is overwhelmingly better to use it.

In python

In Wolfram Language

HermiteH

Actually, I really don't understand in julia (2020). I haven't actually used it, but as far as I checked

It seems that a package related to orthogonal polynomials can be used. For special functions

SpecialFunctions

Seems to be available. I don't understand if julia has become the de facto standard of computer science in python like numpy and scipy, but at least I look at Github and it's quite active so I'm looking forward to the future.

Motivation

Previous Fibonacci series, Fibonacci series caches (memoization) the terms evaluated once, and improves the calculation speed of recursive series by eliminating unnecessary duplicate evaluation. did. Similarly, consider the Memoization of a recursive definition function using the Hermite polynomial as an example.

As you can see in https://mathworld.wolfram.com/HermitePolynomial.html, Hermite polynomials can be defined in a variety of ways, but here Let us evaluate the following definition, which is relatively easy to implement as a recursive function.

H_0(x)= 1, \\\\ H_1(x)= 2x, \\\\ H_n(x) = 2xH_{n-1}(x) -2(n-1)H_{n-2}(x) \quad (n >1)\\\\

Evaluating symbolically up to n = 10 in wolfram language

H[0, x_] = 1;
H[1, x_] = 2x;
H[n_, x_] := H[n, x] = 2x*H[n-1, x] - 2(n-1)H[n-2, x];

res = Table["H_"<>ToString[n]<>"(x)="<>ToString@TeXForm@Expand@H[n, x], {n, 0, 10}]//MatrixForm;
Print@res

\begin{array}{l} H_0(x)=1 \\\\ H_1(x)=2 x \\\\ H_2(x)=4 x^2-2 \\\\ H_3(x)=8 x^3-12 x \\\\ H_4(x)=16 x^4-48 x^2 + 12\\\\ H_5(x)=32 x^5-160 x^3+120 x \\\\ H_6(x)=64 x^6-480 x^4+720 x^2-120\\\\ H_7(x)=128 x^7-1344 x^5+3360 x^3-1680 x \\\\ H_8(x)=256 x^8-3584 x^6+13440 x^4-13440 x^2+1680\\\\ H_9(x)=512 x^9-9216 x^7+48384 x^5-80640 x^3+30240 x\\\\ H_{10}(x)=1024 x^{10}-23040 x^8+161280 x^6-403200 x^4+302400 x^2-30240\\\\ \end{array}

As you can see immediately, the Hermite polynomial of degree n depends on $ x $ for $ H_n (x) , so Memoization such as [Last Fibonacci series](https://qiita.com/hanada/items/3a02ea3d311ef7005640) Then there is no point. Also, the value at the even-numbered origin is|H_n(0)|\sim 2(n-1)!!!!$ (Factorial with 4 skips, surprised 4 times!!!!)Because you can see that it is increasing in\exp(-x^2)When trying to evaluate the orthogonal relationship numerically by multiplying by(n-1)!!!!Is n!Although slower than(Probably)Because it increases faster than the exponential functionnThe increase is a calculation of a value larger than the exponential function and a value smaller than the exponential function, and the accuracy is lost.(I think).. Therefore, the motivation is to evaluate the Hermite polynomial with multiple length evaluation.

Simple implementation with python

Evaluating a simple Hermite polynomial as a reference with python (numpy, mpmath) would be as follows.

import numpy as np
import time

def H(n,x):
    if n==0:
        return np.array([1]*len(x))
    elif n==1:
        return 2*x
    else:
        return 2*x*H(n-1, x) - 2*(n-1)* H(n-2,x)


time0 = time.time()
x = np.linspace(-1, 1, 100)
n=30
y = H(n,x)
y0 = y[len(x)//2]
print("numpy:", time.time()-time0, "sec", "H_%d(0)=%.18e" % (n, y0))

n=20
time0 = time.time()
x = np.linspace(-1, 1, 100)
y = H(n,x)
y0 =y[len(x)//2]
print("numpy:", time.time()-time0, "sec", "H_%d(0)=%.18e" % (n, y0))

import mpmath
mpmath.mp.dps = 100
time0 = time.time()
x = np.array(mpmath.linspace(-1, 1, 100), dtype="object")
y = H(n, x)
y0 = float(y[len(x)//2])
print("mpmath:", time.time()-time0, "sec", "H_%d(0)=%.18e" % (n,y0))

$ python hermite-poly.py 
numpy: 10.97638750076294 sec H_30(0)=-2.022226152780266537e+20
numpy: 0.09095430374145508 sec H_20(0)=6.690748809755917969e+11
mpmath: 6.780889987945557 sec H_20(0)=6.690748809755917969e+11

Memoization policy

Since $ H_n (x) $ is a $ x $ polynomial of degree $ n $, we realize that $ x $ has not been evaluated and only the coefficients need to be evaluated. If you use the lambda expression, you can cache $ x $ without being evaluated, so the calculation speed will be improved ... but this is the same efficiency as the above-mentioned simple implementation. This is because the lambda expression only caches the order of calculation operations, and the calculation results remain unevaluated (a specific example will be described later in mpmath.polynomial).

By the way, memoization of lambda expression does not bring about improvement of calculation speed. Therefore, only the coefficient of $ H_n (x) $ is evaluated without using the lamda equation. That is, a polynomial of degree $ m $

p_m(x) = a_0 + a_1x + a_2x^2 + \cdots + a_m x^m

Is given. A list of coefficients of $ m $ in length

L(p_m(x)) = [a_0, a_1, \ldots, a_m]\\\ L(q_m(x)) = [b_0, b_1, \ldots, b_m]

Then $ p_m (x) \ pm q_m (x) $ is

L(p_m(x)) \pm L(q_m(x)) = [a_0\pm b_0,\ldots, a_m\pm b_m]

Multiplication $ x \ times p_m $

L(x) \times L(p_m(x)) = [0, a_0, a_1, a_2, \cdots, a_m]

Equivalent to making a list of length $ m + 1 $. By manipulating this operation, addition / subtraction (division), integration, and differentiation of $ p_m (x) $ can be realized, but this is sufficient for the evaluation of Hermite polynomials. The division is in parentheses because, with the exception of the special $ p_m $, operations are generally not strictly evaluated. If the coefficient list L (p_m (x)) is used, the value evaluated concretely can be cached, so it is expected that calculation improvement by Memoization can be expected.

memoization with numpy

Try some naive implementations using python numpy.

import numpy as np
import time
from functools import lru_cache

def HermiteCoeff(n):
    if n == 0:
        return np.array([1])
    elif n == 1:
        return np.array([0, 2])
    else: 
        coeff = np.zeros(n+1)        
        coeff[:n-1] += -2*(n-1)*HermiteCoeff(n-2)
        coeff[1:] += 2*HermiteCoeff(n-1)
        return coeff
    
known0={0:np.array([1], dtype=np.int64),
        1:np.array([0, 2], dtype=np.int64)}
def HermiteCoeffMemo0(n):
    if n in known0:
        return known0[n]
    else: 
        coeff = np.zeros(n+1, dtype=np.int64)        
        coeff[:n-1] += -2*(n-1)*HermiteCoeffMemo0(n-2)
        coeff[1:] += 2*HermiteCoeffMemo0(n-1)
        known0[n] = coeff
        return coeff

known1={0:np.array([1], dtype=o),
        1:np.array([0, 2], dtype=object)}
def HermiteCoeffMemo1(n):
    if n in known1:
        return known1[n]
    else: 
        coeff = np.zeros(n+1, dtype=object)        
        coeff[:n-1] += -2*(n-1)*HermiteCoeffMemo1(n-2)
        coeff[1:] += 2*HermiteCoeffMemo1(n-1)
        known1[n] = coeff
        return coeff

@lru_cache()
def HermiteCoeffMemo2(n):
    if n == 0:
        return np.array([1], dtype=object)
    elif n == 1:
        return np.array([0, 2], dtype=object)
    else: 
        coeff = np.zeros(n+1, dtype=object)        
        coeff[:n-1] += -2*(n-1)*HermiteCoeffMemo2(n-2)
        coeff[1:] += 2*HermiteCoeffMemo2(n-1)
        return coeff
    
def Hermite(coeff):
    return lambda x:sum(c*x**i for i, c in enumerate(coeff))  

import time

funcs = [HermiteCoeff, HermiteCoeffMemo0, HermiteCoeffMemo1, HermiteCoeffMemo2]
x = np.linspace(-1, 1, 100)
n = 30
for f in funcs:
    time0 = time.time()    
    coeff = f(n)
    H = Hermite(coeff)
    y = H(x)
    y0 = y[len(x)//2]
    print("eval. time:%10f[sec]" % (time.time()-time0), "H_%d(0)=%5e" % (n, y0), "by", f.__name__, coeff.dtype)

The execution result is

eval. time: 11.646824[sec] H_30(0)=-2.022226e+20 by HermiteCoeff float64
eval. time:  0.000543[sec] H_30(0)=7.076253e+16 by HermiteCoeffMemo0 int64
eval. time:  0.000932[sec] H_30(0)=-2.022226e+20 by HermiteCoeffMemo1 object
eval. time:  0.000933[sec] H_30(0)=-2.022226e+20 by HermiteCoeffMemo2 object

As expected, Memoization improves speed, but there are a few things to keep in mind. numpy.zeros returns a zero-initialized array of length $ n $, but with dtype If not specified, it is filled with float zero. In the evaluation of Hermite polynomials, the element of the coefficient list is int, so it is useless. Therefore, the result of setting (Memoization) + dtype = np.int is the second. It is good that the speed is faster, but the result of $ H_ {30} (0) $ is different. This may be overflowed by previous article.

Therefore, the third is to make it possible to handle multiple-precision integers with memoization and array, so it can be confirmed that the result of setting dtype = object is slightly slower than the case of int, but returns the same value as the first definition. The result of setting the degree $ n = 100 $ is as follows.

eval. time:  0.001842[sec] H_100(0)=-2.410424e+18 by HermiteCoeffMemo0 int64
eval. time:  0.003694[sec] H_100(0)=3.037263e+93 by HermiteCoeffMemo1 object
eval. time:  0.003563[sec] H_100(0)=3.037263e+93 by HermiteCoeffMemo2 object

When dtype is set to np.int, it is a completely random value, so it can be imagined that it is still overflowing. Also, since the speed of the homemade cache mechanism is almost the same with lru_cache, it is better to use the existing lru_cache.

The result of $ n = 250 $ is also included for comparison next (although the mantissa is 100 digits, so it deviates from the new value by about 0.1 ...).

$ python hermite-poly-coeff.py
eval. time:  0.005363[sec] H_250(0)=0.000000e+00 by HermiteCoeffMemo0 int64
eval. time:  0.012851[sec] H_250(0)=-1.673543e+283 by HermiteCoeffMemo1 object
eval. time:  0.013394[sec] H_250(0)=-1.673543e+283 by HermiteCoeffMemo2 object

polynomial manipulation packages

In the simple implementation of the previous chapter, it was implemented by hand, but since it has no versatility and the demand for algebraic operations of polynomials is extremely large, it would be the redevelopment of the wheel to implement by itself. I actually consulted with google teacher

In python

numpy.polynomial
mpmath.polynomials in julia
Polynomials By using the package of, it is possible to realize polynomial algebraic operation without implementing it by yourself.

Revelation is received. You can expect to improve the naive implementation (but it doesn't work). In addition, sympy and wolfram Language that enable Symbolic calculations guide / PolynomialAlgebra.html) can be implemented more simply (discussed later).

numpy.polynomial

Even if you read Document numpy.polynomial.polynomial.Polynomial I'm not sure, but

>>> import numpy as np 
>>> from numpy.polynomial.polynomial import Polynomial
>>> f = Polynomial([1,2,3])                                                                                         
Polynomial([1., 2., 3.], domain=[-1,  1], window=[-1,  1])
>>> f.coef                                                                                                      
array([1., 2., 3.])

Then, the polynomial class of $ P (x) = 1 + 2x + 3x ^ 2 $ is returned. [1,2,3] was assigned as an integer for the elements of the coefficient list, but each element has a floating point number. Since this may potentially overflow, we would like to evaluate it with a multiple-precision integer. Although not written in the document

>>> f = Polynomial(np.array([1,2,3], dtype=object))                                                             
>>> f.coef                                                                                                      
array([1, 2, 3], dtype=object)

It looks good. Since memoization has the same speed even with a home-made cache mechanism, lru_cache is used.

import numpy as np
from numpy.polynomial.polynomial import Polynomial
from functools import lru_cache
import time
import mpmath
mpmath.mp.dps = 100

@lru_cache(maxsize=1000)
def HermitePoly(n, dtype=None):
    if n == 0:
        return Polynomial(np.array([1], dtype=dtype))
    elif n == 1:
        return Polynomial(np.array([0,2], dtype=dtype))
    else:
        xc = Polynomial(np.array([0,1], dtype=dtype) )
        return 2*xc*HermitePoly(n-1) -2*(n-1)*HermitePoly(n-2)


mpmath.mp.dps = 100
x = np.array(mpmath.linspace(1, 1, 100)) 
n = 250
for dtype in [np.int, np.float, object]: 
    time0 = time.time()
    H = HermitePoly(n, dtype=dtype)
    y = H(x)
    c = H.coef
    print(time.time()-time0, "[sec]", H(0), c.max(), c.dtype)
    HermitePoly.cache_clear()

$ python numpy-poly.py 
0.14997124671936035 [sec] 5.922810347657652e+296 2.46775722331116e+305 float64
0.15068888664245605 [sec] 5.922810347657652e+296 2.46775722331116e+305 float64
0.14925169944763184 [sec] 5.922810347657652e+296 2.46775722331116e+305 object

When int is specified in dtype, it is overridden by float, but the coefficient of dtype = object remains objecto, but if you think about it carefully, only int should appear in the coefficient, so c.max () floats. The decimal point is that the dtype is an object, but the actual element is rounded to a float. In fact, if you raise it to n = 260

$ python numpy-poly.py 
/home/hanada/Applications/anaconda3/lib/python3.7/site-packages/numpy/polynomial/polynomial.py:788: RuntimeWarning: invalid value encountered in double_scalars
  c0 = c[-i] + c0*x
0.14511513710021973 [sec] nan nan float64
0.14671945571899414 [sec] nan nan float64
0.14644980430603027 [sec] nan inf object

Therefore, a double-precision floating-point overflow occurs. For the time being, this problem can be solved by using mpmath

mpmath.mp.dps = 10000   
@lru_cache(maxsize=1000)
def HermitePoly1(n, dtype=None):
    if n == 0:
        return Polynomial(np.array([mpmath.mpf("1")]))
    elif n == 1:
        return Polynomial(np.array([mpmath.mpf("0"), mpmath.mpf("2")]))
    else:
        xc = Polynomial(np.array([mpmath.mpf("0"), mpmath.mpf("1")]))
        return 2*xc*HermitePoly1(n-1) -2*(n-1)*HermitePoly1(n-2)


x = np.array(mpmath.linspace(1, 1, 100)) 
n = 250

time0 = time.time()
H = HermitePoly1(n, dtype=dtype)
y = H(x)
c = H.coef
print(time.time()-time0, "[sec]", float(H(0)), c.max(), c.dtype)

If so, it's a little long

0.30547070503234863 [sec] -1.7171591075700597e+283 4742832273990616849979871677859751938138722866700194098725540674413432950130755339448602832297348736154189959265033164596368647282052560604201151158911400057598325345216981770764974144951365648905162496309065906209718571075651658676878296726900331786598665420800000000000000000000000000000000.0 object

It turns out that However, since the coefficients are also in floating point notation, probably in ABCPolyBase which is the parent class of numpy.polynomial.polynomial. Is it the influence of the internal implementation of? I haven't followed that far.

Also, the calculation speed is obviously faster if you implement it yourself, so you should avoid using numpy.polynomial.polynomial.Polynomial (starting with a capital letter at the end !!) in multiple precision calculations.

mpmath.polynomial

In the first place, numpy is a c language base, so you can't overdo it. Since mochi is a mochi shop, I will implement it in mpmath (but this also has a problem). In mpmath, mpmath (Orthogonal Polynomial System) provides Hermite polynomials, so you can use them, but in general cases Consider the extension in mpmath.polynomial and consider whether it can be simulated.

In the reverse order of for numpy, the coefficient list Note that if $ [c_0, c_1, c_2] $ is given, $ P (x) = c_0x ^ 2 + c_1x ^ 1 + c_2 $.

Let's evaluate with 1000 digits of the mantissa.

import mpmath as mp
mp.mp.dps = 1000
import time

When implemented naively

def HermitePoly1(n,x):
    if n == 0:
        return mp.polyval([1], x)
    elif n==1:
        return mp.polyval([2,0], x)
    else:
        xc = mp.polyval([1,0], x)        
    return 2*xc* HermitePoly1(n-1, x) - 2*(n-1)*HermitePoly1(n-2,x)

Is it? It seems that an array (list) cannot be assigned to x in polyval, so if you simply memoize it

known={0:lambda x: mp.polyval([1], x),
       1:lambda x: mp.polyval([2,0], x)}
def HermitePoly2(n):
    if n in known:
        return known[n]
    else:
        H = lambda x: 2*mp.polyval([1,0], x) * HermitePoly2(n-1)(x) - 2*(n-1)*HermitePoly2(n-2)(x)
        known[n] = H
        return H

Will it be

x = mp.linspace(-5,5,100,endpoint=False)

n = 20
time0 = time.time()
y = [HermitePoly1(n, xx) for xx in x]
print(time.time()-time0, "[sec]", HermitePoly1(n, 0))

time0 = time.time()
H = HermitePoly2(n)
y = [H(xx) for xx in x]
print(time.time()-time0, "[sec]", H(0))

Execute as. Result is

$ python mpmath-poly.py 
17.37535572052002 [sec] 670442572800.0
17.98806405067444 [sec] 670442572800.0

It cannot be speeded up. Because the lamda expression cache is just the cache in the order of evaluation, This is because the actual evaluation result is not cached. I haven't used mpmath.Polynomial so much, so there may be a better way, but mpmath.Polynomial is the memoization. I couldn't think of a way to do it. I'm curious about the internal implementation of mp.hermite.

In the case of Hermite polynomials, only the coefficients need to be evaluated multiple times, so if it is ad hoc,

from functools import lru_cache
import numpy as np

@lru_cache()
def HermiteCoeff(n):
    if n == 0:
        return np.array([1], dtype=object )
    elif n == 1:
        return np.array([0, 2], dtype=object)
    else: 
        coeff = np.zeros(n+1, dtype=object)        
        #2xH(n-1) - 2*(n-1)*H(n-2)
        coeff[:n-1] += -2*(n-1)*HermiteCoeff(n-2)
        coeff[1:] += 2*HermiteCoeff(n-1)
        return coeff

def Herm(coeff):
    return lambda x:sum(c*x**i for i, c in enumerate(coeff))  
    

time0 = time.time()
coeff = HermiteCoeff(250)
H = Herm(coeff)
coeff = coeff.tolist()[::-1]
y = [mp.polyval(coeff, xx) for xx in x]
print(time.time()-time0, "[sec]", float(mp.polyval(coeff, 0)))

If you do like

$ python mpmath-poly.py
0.19260430335998535 [sec] -1.7171591075700597e+283

It's not that you can't memoize.

sympy

Next, let's try using sympy.

import sympy, mpmath
import time
x = sympy.symbols('x')

def HermitePoly1(n):
    if n==0:
        return 1
    elif n==1:
        return x
    else:
        return 2*x*HermitePoly1(n-1) - 2*(n-1)*HermitePoly1(n-2)


known={0:1, 1:x}
def HermitePoly2(n):
    if n in known:
        return known[n]
    else:
        Hn = 2*x*HermitePoly2(n-1) - 2*(n-1)*HermitePoly2(n-2)
        known[n] = Hn
        return Hn

known1={0:1, 1:x}
def HermitePoly3(n):
    if n in known1:
        return known1[n]
    else:
        Hn = 2*x*HermitePoly3(n-1) - 2*(n-1)*HermitePoly3(n-2)
        known1[n] = sympy.expand(Hn)        
        return known1[n]

mpmath.mp.dps = 100
x1 = mpmath.linspace(0, 1, 100)

n = 20
time0 = time.time()
H = HermitePoly1(n)
time1 = time.time()
y = [H.subs(x, i) for i in x1]
print(time.time() - time1, time1 - time0, H.subs(x, 0))


time0 = time.time()
H = HermitePoly2(n)
time1 = time.time()
y = [H.subs(x, i) for i in x1]
print(time.time() - time1, time1 - time0, H.subs(x, 0))


time0 = time.time()
H = HermitePoly3(n)
time1 = time.time()
y = [H.subs(x, i) for i in x1]
print(time.time() - time1, time1 - time0, H.subs(x, 0))

If you run as

$ python sympy-poly.py 
1.612032413482666 0.1680293083190918 670442572800
1.5776398181915283 0.012787342071533203 670442572800
0.2140791416168213 0.07543253898620605 670442572800

I can memoize, but I haven't used sympy so much, so I'm not sure if this is the best.

Mathematica (wolfram language)

With Mathematica (wolfram language), you can define recurrence formulas for Hermite polynomials in the same way as Fibonacci series. Approximately 14 seconds for a rustic implementation

He[n_, x_] := 2x*He[n - 1, x] - 2 (n - 1)*He[n - 2, x] 
He[0, x_] = 1;
He[1, x_] = 2x; 

{First@Timing@Expand@He[30, x], "sec"}
Out[4]= {14.214, sec}

When the calculation result is cached

H[n_, x_] := H[n, x] = 2 x*H[n - 1, x] - 2 (n - 1)*H[n - 2, x] 
H[0, x_] = 1;
H[1, x_] = 2x; 

{First@Timing@Expand@H[30, x], "sec"}
Out[8]= {7.89031, sec}

It seems that Expand is taking a long time

h[n_, x_] := h[n, x] = 2x*h[n - 1, x] - 2 (n - 1)*h[n - 2, x] // Expand 
h[0, x_] = 1;
h[1, x_] = 2x; 

{First@Timing@Expand@h[30, x], "sec"}
Out[12]= {0.007158, sec}

It is even faster if you use HermiteH defined in Mathematica.

In[14]:= {First@Timing@HermiteH[30, x], "sec"}
Out[14]= {0.000216, sec}

What on earth are you doing inside?

Julia

Operations related to polynomial operations in python can be realized by using Polynomials.jl.

(I'm getting tired, so I'm preparing the program)