Backtesting FX System Trading with Python
As I mentioned in that article, among the technical indicators I published on GitHub, only the parabolic SAR (iSAR) function was taking a long time. I assumed it couldn't be helped because the algorithm is complicated, but it turned out the real problem was in how the maximum and minimum values were being computed.
In this article, I will walk through the problem with some sample code.
Consider the task of taking 3 samples at a time from some time series data and computing their maximum at each step. Written as a formula, it looks like this:

y[i] = max(x[i], x[i-1], x[i-2])
Create the time series data as a random number sequence as follows.
import numpy as np
x = np.random.randint(1000, size=100000)
Let's try four versions of code that compute this 3-sample maximum over the time series.
func1
Write the code exactly as the formula: pass x[i], x[i-1], x[i-2] as arguments to Python's built-in max function.
def func1(x):
    y = np.empty(len(x), dtype=int)
    for i in range(2, len(x)):
        # pass the three elements directly to the built-in max
        y[i] = max(x[i], x[i-1], x[i-2])
    return y
func2
A slightly more Pythonic version: take a slice of three elements and pass it to max.
def func2(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        # pass a three-element slice to the built-in max
        y[i] = max(x[i-2:i+1])
    return y
func3
Since NumPy also has a max function, try np.max instead of the built-in max.
def func3(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        # pass a three-element slice to np.max
        y[i] = np.max(x[i-2:i+1])
    return y
func4
Put the three elements x[i], x[i-1], x[i-2] into a list and pass it to np.max.
def func4(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        # build a list of the three elements and pass it to np.max
        y[i] = np.max([x[i], x[i-1], x[i-2]])
    return y
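As a quick sanity check (my own addition, not in the original article), you can confirm that the four functions return the same values from index 2 onward; only the untouched leading elements differ:

y1, y2, y3, y4 = func1(x), func2(x), func3(x), func4(x)
assert np.array_equal(y1[2:], y2[2:])
assert np.array_equal(y2[2:], y3[2:])
assert np.array_equal(y3[2:], y4[2:])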
Let's compare the execution times of the above four functions.
%timeit y1 = func1(x)
%timeit y2 = func2(x)
%timeit y3 = func3(x)
%timeit y4 = func4(x)
10 loops, best of 3: 91.6 ms per loop
1 loop, best of 3: 304 ms per loop
1 loop, best of 3: 581 ms per loop
1 loop, best of 3: 1.29 s per loop
func1 is the fastest, followed by func2, func3, and func4. func4 takes roughly 14 times longer than func1.
Now let's see how much numba speeds up the same code.
from numba import jit

@jit
def func1(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        y[i] = max(x[i], x[i-1], x[i-2])
    return y

@jit
def func2(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        y[i] = max(x[i-2:i+1])
    return y

@jit
def func3(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        y[i] = np.max(x[i-2:i+1])
    return y

@jit
def func4(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2, len(x)):
        y[i] = np.max([x[i], x[i-1], x[i-2]])
    return y
%timeit y1 = func1(x)
%timeit y2 = func2(x)
%timeit y3 = func3(x)
%timeit y4 = func4(x)
1000 loops, best of 3: 365 µs per loop
1 loop, best of 3: 377 ms per loop
100 loops, best of 3: 4.33 ms per loop
1 loop, best of 3: 1.36 s per loop
Pay attention to the units when comparing: func1 is now measured in microseconds, so it is overwhelmingly faster, and func3 is the next fastest; the speedup from numba is clear for both. In contrast, func2 and func4 get essentially no benefit from numba and even end up slightly slower.
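One way to check whether numba actually managed to compile a function (an aside of mine, not from the original article) is to request nopython mode explicitly; if numba would have to fall back to the slower object mode, this raises an error instead of silently running interpreted code:

from numba import jit

# ask numba to fail loudly if it cannot compile the whole function
@jit(nopython=True)
def func1_nopython(x):
    y = np.zeros(len(x), dtype=np.int64)
    for i in range(2, len(x)):
        y[i] = max(x[i], x[i-1], x[i-2])
    return y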
As a result, the gap between func4 and func1 widens to about 3700 times. In short, when taking the maximum of just a few elements, passing the elements individually to the built-in max function is by far the fastest approach.
In fact, the NumPy documentation for amax contains a note to the effect that "maximum(a[0], a[1]) is faster than amax(a, axis=0)".
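To illustrate that note (my own example, not taken from the documentation), the element-wise np.maximum of two rows gives the same result as reducing with np.amax along axis 0, without going through the general reduction machinery:

a = np.random.randint(1000, size=(2, 100000))

# element-wise maximum of the two rows ...
m1 = np.maximum(a[0], a[1])
# ... matches the reduction over axis 0
m2 = np.amax(a, axis=0)
assert np.array_equal(m1, m2)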
Coming back to the original story: the reason iSAR was slow was that it was written in the style of func4. Rewriting it in the style of func1 made it dramatically faster, and thanks to numba, what had been the slowest technical indicator is now one of the faster ones.
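As a rough illustration only (this is not the actual iSAR code from the repository), the extreme-point tracking inside a parabolic SAR loop can be written in the func1 style: update a running maximum one scalar comparison at a time instead of calling np.max on a growing slice or list.

@jit
def running_extreme(high):
    # hypothetical helper: running highest high, updated with the
    # built-in max on two scalars at each step
    ep = np.zeros(len(high))
    ep[0] = high[0]
    for i in range(1, len(high)):
        ep[i] = max(ep[i-1], high[i])
    return ep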
You really never know where the bottlenecks in Python code will turn out to be.