[Python] Precautions when finding the maximum and minimum values in a numpy array with a small number of elements

Backtesting FX Systre with Python As I mentioned in the article, of the technical indicators published on GitHub, only the parabolic SAR (iSAR) function took a long time. I was wondering if it could be helped because the algorithm is complicated, but it turned out that there was actually a problem in how to find the maximum and minimum values.

In this article, I will use sample code to summarize the problems.

example

Consider the problem of extracting 3 samples at a time from appropriate time series data and finding the maximum value in sequence. It looks like this when written in a formula.

y(n)=\max\\{x(n), x(n-1), x(n-2)\\}

Create the time series data as a random number sequence as follows.

import numpy as np
x = np.random.randint(1000, size=100000)

4 types of codes

Let's try 4 types of codes that calculate the maximum value of 3 samples as a time series. func1 Make the code as an expression. Pass x [i], x [i-1], x [i-2] as arguments to the Python built-in function max.

def func1(x):
    y = np.empty(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = max(x[i], x[i-1], x[i-2])
    return y

func2 It's a bit like Python, using slices and passing it to the max argument.

def func2(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = max(x[i-2:i+1])
    return y

func3 Since numpy also has a max function, try using np.max instead of max.

def func3(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = np.max(x[i-2:i+1])
    return y

func4 Let's list the three elements x [i], x [i-1], x [i-2] and pass them to the argument of np.max.

def func4(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = np.max([x[i], x[i-1], x[i-2]])
    return y

Comparison of execution time

Let's compare the execution times of the above four functions.

%timeit y1 = func1(x)
%timeit y2 = func2(x)
%timeit y3 = func3(x)
%timeit y4 = func4(x)
10 loops, best of 3: 91.6 ms per loop
1 loop, best of 3: 304 ms per loop
1 loop, best of 3: 581 ms per loop
1 loop, best of 3: 1.29 s per loop

func1 is the fastest, func2, func3, func4 and so on. func4 takes 14 times longer than func1.

Speed up with numba

With the same code, let's compare the speedup by numba.

from numba import jit

@jit
def func1(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = max(x[i], x[i-1], x[i-2])
    return y

@jit
def func2(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = max(x[i-2:i+1])
    return y

@jit
def func3(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = np.max(x[i-2:i+1])
    return y

@jit
def func4(x):
    y = np.zeros(len(x), dtype=int)
    for i in range(2,len(x)):
        y[i] = np.max([x[i], x[i-1], x[i-2]])
    return y
%timeit y1 = func1(x)
%timeit y2 = func2(x)
%timeit y3 = func3(x)
%timeit y4 = func4(x)
1000 loops, best of 3: 365 µs per loop
1 loop, best of 3: 377 ms per loop
100 loops, best of 3: 4.33 ms per loop
1 loop, best of 3: 1.36 s per loop

If you pay attention to the unit of time and compare, func1 is µs, so it is overwhelmingly faster. The next fastest is func3. You can clearly see the effect of speeding up with numba. In comparison, func2 and func4 are slower than numba's effect.

As a result, the difference between func4 and func1 is 3700 times wider. After all, when finding the maximum value with a small number of elements, I found that passing each element individually to the built-in function max is the fastest.

Actually, in the amax of the numpy documentation, "maximum (a [0], a" [1]) is faster than amax (a, axis = 0). ", A comment like that was written.

Why iSAR was slow

Returning to the story, the reason iSAR was slow was because it was written like func4. When I wrote it like func1, it became dramatically faster. Due to the effect of numba, the slowest technical indicator so far has become one of the faster technical indicators.

I don't really know where Python has a bottleneck.

Recommended Posts

[Python] Precautions when finding the maximum and minimum values in a numpy array with a small number of elements
Function to extract the maximum and minimum values ​​in a slice with Go
Set the number of elements in a NumPy one-dimensional array to a power of 2 (0 padded)
Get the number of specific elements in a python list
Precautions when creating a two-dimensional array with all the same values
Output in the form of a python array
How to count the number of elements in Django and output to a template
[Python] A program that finds the minimum and maximum values without using methods
Get the size (number of elements) of UnionFind in Python
How to identify the element with the smallest number of characters in a Python list?
[Python] Leave only the elements that start with a specific character string in the array
Find the index of the maximum value (minimum value) of a multidimensional array
[Homology] Count the number of holes in data with Python
Extract array elements and indexes in descending order with numpy
A server that returns the number of people in front of the camera with bottle.py and OpenCV
Find the maximum value, minimum value, and number of elements when the integer N is on the first line of the standard input and N integers follow.
Count the number of Thai and Arabic characters well in Python
[Python] Let's reduce the number of elements in the result in set operations
Get the number of readers of a treatise on Mendeley in Python
[Python] A program that finds the maximum number of toys that can be purchased with your money
How to write when you want to put a number after the group number to be replaced with a regular expression in re.sub of Python
Calculate the shortest route of a graph with Dijkstra's algorithm and Python
Count the number of times two values appear in a Python 3 iterator type element at the same time
Get the number of searches with a regular expression. SeleniumBasic VBA Python
Generate a list packed with the number of days in the current month.
python> array> Determine the number and initialize> mylist = [idx for idx in range (10)] / mylist = [0 for idx in range (10)] >> mylist = [0] * 10
Check the in-memory bytes of a floating point number float in Python
[Python] Calculate the number of digits required when filling in 0s [Note]
[Python] How to use list 2 Reference of list value, number of elements, maximum value, minimum value
Receive a list of the results of parallel processing in Python with starmap
Get the number of articles accessed and likes with Qiita API + Python
Minimum knowledge required when dealing with "angles" and "coordinates" in Ruby + In what direction is Mr. B from the perspective of Mr. A? Algorithm
How to sort by specifying a column in the Python Numpy array.
[Python] Combine all the elements in the array
A story about calculating the speed of a small ball falling while receiving air resistance with Python and Sympy
[Python] Precautions when retrieving data by scraping and putting it in the list
plot the coordinates of the processing (python) list and specify the number of times in draw ()
Divides the character string by the specified number of characters. In Ruby and Python.
[Python] A program that calculates the number of updates of the highest and lowest records
Extract elements (using a list of indexes) in a NumPy style from a Python list / tuple
Get the stock price of a Japanese company with Python and make a graph
Get the last element of the array by splitting the string in Python and PHP
How to get a list of files in the same directory with python
Output the number of CPU cores in Python
[Python] Get the files in a folder with Python
Calculate the total number of combinations with python
Make a copy of the list in Python
Precautions when dealing with control structures in Python 2.6
Find the number of days in a month
Rewriting elements in a loop of lists (Python)
[Python numpy] Dynamically specify the index of the array
A discussion of the strengths and weaknesses of Python
[Python] Manipulating elements in a list (array) [Sort]
Automate background removal for the latest portraits in a directory with Python and API
Consideration when you can do a good job in 10 years with Python3 and Scala3.
The result of making a map album of Italy honeymoon in Python and sharing it
[Python] A program that finds the shortest number of steps in a game that crosses clouds
[Python] Change the text color and background color of a specific keyword in print output
Consolidate a large number of CSV files in folders with python (data without header)
A memo when checking whether the specified key exists in the defined dictionary with python
[Shell art] Only when it is a multiple of 3 and a number with 3 becomes stupid