I tried speeding up Python code including if statements with Numba and Cython

Introduction

In "Notes on speeding up Python code with Numba", I found that Numba is effective at speeding up technical indicator functions built on for statements, but there are other indicators that make heavy use of if statements.

One of them is Parabolic SAR. It is not an unusual indicator; in fact it is quite popular. However, because it switches between an ascending mode and a descending mode and its step width changes along the way, it cannot be written with a for statement alone. When I ported MetaTrader's technical indicators to Python, it was the last one I got to.

This post is a memo on speeding up Parabolic SAR.

Parabolic SAR Python code

import numpy as np
import pandas as pd
dataM1 = pd.read_csv('DAT_ASCII_EURUSD_M1_2015.csv', sep=';',
                     names=('Time','Open','High','Low','Close', ''),
                     index_col='Time', parse_dates=True)

def iSAR(df, step, maximum):
    last_period = 0
    dir_long = True
    ACC = step
    SAR = df['Close'].copy()
    for i in range(1,len(df)):
        last_period += 1    
        if dir_long == True:
            Ep1 = df['High'][i-last_period:i].max()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, df['High'][i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > df['Low'][i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = df['Low'][i-last_period:i].min()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, df['Low'][i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < df['High'][i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSAR(dataM1, 0.02, 0.2)

There is only a single for statement, but it still takes quite a while.

1 loop, best of 3: 1min 19s per loop

Speed up with Numba

First, let's speed it up with Numba. The only changes are converting the pandas Series to NumPy arrays and adding @jit.

from numba import jit
@jit
def iSARjit(df, step, maximum):
    last_period = 0
    dir_long = True
    ACC = step
    SAR = df['Close'].values.copy()
    High = df['High'].values
    Low = df['Low'].values
    for i in range(1,len(SAR)):
        last_period += 1    
        if dir_long == True:
            Ep1 = High[i-last_period:i].max()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, High[i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > Low[i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = Low[i-last_period:i].min()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, Low[i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < High[i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSARjit(dataM1, 0.02, 0.2)
1 loop, best of 3: 1.43 s per loop

That is about 55 times faster. Since only a few code changes were needed, it's a decent result.
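As far as I know, what a plain @jit does with a function that receives a DataFrame depends on the Numba version: it either falls back to the slower object mode or refuses to compile, because Numba cannot type pandas objects. Below is a minimal sketch of one way around this, assuming the array extraction is moved into a thin wrapper so that the loop itself only sees NumPy arrays. The names sar_kernel and iSAR_njit are hypothetical, the try/except fallback is only there so the sketch also runs without Numba, and the timing above was not measured with this variant.

```python
import numpy as np

try:
    from numba import njit            # nopython-mode decorator
except ImportError:                   # assumption: run as plain Python if Numba is missing
    def njit(func):
        return func


@njit
def sar_kernel(high, low, close, step, maximum):
    # Same logic as iSARjit, but the arguments are float64 arrays only.
    last_period = 0
    dir_long = True
    acc = step
    sar = close.copy()
    for i in range(1, len(sar)):
        last_period += 1
        if dir_long:
            ep1 = high[i - last_period:i].max()      # extreme point since the last reversal
            sar[i] = sar[i - 1] + acc * (ep1 - sar[i - 1])
            ep0 = max(ep1, high[i])
            if ep0 > ep1 and acc + step <= maximum:
                acc += step
            if sar[i] > low[i]:                      # switch to descending mode
                dir_long = False
                sar[i] = ep0
                last_period = 0
                acc = step
        else:
            ep1 = low[i - last_period:i].min()
            sar[i] = sar[i - 1] + acc * (ep1 - sar[i - 1])
            ep0 = min(ep1, low[i])
            if ep0 < ep1 and acc + step <= maximum:
                acc += step
            if sar[i] < high[i]:                     # switch to ascending mode
                dir_long = True
                sar[i] = ep0
                last_period = 0
                acc = step
    return sar


def iSAR_njit(df, step, maximum):
    # Thin wrapper: pull the NumPy arrays out of the DataFrame first.
    return sar_kernel(df['High'].values, df['Low'].values,
                      df['Close'].values, step, maximum)
```

With the data above it would be called as iSAR_njit(dataM1, 0.02, 0.2), in the same way as iSARjit.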

Accelerate with Cython

Next, let's try speeding it up with Cython. I thought Cython would be a hassle to set up, but with Jupyter notebook it was fairly easy to get running. However, since it uses an external compiler, you need to install Visual C++ (on Windows). The compiler has to match the version that the Anaconda distribution was built with, so I installed the following one this time.

Visual Studio Community 2015

First, the case where the code is simply compiled with Cython without any changes.

%load_ext Cython

%%cython
cimport numpy
cimport cython
def iSAR_c0(df, step, maximum):
    last_period = 0
    dir_long = True
    ACC = step
    SAR = df['Close'].values.copy()
    High = df['High'].values
    Low = df['Low'].values
    for i in range(1,len(SAR)):
        last_period += 1    
        if dir_long == True:
            Ep1 = High[i-last_period:i].max()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, High[i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > Low[i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = Low[i-last_period:i].min()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, Low[i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < High[i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSAR_c0(dataM1, 0.02, 0.2)

Result:

1 loop, best of 3: 1.07 s per loop

Cython is a little faster with the same code.

Next, the case where variable type declarations are added with cdef.

%%cython
cimport numpy
cimport cython
def iSARnew(df, double step, double maximum):
    cdef int last_period = 0
    dir_long = True
    cdef double ACC = step
    cdef numpy.ndarray[numpy.float64_t, ndim=1] SAR = df['Close'].values.copy()
    cdef numpy.ndarray[numpy.float64_t, ndim=1] High = df['High'].values
    cdef numpy.ndarray[numpy.float64_t, ndim=1] Low = df['Low'].values
    cdef double Ep0, Ep1
    cdef int i, N=len(SAR)
    for i in range(1,N):
        last_period += 1    
        if dir_long == True:
            Ep1 = max(High[i-last_period:i])
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, High[i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > Low[i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = min(Low[i-last_period:i])
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, Low[i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < High[i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSARnew(dataM1, 0.02, 0.2)

Result:

1 loop, best of 3: 533 ms per loop

That is about twice as fast as the untyped Cython version. Further tuning might make it faster still, but it can make the code less readable, so I'll stop here.
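One possible direction for such tuning, which is not timed in this post: the slice High[i-last_period:i] only grows by one element per step, so the extreme point can be carried along as a single running value instead of being recomputed from the slice each time. Below is a minimal sketch in plain Python, where the name sar_running is made up for illustration. The same loop body could then be typed with cdef or compiled with Numba.

```python
def sar_running(high, low, close, step, maximum):
    """Parabolic SAR with a running extreme point instead of a slice max/min."""
    sar = list(close)
    dir_long = True
    acc = step
    ep = high[0]                      # extreme point since the last reversal
    for i in range(1, len(sar)):
        sar[i] = sar[i - 1] + acc * (ep - sar[i - 1])
        if dir_long:
            ep_new = max(ep, high[i])
            if ep_new > ep and acc + step <= maximum:
                acc += step
            if sar[i] > low[i]:       # switch to descending mode
                dir_long = False
                sar[i] = ep_new
                acc = step
                ep = low[i]           # the new extreme starts at this bar's low
            else:
                ep = ep_new
        else:
            ep_new = min(ep, low[i])
            if ep_new < ep and acc + step <= maximum:
                acc += step
            if sar[i] < high[i]:      # switch to ascending mode
                dir_long = True
                sar[i] = ep_new
                acc = step
                ep = high[i]          # the new extreme starts at this bar's high
            else:
                ep = ep_new
    return sar
```

This removes both the last_period counter and the per-step slice, at the cost of one more state variable to keep track of in each branch.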

Summary

For code that consists only of for statements, Numba alone gives a considerable speedup, but when if statements are also involved, the effect is smaller. If you want it a little faster, Cython with some code modifications may be worth using.
