[Python] Try optimizing FX systre parameters with a genetic algorithm

[Python] Try optimizing FX systre parameters with random search

This is a continuation of the article above. This time, let's implement a genetic algorithm (GA) instead of a random search.

Preparation

The creation of hourly data is the same as last time.

import numpy as np
import pandas as pd
import indicators as ind # import of indicators.py
from backtest import Backtest,BacktestReport

dataM1 = pd.read_csv('DAT_ASCII_EURUSD_M1_2015.csv', sep=';',
                     names=('Time','Open','High','Low','Close', ''),
                     index_col='Time', parse_dates=True)
dataM1.index += pd.offsets.Hour(7) # 7-hour time offset
ohlc = ind.TF_ohlc(dataM1, 'H') # create 1-hour OHLC data
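As a quick check that the resampling worked (the exact numbers depend on your data file):

print(len(ohlc)) # roughly 6,200 hourly bars for one year of EURUSD M1 data
print(ohlc.head())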

You will need indicators.py and backtest.py, both uploaded on GitHub. Note that backtest.py contains a small fix to BacktestReport this time.

Trading system to optimize

This time, in order to enlarge the space of parameter combinations to optimize, we add an exit signal to the two-moving-average crossover system. The exit rule is a crossover between the closing price and a third moving average: a long position is closed when the close crosses below the exit moving average, and a short position when it crosses above it (see the signal code in the main routine below).

This system therefore has three parameters. We will search each of them over the following ranges.

SlowMAperiod = np.arange(7, 151) # range of long-term moving average periods
FastMAperiod = np.arange(5, 131) # range of short-term moving average periods
ExitMAperiod = np.arange(3, 111) # range of exit moving average periods

There are about 2 million combinations. A brute-force search is possible, but it would take several hours.
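For reference, the size of the search space can be checked directly from the ranges just defined:

print(len(SlowMAperiod) * len(FastMAperiod) * len(ExitMAperiod)) # 144 * 126 * 108 = 1,959,552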

Main routine

The main routine of the genetic algorithm is almost the same as the previous random search: simply replace the random parameter search with the genetic processing described below. In addition, the trading signals now include the exit signal defined by the rule above.

def Optimize(ohlc, Prange):
    def shift(x, n=1): return np.concatenate((np.zeros(n), x[:-n])) # shift a series forward by n bars

    SlowMA = np.empty([len(Prange[0]), len(ohlc)]) # long-term moving averages
    for i in range(len(Prange[0])):
        SlowMA[i] = ind.iMA(ohlc, Prange[0][i])

    FastMA = np.empty([len(Prange[1]), len(ohlc)]) # short-term moving averages
    for i in range(len(Prange[1])):
        FastMA[i] = ind.iMA(ohlc, Prange[1][i])

    ExitMA = np.empty([len(Prange[2]), len(ohlc)]) # exit moving averages
    for i in range(len(Prange[2])):
        ExitMA[i] = ind.iMA(ohlc, Prange[2][i])

    Close = ohlc['Close'].values # closing prices

    M = 20 # population size
    Eval = np.zeros([M, 6]) # evaluation items per individual
    Param = InitParam(Prange, M) # initialize parameters
    gens = 0 # generation counter
    while gens < 100:
        for k in range(M):
            i0 = Param[k,0]
            i1 = Param[k,1]
            i2 = Param[k,2]
            # buy entry signal
            BuyEntry = (FastMA[i1] > SlowMA[i0]) & (shift(FastMA[i1]) <= shift(SlowMA[i0]))
            # sell entry signal
            SellEntry = (FastMA[i1] < SlowMA[i0]) & (shift(FastMA[i1]) >= shift(SlowMA[i0]))
            # buy exit signal
            BuyExit = (Close < ExitMA[i2]) & (shift(Close) >= shift(ExitMA[i2]))
            # sell exit signal
            SellExit = (Close > ExitMA[i2]) & (shift(Close) <= shift(ExitMA[i2]))
            # backtest
            Trade, PL = Backtest(ohlc, BuyEntry, SellEntry, BuyExit, SellExit)
            Eval[k] = BacktestReport(Trade, PL)
        # generational change
        Param = Evolution(Param, Eval[:,0], Prange)
        gens += 1
        print(gens, Eval[0,0]) # individual 0 is the preserved elite from generation 2 onward
    Slow = Prange[0][Param[:,0]]
    Fast = Prange[1][Param[:,1]]
    Exit = Prange[2][Param[:,2]]
    return pd.DataFrame({'Slow':Slow, 'Fast':Fast, 'Exit':Exit, 'Profit': Eval[:,0], 'Trades':Eval[:,1],
                         'Average':Eval[:,2],'PF':Eval[:,3], 'MDD':Eval[:,4], 'RF':Eval[:,5]},
                         columns=['Slow','Fast','Exit','Profit','Trades','Average','PF','MDD','RF'])

Another change from last time is that the ranges of the three parameters `SlowMAperiod`, `FastMAperiod`, and `ExitMAperiod` are bundled into a single list called `Prange` and passed to each function. This way the code keeps working as-is even if the number of parameters grows.

Genetic processing

In the function above, the functions added for the GA are `InitParam()` and `Evolution()`. First, `InitParam()` initializes the parameters of each individual.

from numpy.random import randint, choice

# parameter initialization
def InitParam(Prange, M):
    Param = randint(len(Prange[0]), size=M) # random indices into the first range
    for i in range(1, len(Prange)):
        Param = np.vstack((Param, randint(len(Prange[i]), size=M)))
    return Param.T # one row per individual, one column per parameter
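As a quick sanity check (assuming the ranges defined above), the population is an M × 3 matrix of random indices into each range, not the period values themselves:

Prange = [SlowMAperiod, FastMAperiod, ExitMAperiod]
Param = InitParam(Prange, 20)
print(Param.shape) # (20, 3): one row per individual, one column per parameter
print(Prange[0][Param[0, 0]]) # the actual SlowMA period of individual 0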

`Evolution()` contains the genetic processing, structured as follows:

# genetic processing
def Evolution(Param, Eval, Prange):
    # roulette selection with elitism
    # one-point crossover
    # neighborhood generation
    # mutation
    return Param

Each process is explained below.

Roulette selection with elitism

First, we select which of the current individuals survive to the next generation. There are several selection schemes, but NumPy has a function well suited to roulette selection, so we use that. Roulette selection stochastically picks individuals in proportion to their fitness, here the evaluation value from the backtest: the higher the fitness, the more likely an individual is to survive.

The function is `numpy.random.choice()`, which picks the required number of elements from a list at random; if you pass a list of selection probabilities via the optional argument `p`, it samples according to those probabilities. That is roulette selection itself. The code looks like this:

    # roulette selection with elitism
    order = np.argsort(Eval)[::-1] # sort by fitness, best first
    Param = Param[order]
    Eval = Eval[order] # keep fitness aligned with the sorted individuals
    R = Eval - min(Eval) # shift so the minimum fitness is 0
    R = R / sum(R) # selection probabilities
    idx = choice(len(Eval), size=len(Eval), replace=True, p=R)
    idx[0] = 0 # always keep the elite
    Param = Param[idx]

Negative probabilities would be invalid, so the fitness values are shifted so that their minimum is 0. Also, with roulette selection alone the fittest individual could unluckily fail to be selected, so the population is sorted by fitness and the best individual (the elite) is always carried over to the next generation.
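As a minimal standalone illustration of how `choice()` does fitness-proportionate sampling:

import numpy as np

fitness = np.array([10., 30., 60.])
p = fitness / fitness.sum() # selection probabilities 0.1, 0.3, 0.6
print(np.random.choice(len(fitness), size=5, p=p)) # e.g. [2 1 2 2 0]; index 2 dominates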

One-point crossover

Next comes crossover: two individuals are selected and part of their genetic information is exchanged. There are several crossover methods; here a crossover point is chosen at one of the parameter boundaries, and everything after it is swapped between the two individuals.

    # one-point crossover
    N = 10
    idx = choice(np.arange(1,len(Param)), size=N, replace=False) # exclude the elite at index 0
    for i in range(0,N,2):
        ix = idx[i:i+2] # pick a pair of individuals
        p = randint(1,len(Prange)) # crossover point
        Param[ix] = np.hstack((Param[ix][:,:p], Param[ix][:,p:][::-1])) # swap tails

Again, `choice()` generates random indices for the individuals to be crossed; adding `replace=False` guarantees the indices are unique. Crossover is then realized by selecting two individuals as `ix` and exchanging the data after the crossover point `p` (the `[::-1]` reverses the two rows, which swaps their tails).
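A tiny standalone example of the swap trick with made-up values:

import numpy as np

pair = np.array([[1, 2, 3],
                 [7, 8, 9]])
p = 1 # crossover point
print(np.hstack((pair[:, :p], pair[:, p:][::-1])))
# [[1 8 9]
#  [7 2 3]]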

Neighborhood generation

A standard GA simulates evolution with selection, crossover, and mutation alone. But when crossover points are restricted to parameter boundaries, as they are here, the population fills up with identical individuals before you know it, and evolution stalls. Increasing the number of mutations would push the method toward random search, which is inefficient. Instead, we nudge some parameters by +1 or -1, producing so-called neighborhood solutions, a device often used in local search algorithms.

    # neighborhood generation
    N = 10
    idx = choice(np.arange(1,len(Param)), size=N, replace=False) # exclude the elite
    diff = choice([-1,1], size=N).reshape(N,1) # nudge direction per individual
    for i in range(N):
        p = randint(len(Prange)) # which parameter to nudge
        Param[idx[i]][p:p+1] = (Param[idx[i]][p]+diff[i]+len(Prange[p]))%len(Prange[p]) # wrap around

Individuals are selected just as for crossover. Then a parameter is chosen at random and shifted by 1 in either direction; the modulo keeps the index inside its range by wrapping around.
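The wrap-around is plain modular arithmetic; for example, with the 144-element SlowMAperiod range:

n = 144 # len(Prange[0])
print((0 - 1 + n) % n) # 143: nudging the lowest index down wraps to the top
print((143 + 1 + n) % n) # 0: and the highest index wraps back to the bottom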

Mutation

Finally, mutation. There are several ways to do this; here one parameter of each selected individual is rewritten with a fresh random value. In a GA, mutation is important for escaping local optima, but too much of it increases randomness, so only about two individuals are mutated per generation.

    # mutation
    N = 2
    idx = choice(np.arange(1,len(Param)), size=N, replace=False) # exclude the elite
    for i in range(N):
        p = randint(len(Prange)) # which parameter to rewrite
        Param[idx[i]][p:p+1] = randint(len(Prange[p])) # fresh random index
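Putting the four pieces into the `Evolution()` skeleton gives the complete function, assembled here for reference from the snippets above:

# genetic processing: selection, crossover, neighborhood, mutation
def Evolution(Param, Eval, Prange):
    # roulette selection with elitism
    order = np.argsort(Eval)[::-1]
    Param = Param[order]
    Eval = Eval[order]
    R = Eval - min(Eval)
    R = R / sum(R)
    idx = choice(len(Eval), size=len(Eval), replace=True, p=R)
    idx[0] = 0
    Param = Param[idx]
    # one-point crossover
    N = 10
    idx = choice(np.arange(1, len(Param)), size=N, replace=False)
    for i in range(0, N, 2):
        ix = idx[i:i+2]
        p = randint(1, len(Prange))
        Param[ix] = np.hstack((Param[ix][:, :p], Param[ix][:, p:][::-1]))
    # neighborhood generation
    N = 10
    idx = choice(np.arange(1, len(Param)), size=N, replace=False)
    diff = choice([-1, 1], size=N).reshape(N, 1)
    for i in range(N):
        p = randint(len(Prange))
        Param[idx[i]][p:p+1] = (Param[idx[i]][p] + diff[i] + len(Prange[p])) % len(Prange[p])
    # mutation
    N = 2
    idx = choice(np.arange(1, len(Param)), size=N, replace=False)
    for i in range(N):
        p = randint(len(Prange))
        Param[idx[i]][p:p+1] = randint(len(Prange[p]))
    return Param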

Execution result

Let's run the genetic algorithm using the functions defined above.

result = Optimize(ohlc, [SlowMAperiod, FastMAperiod, ExitMAperiod])
result.sort_values('Profit', ascending=False)

Since the GA relies on random numbers, the results differ from run to run. The following is one example, showing the best fitness in each generation. Because the elite is preserved, the best fitness only ever improves.

1 -94.9
2 958.2
3 958.2
4 958.2
5 1030.3
6 1030.3
7 1030.3
8 1454.0
9 1550.9
10 1550.9
11 1850.8
12 1850.8
13 1850.8
14 1850.8
15 1850.8
16 1850.8
17 2022.5
18 2076.5
19 2076.5
20 2076.5
:
61 2076.5
62 2076.5
63 2076.5
64 2076.5
65 2076.5
66 2316.2
67 2316.2
68 2316.2
69 2316.2
70 2316.2
:
95 2316.2
96 2316.2
97 2316.2
98 2316.2
99 2316.2
100 2316.2

The individuals that remained in the last generation are as follows.

Slow Fast Exit Profit Trades Average PF MDD RF
0 126 17 107 2316.2 75.0 30.882667 2.306889 387.1 5.983467
18 126 15 107 2316.2 75.0 30.882667 2.306889 387.1 5.983467
8 105 18 106 2210.2 76.0 29.081579 2.247080 387.1 5.709636
17 126 18 108 2130.9 75.0 28.412000 2.158098 424.9 5.015062
10 126 18 107 2078.4 79.0 26.308861 1.980794 448.3 4.636181
9 127 18 107 2074.5 73.0 28.417808 2.184819 371.3 5.587126
6 126 15 7 2030.3 76.0 26.714474 2.007143 415.7 4.884051
16 126 14 107 2024.9 76.0 26.643421 2.100489 424.9 4.765592
5 126 17 107 1954.7 74.0 26.414865 1.917441 448.3 4.360250
13 126 17 105 1878.7 79.0 23.781013 1.888694 414.2 4.535732
2 127 18 107 1872.4 75.0 24.965333 1.878813 448.3 4.176667
12 126 17 101 1869.6 76.0 24.600000 2.063300 420.4 4.447193
11 92 15 107 1859.5 73.0 25.472603 2.006223 358.8 5.182553
14 125 14 108 1843.1 84.0 21.941667 1.811938 473.6 3.891681
4 124 14 107 1839.8 75.0 24.530667 1.975245 420.4 4.376308
3 42 19 107 1796.8 75.0 23.957333 1.912405 410.7 4.374970
1 125 15 106 1614.7 81.0 19.934568 1.711729 386.9 4.173430
19 104 18 107 1583.7 94.0 16.847872 1.654746 393.4 4.025674
7 125 17 106 1421.7 81.0 17.551852 1.629015 574.4 2.475104
15 92 16 107 539.8 103.0 5.240777 1.150513 605.1 0.892084

I haven't worked out the true optimum for this problem, so I can't say for certain, but the result looks reasonably good. After all, the purpose of a GA is not to find the optimal solution but to find a near-optimal one in much less time than brute force.

In any case, finding the optimal parameters for a trading system does not mean it will produce the same results over different periods. So if you can get a decent solution in 2,000 evaluations (20 individuals × 100 generations) out of 2 million combinations, that should be good enough.
