[Python] My stock price forecast [HFT]

1. Motivation

2. About board information

Sell quantity (ASK)   Stock price   Buy quantity (BID)
                500           670
                400           669
                600           668
                              667                  300
                              666                1,200
                              665                  400

Sell quantity (ASK)   Stock price   Buy quantity (BID)
                500           670
                400           669
                500           668
                              667                  300
                              666                1,200
                              665                  400

Sell quantity (ASK)   Stock price   Buy quantity (BID)
                500           670
                400           669
                600           668
                              667                  400
                              666                1,200
                              665                  400
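
As a rough illustration of how such a board can be handled in code, the sketch below stores each side as a price-to-quantity dictionary. It is not taken from the original article: the names `best_quotes` and `diff` and the `*_after` variables are hypothetical. It computes the best quotes and mid price of the first board, then shows how the second and third boards differ from it.

```python
# First board above: price -> resting quantity on each side.
asks = {670: 500, 669: 400, 668: 600}
bids = {667: 300, 666: 1200, 665: 400}

# Ask side of the second board and bid side of the third board.
asks_after = {670: 500, 669: 400, 668: 500}
bids_after = {667: 400, 666: 1200, 665: 400}


def best_quotes(asks, bids):
    """Return (best ask, best bid, mid price, spread)."""
    best_ask = min(asks)  # lowest price on the sell side
    best_bid = max(bids)  # highest price on the buy side
    mid = (best_ask + best_bid) / 2
    return best_ask, best_bid, mid, best_ask - best_bid


def diff(before, after):
    """Quantity change per price level, keeping only levels that changed."""
    return {price: after[price] - before[price]
            for price in before if after[price] != before[price]}


print(best_quotes(asks, bids))  # (668, 667, 667.5, 1)
print(diff(asks, asks_after))   # {668: -100}
print(diff(bids, bids_after))   # {667: 100}
```

The second board has 100 fewer shares at the best ask, and the third has 100 more at the best bid. Changes of this kind are what the board-based features are meant to capture.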

3. FI-2010 dataset

What is the FI-2010 dataset?

Data overview

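import pandas as pd
import matplotlib.pyplot as plt

# The file is whitespace-delimited and has no header row.
# sep=r'\s+' replaces delim_whitespace=True, which is deprecated in recent pandas.
data = pd.read_csv('Train_Dst_Auction_DecPre_CF_1.txt', 
                   header=None, sep=r'\s+')
print(data.shape)
#=> (149, 47342)
# Draw the whole matrix as a heat map (rows: features, columns: time steps).
plt.figure(figsize=(20,10))
plt.imshow(data.values, interpolation='nearest', vmin=0, vmax=0.75, 
           cmap='jet', aspect=data.shape[1]/data.shape[0])
plt.colorbar()
plt.grid(False)
plt.show()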

[Figure: heat map of the data matrix drawn by the code above]

# The first 40 rows of a column are one board snapshot:
# 10 levels x (ask price, ask quantity, bid price, bid quantity).
lob = data.iloc[:40,0].values
lob_df = pd.DataFrame(lob.reshape(10,4), 
                      columns=['ask','ask_vol','bid','bid_vol'])
print(lob_df)
      ask  ask_vol     bid  bid_vol
0  0.2631  0.00392  0.2616  0.00663
1  0.2643  0.00028  0.2615  0.00500
2  0.2663  0.00165  0.2614  0.00500
3  0.2664  0.00500  0.2613  0.00043
4  0.2667  0.00039  0.2612  0.00646
5  0.2710  0.00700  0.2611  0.00200
6  0.2745  0.00200  0.2609  0.00199
7  0.2749  0.00487  0.2602  0.00081
8  0.2750  0.00300  0.2600  0.00197
9  0.2769  0.01000  0.2581  0.01321
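
The layout printed above means that, within one snapshot, every group of four consecutive values is one price level. The following sketch is an illustration only: the helper `split_levels` is hypothetical and only the first three levels of the snapshot are copied in. It splits a flat vector into levels and takes the mid price from level 0.

```python
# First three levels of the snapshot printed above, in file order:
# ask price, ask quantity, bid price, bid quantity, repeated per level.
flat = [0.2631, 0.00392, 0.2616, 0.00663,
        0.2643, 0.00028, 0.2615, 0.00500,
        0.2663, 0.00165, 0.2614, 0.00500]

FIELDS = ('ask', 'ask_vol', 'bid', 'bid_vol')


def split_levels(flat):
    """Group a flat snapshot into one dict per price level."""
    return [dict(zip(FIELDS, flat[i:i + 4])) for i in range(0, len(flat), 4)]


levels = split_levels(flat)
best = levels[0]  # level 0 holds the best ask and the best bid
mid = (best['ask'] + best['bid']) / 2
print(len(levels), round(mid, 5))  # 3 0.26235
```

In the flat vector the best ask sits at index 0 and the best bid at index 2, which is why a mid price can be read from rows 0 and 2 of the data matrix.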

4. Model

Training data and labels

Model architecture

5. Implementation

Data preprocessing

import numpy as np
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.layers import Conv2D, Dense, Reshape, Input, LSTM
from keras import Model, backend
import tensorflow as tf

# The board information is in the first 40 rows.
# Columns 29738 to 47294 (end exclusive) are used as the data of the fifth stock.
cols = slice(29738, 47294)
lob = data.iloc[:40, cols].T.values
# Standardize prices and quantities separately: after the reshape,
# column 0 holds every price and column 1 every quantity.
lob = lob.reshape(-1,2)
lob = (lob - lob.mean(axis=0)) / lob.std(axis=0)
lob = lob.reshape(-1,40)
lob_df = pd.DataFrame(lob)
# Mid price from the non-standardized best ask (row 0) and best bid (row 2).
lob_df['mid'] = (data.iloc[0, cols].values + data.iloc[2, cols].values) / 2
# Parameters.
p = 50          # number of consecutive snapshots in one input sample
k = 50          # number of future steps averaged for the label
alpha = 0.0003  # relative change beyond which a move counts as up or down
# Label: relative change from the current mid price to the mean of the next k mid prices.
lob_df['lt'] = (lob_df['mid'].rolling(window=k).mean().shift(-k)-lob_df['mid'])/lob_df['mid']
# The last k rows have no future mean; copy() avoids assigning into a view.
lob_df = lob_df.dropna().copy()
lob_df['label'] = 0
lob_df.loc[lob_df['lt']>alpha, 'label'] = 1
lob_df.loc[lob_df['lt']<-alpha, 'label'] = -1
# Create the training data: sample i is the window of rows i .. i+p-1,
# paired with the label of the window's last row.
X = np.zeros((len(lob_df)-p+1, p, 40, 1))
lob = lob_df.iloc[:,:40].values
for i in range(len(lob_df)-p+1):
    X[i] = lob[i:i+p,:].reshape(p,-1,1)
y = to_categorical(lob_df['label'].iloc[p-1:], 3)
print(X.shape, y.shape)
#=> (17457, 50, 40, 1) (17457, 3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
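
To make the labeling rule easier to check by hand, here is a minimal pure-Python sketch of the same calculation on a toy series. The function name `make_labels` and the toy prices are made up for illustration. It is meant to mirror the `rolling(...).mean().shift(-k)` expression, not to replace it.

```python
def make_labels(mid, k, alpha):
    """Label step t by comparing mid[t] with the mean of the next k mid prices.

    Returns one label per step that has k future prices available:
    1 (up), -1 (down) or 0 (flat).
    """
    labels = []
    for t in range(len(mid) - k):
        future_mean = sum(mid[t + 1:t + 1 + k]) / k
        change = (future_mean - mid[t]) / mid[t]
        if change > alpha:
            labels.append(1)
        elif change < -alpha:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


# Toy mid prices (hypothetical values, not from FI-2010).
mid = [100.0, 100.0, 100.02, 100.2, 100.4, 100.0, 99.6]
print(make_labels(mid, k=2, alpha=0.0003))  # [0, 1, 1, 0, -1]
```

The last `k` steps get no label because their future window is incomplete, which corresponds to the rows removed by `dropna()`.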

Model building

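# tf.reset_default_graph() is a TensorFlow 1.x call; on TensorFlow 2.x,
# remove it and rely on backend.clear_session() alone.
tf.reset_default_graph()
backend.clear_session()

inputs = Input(shape=(p,40,1))
# (p, 40, 1) -> (p, 20, 8): combine price and quantity of each side at each level.
x = Conv2D(8, kernel_size=(1,2), strides=(1,2), activation='relu')(inputs)
# (p, 20, 8) -> (p, 10, 8): combine the ask and bid sides of each level.
x = Conv2D(8, kernel_size=(1,2), strides=(1,2), activation='relu')(x)
# (p, 10, 8) -> (p, 1, 8): combine all 10 levels.
x = Conv2D(8, kernel_size=(1,10), strides=1, activation='relu')(x)
# Drop the width axis so the LSTM sees p time steps with 8 features each.
x = Reshape((p, 8))(x)
x = LSTM(8, activation='relu')(x)
x = Dense(16, activation='relu')(x)
# Three classes: flat, up, down.
outputs = Dense(3, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])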
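
The `Reshape((p, 8))` step only works if the three convolutions reduce the width from 40 to exactly 1. The sketch below traces the shapes by hand, assuming Keras's default `padding='valid'`. The helper `conv_out` is a hypothetical name used only here.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1


p, width = 50, 40
# (kernel_size, strides) of the three Conv2D layers, in order.
layers = [((1, 2), (1, 2)), ((1, 2), (1, 2)), ((1, 10), (1, 1))]

shape = (p, width)
shapes = []
for kernel, stride in layers:
    shape = (conv_out(shape[0], kernel[0], stride[0]),
             conv_out(shape[1], kernel[1], stride[1]))
    shapes.append(shape)

print(shapes)  # [(50, 20), (50, 10), (50, 1)]
```

The time axis stays at `p` throughout because every kernel has height 1, so each snapshot is summarized independently before the LSTM looks across time.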

Training

epochs = 50
batch_size = 256
history = model.fit(X_train, y_train,
                    epochs=epochs,
                    batch_size=batch_size,
                    verbose=1,
                    validation_data=(X_test, y_test))

Epoch 100/100
13965/13965 [==============================] - 5s 326us/step - loss: 0.6526 - acc: 0.6808 - val_loss: 0.6984 - val_acc: 0.6595


6. Consideration
