[PYTHON] [1 copy per day] Build a Stock Prediction Program [Daily_Coding_002]

Introduction

- This article is a personal memo by someone teaching themselves Python, machine learning, and related topics.
- The approach is deliberately simple: study by copying out code that looks interesting.
- Constructive comments are welcome (LGTM and stock it if you like it).

Theme: Build a Stock Prediction Program

Today's topic is a YouTube video called **Build a Stock Prediction Program**.

YouTube: Build a Stock Prediction Program

As in the video, the analysis was done in Google Colaboratory.

Let's get started.

Step1: Import library

First, set up the libraries. This time we will use one called quandl, which appears to be a library for fetching stock prices and other data (I didn't know about it ...).

pip install quandl

quandl is not preinstalled in Google Colab, so install it with pip.

Next, import the libraries used this time.

import quandl
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

In addition to the quandl library installed above, this imports numpy and several scikit-learn modules.

Step2: Data acquisition-processing (pre-processing)

Next, get the data. This time we use Facebook's stock price.

df = quandl.get('WIKI/FB')

print(df.head())

The data has now been fetched. Only `Adj. Close` (the adjusted closing price) is used, so overwrite df with just that column.

df = df[['Adj. Close']]
print(df.head())
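The double brackets in the selection above matter. A minimal sketch on a made-up toy table (the `toy` frame and its values are assumptions for illustration, not real Quandl data) shows the difference between selecting with a list of names and with a single name:

```python
import pandas as pd

# Toy stand-in for the fetched data (made-up values, hypothetical columns)
toy = pd.DataFrame({
    "Open": [10.0, 11.0, 12.0],
    "Adj. Close": [10.5, 11.5, 12.5],
})

as_frame = toy[["Adj. Close"]]   # list of column names -> DataFrame (2-D)
as_series = toy["Adj. Close"]    # single column name   -> Series (1-D)

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```

Keeping df as a one-column DataFrame is convenient here because scikit-learn estimators expect a 2-D feature array.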

Shift the `Adj. Close` values by a fixed number of days to create a second column (Prediction), so that each row's Prediction holds the price that many days later. The number of days to shift is stored in a variable.

forecast_out = 30

df['Prediction'] = df['Adj. Close'].shift(-forecast_out)

print(df.tail())

Looking at the end of df, the last forecast_out rows (30 here) of Prediction are NaN, because there is no later price to shift into them.
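To see the effect of a negative shift in isolation, here is a small sketch on a five-row toy column (the `toy` frame and `shift_days` are hypothetical names used only for this illustration):

```python
import pandas as pd

# Five made-up prices; shift_days plays the role of forecast_out
toy = pd.DataFrame({"Adj. Close": [1.0, 2.0, 3.0, 4.0, 5.0]})
shift_days = 2

# A negative shift moves later values up, so each row sees the price
# shift_days rows ahead; the last shift_days rows have nothing to receive
toy["Prediction"] = toy["Adj. Close"].shift(-shift_days)
print(toy)
```

The first three rows pair 1.0 with 3.0, 2.0 with 4.0, and 3.0 with 5.0, while the last two rows of Prediction are NaN.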

Next, create the feature data (X) from df by dropping the Prediction column. Only the rows that have a Prediction value are used, so the last rows (30 this time), where Prediction is NaN, are excluded. The model uses X to predict the price 30 days ahead.

X = np.array(df.drop(columns=['Prediction']))

X = X[:-forecast_out]
print(X)

Next, create the target data (y). The method is the same as for X, this time starting from the Prediction column.

y = np.array(df['Prediction'])

y = y[:-forecast_out]
print(y)
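As a check on how X and y line up, the same steps can be sketched on a tiny made-up series (`toy`, `shift_days`, `X_toy`, and `y_toy` are hypothetical names for this illustration):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"Adj. Close": [1.0, 2.0, 3.0, 4.0, 5.0]})
shift_days = 2  # plays the role of forecast_out
toy["Prediction"] = toy["Adj. Close"].shift(-shift_days)

# Features: today's price, without the rows that have no future price
X_toy = np.array(toy.drop(columns=["Prediction"]))[:-shift_days]
# Targets: the price shift_days later, with the trailing NaN rows removed
y_toy = np.array(toy["Prediction"])[:-shift_days]

print(X_toy.shape, y_toy.shape)
for price, target in zip(X_toy[:, 0], y_toy):
    print(price, "->", target)
```

X stays 2-D (one column) and y is 1-D, which is the layout scikit-learn's `fit` expects.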

The data is now processed. Next, we will move on to analysis using scikit-learn.

Step3: Prediction using Sklearn

Split the features (X) and targets (y) into training and test sets with sklearn's train_test_split.

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
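A small sketch of what this split does, on made-up arrays (all names below are hypothetical). By default `train_test_split` shuffles the rows before splitting; passing `shuffle=False` keeps the original order, which may be worth considering for time-ordered data like prices, although the code above does not do so:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up samples in time order
X_toy = np.arange(10, dtype=float).reshape(-1, 1)
y_toy = np.arange(10, dtype=float) + 100

# Default behaviour: rows are shuffled, then 20% are held out for testing
a_train, a_test, b_train, b_test = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=0
)
print(a_train.shape, a_test.shape)

# shuffle=False: the last 20% of rows, in order, become the test set
c_train, c_test, d_train, d_test = train_test_split(
    X_toy, y_toy, test_size=0.2, shuffle=False
)
print(c_test.ravel(), d_test)
```

With ten rows and `test_size=0.2`, eight rows go to training and two to testing in both cases; only which rows are held out differs.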

This time, we will use support vector regression (SVR) and linear regression for prediction.

# SVR with an RBF kernel (nonlinear regression)
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_rbf.fit(x_train, y_train)

# score() returns R^2 (the coefficient of determination) on the test set
svm_confidence = svr_rbf.score(x_test, y_test)
print('svm confidence:', svm_confidence)

# Linear regression
lr = LinearRegression()
lr.fit(x_train, y_train)

lr_confidence = lr.score(x_test, y_test)
print('lr confidence:', lr_confidence)
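The values printed as "confidence" come from `score`, which for scikit-learn regressors is the coefficient of determination R^2 rather than a probability. A minimal sketch on synthetic data (the data and variable names are made up for illustration) compares it with `sklearn.metrics.r2_score`:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: a noisy straight line (made-up, not stock prices)
rng = np.random.default_rng(0)
X_toy = rng.uniform(0, 10, size=(50, 1))
y_toy = 3.0 * X_toy[:, 0] + 2.0 + rng.normal(0, 1.0, size=50)

model = LinearRegression().fit(X_toy, y_toy)

score = model.score(X_toy, y_toy)                 # what the article prints
manual = r2_score(y_toy, model.predict(X_toy))    # same quantity, computed explicitly
print(score, manual)
```

An R^2 of 1.0 is a perfect fit, and the value can be negative when a model does worse than always predicting the mean.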

You have now trained both models on the training data.

Let's use them to make predictions.

# Features for the last forecast_out days, which have no known Prediction
x_forecast = np.array(df.drop(columns=['Prediction']))[-forecast_out:]
print(x_forecast)

lr_prediction = lr.predict(x_forecast)
print(lr_prediction)

svm_prediction = svr_rbf.predict(x_forecast)
print(svm_prediction)
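The whole flow can be sketched end to end on a synthetic series, assuming the forecast is meant for the last `forecast_out` rows (the ones without a known target). The `prices` frame and the other names below are made up for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

forecast_out = 5

# Synthetic prices rising by exactly 1 per row: 100, 101, ..., 129
prices = pd.DataFrame({"Adj. Close": np.linspace(100.0, 129.0, 30)})
prices["Prediction"] = prices["Adj. Close"].shift(-forecast_out)

features = np.array(prices.drop(columns=["Prediction"]))
X_known = features[:-forecast_out]                          # rows with a target
y_known = np.array(prices["Prediction"])[:-forecast_out]    # price 5 rows later
X_future = features[-forecast_out:]                         # rows without a target

model = LinearRegression().fit(X_known, y_known)
future_prices = model.predict(X_future)
print(future_prices)
```

Because this toy series is perfectly linear, the model recovers "price plus 5" and the five forecasts continue the line (130 through 134). Real prices have no such structure, so this only illustrates the mechanics.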

This completes the prediction with the created model.

Conclusion

- As you can tell, the prediction above is meaningless as a forecast. I take the point of this code-copying exercise to be learning how to use sklearn.
- Many papers have been published on forecasting stock and commodity prices, and time series analysis is a deep subject, so I would like to keep learning.

That's all.

(Learning so far)

[1 copy per day] Predict employee attrition [Daily_Coding_001]
