[Python] Finding the beginning of Abenomics from the NT ratio 1

Pairs trading with the NT ratio

Pairs trading is a well-known strategy that uses two stocks that move similarly. When the two prices diverge, you take a position that profits as the gap closes; hedge funds also use this method.

The method is explained in detail in "Time-series data analysis that can be used immediately in the field", which I read recently; theoretically, cointegration between the two series seems to be a prerequisite for pair trading to work. My first thought was to run a back test to see whether it could turn a profit. In Japan, the NT ratio is a popular subject for pair trading: as some of you may know, the ratio of the two indexes Nikkei225 and Topix is called the NT ratio (= Nikkei225 / Topix). One could question whether a combination of indexes is "theoretically" suitable, but I decided to examine it as a popular approach. (The cointegration prerequisite itself can be tested directly, as sketched below.)
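As a side note, statsmodels ships an Engle-Granger cointegration test (statsmodels.tsa.stattools.coint), so that prerequisite can be checked in a few lines. A minimal sketch on synthetic data; the toy series below are my own and not from this analysis:

import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import coint

# Two toy series sharing a common random-walk component,
# hence cointegrated by construction
rng = np.random.default_rng(0)
common = np.cumsum(rng.normal(size=1000))           # shared stochastic trend
series_a = pd.Series(common + rng.normal(size=1000))
series_b = pd.Series(0.5 * common + rng.normal(size=1000))

t_stat, p_value, crit = coint(series_a, series_b)   # Engle-Granger two-step test
print(t_stat, p_value)   # a small p-value supports cointegration

In practice you would pass the two index series instead of the toy data.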

While investigating, I noticed a large change in the trend, which led me to the question in the title: finding the beginning of Abenomics.

Historical Chart and Scatter Plot

First, let's observe the time series data.

(Figure: TL_NT(raw).png — historical chart of Nikkei225 and Topix)

Both Topix and Nikkei225 are stock indexes calculated from stock prices on the First Section of the Tokyo Stock Exchange, so their movements are very similar. The y-axis scales in the chart differ (Nikkei225 on the left, Topix on the right), yet the two lines almost overlap from 2005 to 2009. You can also see that since around 2010 the Nikkei225 has drifted above the Topix line.
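For reference, a dual-axis chart like this can be drawn with matplotlib's twinx(); a minimal sketch, assuming the data is already in the mypair DataFrame introduced later:

import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(figsize=(8, 5))
ax1.plot(mypair.index, mypair['n225'], 'b-')
ax1.set_ylabel('Nikkei225 (left)')
ax2 = ax1.twinx()                  # second y-axis on the right
ax2.plot(mypair.index, mypair['topix'], 'g-')
ax2.set_ylabel('Topix (right)')
plt.show()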

Next, let's look at the scatter plot.

(Figure: Scatter_NT_01.png — scatter plot of Topix vs. Nikkei225)

In this chart, the points cluster around two straight lines. Considered together with the historical chart above, it can be inferred that the line with the smaller slope was the NT trend at first, and that over time it shifted to the line with the larger slope. Based on this idea, I then performed a regression analysis. Incidentally, the historical chart of the NT ratio itself is as follows.

(Figure: TL_NT(ratio).png — historical chart of the NT ratio)

Regression analysis with StatsModels

This time I performed a linear regression analysis, using the Python module **StatsModels**. From the (Topix, Nikkei225) time series, data was extracted for two time intervals and a regression was run on each. The intervals are 2005/B–2008/E and 2012/B–2014/E (B = beginning of year, E = end of year). The former is the period before the "Lehman shock" (the 2008 financial crisis) occurred, and the latter is the era of so-called "Abenomics".

import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
# ... pre-process ...

# Regression Analysis
# Period-1: 2005/B .. 2008/E
index_s1 = pd.date_range(start='2005/1/1', end='2008/12/31', freq='B')
x1 = pd.DataFrame(index=index_s1); y1 = pd.DataFrame(index=index_s1)

x1['topix'] = mypair['topix']   # values aligned on the business-day index
y1['n225'] = mypair['n225']

x1 = x1.dropna(); y1 = y1.loc[x1.index]   # drop dates with no quotes (holidays)
x1 = sm.add_constant(x1)
model1 = sm.OLS(y1, x1)
mytrend1 = model1.fit()
print(mytrend1.summary())

# Period-2: 2012/B .. 2014/E
index_s2 = pd.date_range(start='2012/1/1', end='2014/12/31', freq='B')
x2 = pd.DataFrame(index=index_s2); y2 = pd.DataFrame(index=index_s2)

x2['topix'] = mypair['topix']
y2['n225'] = mypair['n225']

x2 = x2.dropna(); y2 = y2.loc[x2.index]
x2 = sm.add_constant(x2)
model2 = sm.OLS(y2, x2)
mytrend2 = model2.fit()
print(mytrend2.summary())

# Plot the fitted lines over the scatter
plt.figure(figsize=(8, 5))
plt.plot(mypair['topix'], mypair['n225'], '.')
plt.plot(x1['topix'], mytrend1.fittedvalues, 'r-', lw=1.5, label='2005-2008E')
plt.plot(x2['topix'], mytrend2.fittedvalues, 'g-', lw=1.5, label='2012-2014E')
plt.grid(True)
plt.legend(loc=0)
plt.show()

The script above is run after the two series have been loaded into a pandas DataFrame called mypair with columns ['topix', 'n225'].
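That pre-processing step is not shown here; below is a minimal sketch of one way to build mypair, where the file names and the 'Date'/'Close' column names are my own assumptions:

import pandas as pd

# Hypothetical CSV files with 'Date' and 'Close' columns
topix = pd.read_csv('topix.csv', index_col='Date', parse_dates=True)
n225 = pd.read_csv('nikkei225.csv', index_col='Date', parse_dates=True)

# Keep only dates where both indexes have a quote
mypair = pd.DataFrame({'topix': topix['Close'],
                       'n225': n225['Close']}).dropna()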

(Figure: Scatter_NT_02.png — scatter plot with the two fitted regression lines)

The two straight lines fit neatly. As for the evaluation of the regression, the numbers exceed expectations. (Both regressions give good results; only one summary is shown.)

>>> print(mytrend1.summary())
        OLS Regression Results (Early period, 2005-2008E)
==============================================================================
Dep. Variable:                   n225   R-squared:                       0.981
Model:                            OLS   Adj. R-squared:                  0.981
Method:                 Least Squares   F-statistic:                 5.454e+04
Date:                Sun, 21 Jun 2015   Prob (F-statistic):               0.00
Time:                        17:11:26   Log-Likelihood:                -7586.3
No. Observations:                1042   AIC:                         1.518e+04
Df Residuals:                    1040   BIC:                         1.519e+04
Df Model:                           1
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const        -45.0709     62.890     -0.717      0.474      -168.477    78.335
topix         10.0678      0.043    233.545      0.000         9.983    10.152
==============================================================================
Omnibus:                       71.316   Durbin-Watson:                   0.018
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               30.237
Skew:                          -0.189   Prob(JB):                     2.72e-07
Kurtosis:                       2.256   Cond. No.                     8.42e+03
==============================================================================
Warnings:
[1] The condition number is large, 8.42e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

The coefficient of determination R-squared is 0.981, the F-statistic 5.454e+04 with a p-value of 0.00 (!). As shown above, the program also emits a warning that the condition number is too large. (I haven't pursued this warning deeply this time... advice is welcome.)
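For what it's worth, my guess is that the large condition number simply reflects the scale difference between the constant column (all ones) and the Topix values (on the order of 1e3), not genuine multicollinearity; standardizing the regressor should shrink it without changing the quality of the fit. A sketch, continuing from the variables above:

# Standardize the regressor; slope and intercept can be mapped back afterwards
z1 = (x1[['topix']] - x1['topix'].mean()) / x1['topix'].std()
z1 = sm.add_constant(z1)
ztrend1 = sm.OLS(y1, z1).fit()
print(ztrend1.condition_number)   # should drop to around 1
print(ztrend1.rsquared)           # R-squared is unchanged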

The equations of the two trend lines obtained from this regression (coef) are as follows.

Period-1 (2005-2008): n225 = topix × 10.06 - **45.07**
Period-2 (2012-2014): n225 = topix × 12.81 - 773.24

It can be said that the NT ratio shifted from **10.06** to **12.81**. One question remains, though: why is this large change not visible in the historical chart of the NT ratio?

(Figure: TL_NT(ratio)_2.png — NT ratio with the intercept-adjusted series 'ntr2' shown as the green line)

Since the intercept also changed from Period-1 to Period-2, I additionally plotted 'ntr2' (the green line), which removes the effect of the intercept; but even then the NT ratio does not appear to jump in a step from 10.06 to 12.81.
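The exact definition of 'ntr2' is not given in the text; one plausible reconstruction, assuming it cancels the Period-2 intercept (-773.24 above) before taking the ratio, is:

import matplotlib.pyplot as plt

# Plain NT ratio vs. an intercept-adjusted ratio
# (a reconstruction, not necessarily the exact definition used for the chart)
ntr = mypair['n225'] / mypair['topix']
ntr2 = (mypair['n225'] + 773.24) / mypair['topix']   # cancel Period-2's intercept

plt.figure(figsize=(8, 5))
plt.plot(ntr, label='ntr')
plt.plot(ntr2, 'g-', label='ntr2')
plt.grid(True)
plt.legend(loc=0)
plt.show()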

The conclusion from the analysis so far is that **"Abenomics began sometime between 2009 and early 2012"**. It is not a clean result, but (from this result) it is natural to think that this market trend had already begun by December 2012, when Prime Minister Abe took office for his second term.

(Since the conclusion is not clear-cut, I next tried classification with scikit-learn; see article "2".)

References

-"Time-series data analysis that can be used immediately in the field" (Daisuke Yokouchi (Author), Yoshimitsu Aoki (Author), Gijutsu-Hyoronsha) http://gihyo.jp/book/2014/978-4-7741-6301-7
