This article is a summary of VAR based on Gujarati's Basic Econometrics (BE), drawing on BE's Example 17.13 and Section 22.9. Most of the translation comes from Section 22.9; since it may be hard to follow without a copy of BE, I have tried to reproduce Gujarati's exposition as faithfully as possible. Gujarati's econometrics books are used as textbooks at universities and graduate schools in Europe and the United States. BE is one of the most reliable textbooks, with a clear account of what econometrics can and cannot do.
In addition, we will hold a Free Online Study Session (Linear Regression) on June 16, 2020. We hope you will join us.
Regression analysis mostly deals with single-equation models: one dependent variable and one or more explanatory variables. Such models focus on predicting the (mean) value of Y. If there is a cause-and-effect relationship, it runs from X to Y.

In many situations, however, such a one-way relationship is not meaningful. Y may be determined by X while X is simultaneously determined by Y; sometimes X and Y affect each other at the same time. In that case the distinction between dependent and explanatory variables becomes questionable. In a simultaneous-equation model, the variables are determined jointly as a set. Such a model contains one or more equations; the mutually dependent variables are called endogenous variables and are random variables, while variables that are not truly stochastic are called exogenous or predetermined variables. BE treats this material in detail in chapters 18 to 20: the simultaneous-equation model (18), the identification problem (19), and methods for estimating simultaneous equations (20).

Consider the quantity (Q) and price (P) of a commodity. The price of a commodity and the quantity traded are determined by the intersection of its supply and demand curves. We therefore represent these curves linearly, adding noise, to model the interaction.
Demand function: $Q_t^d = \alpha_0 + \alpha_1 P_t + u_{1t}$
Supply function: $Q_t^s = \beta_0 + \beta_1 P_t + u_{2t}$
Equilibrium condition: $Q_t^d = Q_t^s$
where $t$ denotes time and $\alpha$, $\beta$ are the parameters.
Both the demand function ($Q_t^d$) and the supply function ($Q_t^s$) contain the price $P_t$, so price and quantity are determined jointly: $P$ and $Q$ are both endogenous variables.
In simultaneous-equation (structural) models, some variables are treated as endogenous and some as exogenous or predetermined (exogenous variables plus lagged endogenous variables). Before estimating such a model, we must check whether each equation in the system is (exactly or over-) identified. Identification is often achieved by assuming that some of the predetermined variables appear only in certain equations.
This decision is often subjective and was severely criticized by Christopher Sims. According to Sims, if there is true simultaneity among a set of variables, they should all be treated on an equal footing; there should be no a priori distinction between endogenous and exogenous variables. It was on this idea that Sims built the VAR model.
BE's equations (17.14.1) and (17.14.2) describe the current value of GDP in terms of past values of the money supply and past values of GDP, and the current value of the money supply in terms of past values of the money supply and past values of GDP. There are no exogenous variables in this system.
Now let us examine the nature of the causal relationship between Canada's money supply and interest rate. The money supply equation contains past values of the money supply and the interest rate, and the interest rate equation contains past values of the interest rate and the money supply. Both are examples of vector autoregressive models. The term autoregressive refers to the use of past, or lagged, values of the dependent variable on the right-hand side; the term vector refers to the fact that we are dealing with a vector of two (or more) variables.
Using six lagged values of each variable as regressors, we cannot reject (as we will see later) the hypothesis that there is bilateral causality between the money supply (M1) and the interest rate (the 90-day corporate rate, R): M1 affects R and R affects M1. Such a situation is well suited to VAR.
To illustrate how a VAR is estimated, suppose each equation contains k lags of M (measured by M1) and of R. In this case, each of the following equations can be estimated by OLS:
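Written out (the equation numbers follow BE; the coefficient symbols are reconstructed from Section 22.9's setup):

$$M1_t = \alpha + \sum_{j=1}^{k} \beta_j M1_{t-j} + \sum_{j=1}^{k} \gamma_j R_{t-j} + u_{1t} \quad (22.9.1)$$

$$R_t = \alpha' + \sum_{j=1}^{k} \theta_j M1_{t-j} + \sum_{j=1}^{k} \lambda_j R_{t-j} + u_{2t} \quad (22.9.2)$$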
where the u's are stochastic error terms, called impulses, innovations, or shocks in VAR terminology.
Before estimating (22.9.1) and (22.9.2), we must decide on the maximum lag length k. This is determined empirically. The data are 40 observations from 1979.I to 1988.IV. Including many lagged values in each equation consumes degrees of freedom and raises the possibility of multicollinearity, while too few lags leads to specification errors. One way to deal with this problem is to use an information criterion such as Akaike's or Schwarz's and choose the model with the lowest value. Some trial and error is inevitable.
The following data are copied from Table 17.5 of BE.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# quarterly dates, 1979.I-1988.IV (40 observations)
date=pd.date_range(start='1979/1/31',end='1988/12/31',freq='Q')
M1=[22175,22841,23461,23427,23811,23612.33,24543,25638.66,25316,25501.33,25382.33,24753,
25094.33,25253.66,24936.66,25553,26755.33,27412,28403.33,28402.33,28715.66,28996.33,
28479.33,28669,29018.66,29398.66,30203.66,31059.33,30745.33,30477.66,31563.66,32800.66,
33958.33,35795.66,35878.66,36336,36480.33,37108.66,38423,38480.66]
R=[11.13333,11.16667,11.8,14.18333,14.38333,12.98333,10.71667,14.53333,17.13333,18.56667,
21.01666,16.61665,15.35,16.04999,14.31667,10.88333,9.61667,9.31667,9.33333,9.55,10.08333,
11.45,12.45,10.76667,10.51667,9.66667,9.03333,9.01667,11.03333,8.73333,8.46667,8.4,7.25,
8.30,9.30,8.7,8.61667,9.13333,10.05,10.83333]
M1=(np.array(M1)).reshape(40,1)
R=(np.array(R)).reshape(40,1)
# M1 in the first column, R in the second (a list, not a set, keeps the column order fixed)
ts=np.concatenate([M1,R],axis=1)
tsd=pd.DataFrame(ts,index=date,columns=['M1','R'])
# reversed column order, used later for the reverse Granger causality test
ts_r=np.concatenate([R,M1],axis=1)
tsd_r=pd.DataFrame(ts_r,index=date,columns=['R','M1'])
tsd.M1.plot()
tsd.R.plot()
First, we use four lags (k = 4) of each variable and estimate the parameters of the two equations with statsmodels. The sample runs from 1979.I to 1988.IV, but only 1979.I to 1987.IV is used for estimation; the last four observations are held out to check the fitted VAR's forecast accuracy.
Here we assume that both M1 and R are stationary. Since both equations have the same maximum lag length, OLS can be applied to each regression. Individual estimated coefficients may be statistically insignificant, probably because of multicollinearity among the lags of the same variables, but overall the model is significant by the F test.
model = VAR(tsd.iloc[:-4])
results = model.fit(4)
results.summary()
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Wed, 06, May, 2020
Time: 22:50:28
--------------------------------------------------------------------
No. of Equations: 2.00000 BIC: 14.3927
Nobs: 32.0000 HQIC: 13.8416
Log likelihood: -289.904 FPE: 805670.
AIC: 13.5683 Det(Omega_mle): 490783.
--------------------------------------------------------------------
Results for equation M1
========================================================================
coefficient std. error t-stat prob
------------------------------------------------------------------------
const 2413.827162 1622.647108 1.488 0.137
L1.M1 1.076737 0.201737 5.337 0.000
L1.R -275.029144 57.217394 -4.807 0.000
L2.M1 0.173434 0.314438 0.552 0.581
L2.R 227.174784 95.394759 2.381 0.017
L3.M1 -0.366467 0.346875 -1.056 0.291
L3.R 8.511935 96.917587 0.088 0.930
L4.M1 0.077603 0.207888 0.373 0.709
L4.R -50.199299 64.755384 -0.775 0.438
========================================================================
Results for equation R
========================================================================
coefficient std. error t-stat prob
------------------------------------------------------------------------
const 4.919010 5.424158 0.907 0.364
L1.M1 0.001282 0.000674 1.901 0.057
L1.R 1.139310 0.191265 5.957 0.000
L2.M1 -0.002140 0.001051 -2.036 0.042
L2.R -0.309053 0.318884 -0.969 0.332
L3.M1 0.002176 0.001160 1.877 0.061
L3.R 0.052361 0.323974 0.162 0.872
L4.M1 -0.001479 0.000695 -2.129 0.033
L4.R 0.001076 0.216463 0.005 0.996
========================================================================
Correlation matrix of residuals
M1 R
M1 1.000000 -0.004625
R -0.004625 1.000000
Although the AIC and BIC values differ somewhat, the results are almost the same as in BE. First, look at the M1 regression: lag 1 of M1 and lags 1 and 2 of R are statistically significant (5% level). In the interest rate regression, lags 2 and 4 of M1 (lag 1 is marginal at p = 0.057) and the first lag of R are significant (5% level).
For comparison, the VAR results based on two lags of each endogenous variable are shown below.
results = model.fit(2)
results.summary()
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Wed, 06, May, 2020
Time: 22:50:29
--------------------------------------------------------------------
No. of Equations: 2.00000 BIC: 13.7547
Nobs: 34.0000 HQIC: 13.4589
Log likelihood: -312.686 FPE: 603249.
AIC: 13.3058 Det(Omega_mle): 458485.
--------------------------------------------------------------------
Results for equation M1
========================================================================
coefficient std. error t-stat prob
------------------------------------------------------------------------
const 1451.976201 1185.593527 1.225 0.221
L1.M1 1.037538 0.160483 6.465 0.000
L1.R -234.884748 45.522360 -5.160 0.000
L2.M1 -0.044661 0.155908 -0.286 0.775
L2.R 160.155833 48.528324 3.300 0.001
========================================================================
Results for equation R
========================================================================
coefficient std. error t-stat prob
------------------------------------------------------------------------
const 5.796432 4.338943 1.336 0.182
L1.M1 0.001091 0.000587 1.858 0.063
L1.R 1.069081 0.166599 6.417 0.000
L2.M1 -0.001255 0.000571 -2.199 0.028
L2.R -0.223364 0.177600 -1.258 0.209
========================================================================
Correlation matrix of residuals
M1 R
M1 1.000000 -0.054488
R -0.054488 1.000000
Similarly, although the AIC and BIC values differ somewhat, the results are almost the same as in BE. In the money supply regression, the first lag of the money supply and both lags of the interest rate are statistically significant. In the interest rate regression, the second lag of the money supply and the first lag of the interest rate are significant.
Which model should we choose: four lags or two? The Akaike and Schwarz criteria for the four-lag model are 13.5683 and 14.3927, and the corresponding values for the two-lag model are 13.3058 and 13.7547. The lower the Akaike and Schwarz statistics, the better the model, so the more parsimonious model seems preferable. We therefore select the model with two lags of each endogenous variable.
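statsmodels can also automate this comparison; a minimal sketch, assuming the model object defined above:

# compare information criteria for lag lengths 1..6 and report the minima
order = model.select_order(maxlags=6)
print(order.summary())        # AIC/BIC/FPE/HQIC for each lag length
print(order.selected_orders)  # lag length chosen by each criterion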
We select the model with two lags and use it to forecast the values of M1 and R. The data run from 1979.I to 1988.IV, but the 1988 values were not used to estimate the VAR model. Let us now forecast the value of M1 for 1988.I, the first quarter of 1988. The forecast can be obtained as follows.
# one-step forecast of M1 for 1988.I, built by hand from the estimated coefficients:
# constant + L1/L2 coefficients on M1 times M1(1987.IV), M1(1987.III)
#          + L1/L2 coefficients on R  times  R(1987.IV),  R(1987.III)
mm=results.coefs_exog[0]+results.coefs[0,0,0]*tsd.iloc[-5,0]+results.coefs[1,0,0]*tsd.iloc[-6,0]+\
results.coefs[0,0,1]*tsd.iloc[-5,1]+results.coefs[1,0,1]*tsd.iloc[-6,1]
# forecast, actual 1988.I value, forecast error, relative error
mm,M1[-4],mm-M1[-4],(mm-M1[-4])/M1[-4]
# (array([36995.50488527]),array([36480.33]),array([515.17488527]),array([0.01412199]))
Here the coefficients are taken from the fitted results object (results.coefs and results.coefs_exog), matching those in the summary report.
Using the appropriate values of M and R, the estimated money supply for the first quarter of 1988 is 36995 (millions of Canadian dollars). The actual value of M in 1988.I was 36480.33 (millions of Canadian dollars), so the model overestimated the actual value by about 515 million, roughly 1.4% of the actual M. Of course, these estimates will change with the number of lags in the VAR model.
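The same one-step forecast can be obtained directly from statsmodels; a minimal sketch, using the forecast method of the fitted results (it takes the last k observations of the estimation sample and the number of steps ahead):

# feed the last two observations of the estimation sample (1987.III, 1987.IV)
# and ask for one step ahead, i.e. 1988.I
results.forecast(tsd.iloc[:-4].values[-2:], steps=1)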
If Y is explained by X, and Y changes when past values of X change, X is said to Granger-cause Y. Let us use grangercausalitytests from statsmodels to check for causal relationships. Its arguments are an array with two endogenous variables (columns) and a maximum lag order k. It tests whether the second column Granger-causes the first column; the null hypothesis is that the time series in the second column, x2, does not Granger-cause the series in the first column, x1. Granger causality means that, with past values of x1 included as regressors, past values of x2 have a statistically significant effect on the current value of x1. If the p-value is below the desired significance level, we reject the null hypothesis that x2 does not Granger-cause x1.
from statsmodels.tsa.stattools import grangercausalitytests
grangercausalitytests(tsd, 8)
Granger Causality
number of lags (no zero) 1
ssr based F test: F=15.1025 , p=0.0004 , df_denom=36, df_num=1
ssr based chi2 test: chi2=16.3610 , p=0.0001 , df=1
likelihood ratio test: chi2=13.6622 , p=0.0002 , df=1
parameter F test: F=15.1025 , p=0.0004 , df_denom=36, df_num=1
Granger Causality
number of lags (no zero) 2
ssr based F test: F=12.9265 , p=0.0001 , df_denom=33, df_num=2
ssr based chi2 test: chi2=29.7702 , p=0.0000 , df=2
likelihood ratio test: chi2=21.9844 , p=0.0000 , df=2
parameter F test: F=12.9265 , p=0.0001 , df_denom=33, df_num=2
Granger Causality
number of lags (no zero) 3
ssr based F test: F=7.7294 , p=0.0006 , df_denom=30, df_num=3
ssr based chi2 test: chi2=28.5987 , p=0.0000 , df=3
likelihood ratio test: chi2=21.1876 , p=0.0001 , df=3
parameter F test: F=7.7294 , p=0.0006 , df_denom=30, df_num=3
Granger Causality
number of lags (no zero) 4
ssr based F test: F=5.5933 , p=0.0021 , df_denom=27, df_num=4
ssr based chi2 test: chi2=29.8309 , p=0.0000 , df=4
likelihood ratio test: chi2=21.7285 , p=0.0002 , df=4
parameter F test: F=5.5933 , p=0.0021 , df_denom=27, df_num=4
Granger Causality
number of lags (no zero) 5
ssr based F test: F=4.1186 , p=0.0077 , df_denom=24, df_num=5
ssr based chi2 test: chi2=30.0318 , p=0.0000 , df=5
likelihood ratio test: chi2=21.6835 , p=0.0006 , df=5
parameter F test: F=4.1186 , p=0.0077 , df_denom=24, df_num=5
Granger Causality
number of lags (no zero) 6
ssr based F test: F=3.5163 , p=0.0144 , df_denom=21, df_num=6
ssr based chi2 test: chi2=34.1585 , p=0.0000 , df=6
likelihood ratio test: chi2=23.6462 , p=0.0006 , df=6
parameter F test: F=3.5163 , p=0.0144 , df_denom=21, df_num=6
Granger Causality
number of lags (no zero) 7
ssr based F test: F=2.0586 , p=0.1029 , df_denom=18, df_num=7
ssr based chi2 test: chi2=26.4190 , p=0.0004 , df=7
likelihood ratio test: chi2=19.4075 , p=0.0070 , df=7
parameter F test: F=2.0586 , p=0.1029 , df_denom=18, df_num=7
Granger Causality
number of lags (no zero) 8
ssr based F test: F=1.4037 , p=0.2719 , df_denom=15, df_num=8
ssr based chi2 test: chi2=23.9564 , p=0.0023 , df=8
likelihood ratio test: chi2=17.8828 , p=0.0221 , df=8
parameter F test: F=1.4037 , p=0.2719 , df_denom=15, df_num=8
grangercausalitytests performs four tests: params_ftest and ssr_ftest use the F distribution, while ssr_chi2test and lrtest use the chi-square distribution. By the F tests, R Granger-causes M1 at lag lengths 1 through 6; at lags 7 and 8 the F tests no longer reject the null of no causality (although the chi-square versions still do).
Next, let's look at the reverse relationship.
grangercausalitytests(tsd_r, 8)
Granger Causality
number of lags (no zero) 1
ssr based F test: F=0.2688 , p=0.6073 , df_denom=36, df_num=1
ssr based chi2 test: chi2=0.2912 , p=0.5894 , df=1
likelihood ratio test: chi2=0.2902 , p=0.5901 , df=1
parameter F test: F=0.2688 , p=0.6073 , df_denom=36, df_num=1
Granger Causality
number of lags (no zero) 2
ssr based F test: F=3.2234 , p=0.0526 , df_denom=33, df_num=2
ssr based chi2 test: chi2=7.4237 , p=0.0244 , df=2
likelihood ratio test: chi2=6.7810 , p=0.0337 , df=2
parameter F test: F=3.2234 , p=0.0526 , df_denom=33, df_num=2
Granger Causality
number of lags (no zero) 3
ssr based F test: F=2.7255 , p=0.0616 , df_denom=30, df_num=3
ssr based chi2 test: chi2=10.0844 , p=0.0179 , df=3
likelihood ratio test: chi2=8.9179 , p=0.0304 , df=3
parameter F test: F=2.7255 , p=0.0616 , df_denom=30, df_num=3
Granger Causality
number of lags (no zero) 4
ssr based F test: F=2.4510 , p=0.0702 , df_denom=27, df_num=4
ssr based chi2 test: chi2=13.0719 , p=0.0109 , df=4
likelihood ratio test: chi2=11.1516 , p=0.0249 , df=4
parameter F test: F=2.4510 , p=0.0702 , df_denom=27, df_num=4
Granger Causality
number of lags (no zero) 5
ssr based F test: F=1.8858 , p=0.1343 , df_denom=24, df_num=5
ssr based chi2 test: chi2=13.7504 , p=0.0173 , df=5
likelihood ratio test: chi2=11.5978 , p=0.0407 , df=5
parameter F test: F=1.8858 , p=0.1343 , df_denom=24, df_num=5
Granger Causality
number of lags (no zero) 6
ssr based F test: F=2.7136 , p=0.0413 , df_denom=21, df_num=6
ssr based chi2 test: chi2=26.3608 , p=0.0002 , df=6
likelihood ratio test: chi2=19.5153 , p=0.0034 , df=6
parameter F test: F=2.7136 , p=0.0413 , df_denom=21, df_num=6
Granger Causality
number of lags (no zero) 7
ssr based F test: F=2.8214 , p=0.0360 , df_denom=18, df_num=7
ssr based chi2 test: chi2=36.2076 , p=0.0000 , df=7
likelihood ratio test: chi2=24.4399 , p=0.0010 , df=7
parameter F test: F=2.8214 , p=0.0360 , df_denom=18, df_num=7
Granger Causality
number of lags (no zero) 8
ssr based F test: F=1.6285 , p=0.1979 , df_denom=15, df_num=8
ssr based chi2 test: chi2=27.7934 , p=0.0005 , df=8
likelihood ratio test: chi2=20.0051 , p=0.0103 , df=8
parameter F test: F=1.6285 , p=0.1979 , df_denom=15, df_num=8
Here, in the reverse direction (does M1 Granger-cause R?), the F test rejects the null hypothesis only at lags 6 and 7 (5% level).
The results vary with the lag length. One implication of the Granger representation theorem is this: if two variables $X_t$ and $Y_t$ are cointegrated and each is individually I(1), that is, individually non-stationary, then either $X_t$ Granger-causes $Y_t$ or $Y_t$ Granger-causes $X_t$.
In this example, if M1 and R are each I(1) and cointegrated, then either M1 Granger-causes R or R Granger-causes M1. This means we should first check whether the two variables are individually I(1), and then whether they are cointegrated. If they are not, the whole question of causality becomes suspect. Looking at M1 and R in practice, it is not clear whether the two variables are cointegrated, which is why the Granger causality results also vary.
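As a rough check of the cointegration question, statsmodels provides an Engle-Granger cointegration test; a sketch, assuming the M1/R DataFrame tsd defined above:

from statsmodels.tsa.stattools import coint
# H0: no cointegration between M1 and R
t_stat, p_value, crit_values = coint(tsd.M1, tsd.R)
print(t_stat, p_value)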
Proponents of VAR emphasize the advantages of this method:
(1) The method is simple: you do not have to decide which variables are endogenous and which are exogenous, because all the variables in a VAR are endogenous.
(2) Estimation is simple: the usual OLS method can be applied to each equation separately.
(3) The predictions obtained by this method are often superior to those obtained from more complex simultaneous equation models.
However, critics of VAR modeling point out the following issues:
Unlike simultaneous-equation models, a VAR model is a-theoretic, because it uses little prior (theoretical) information. In a simultaneous-equation model, the inclusion or exclusion of certain variables plays an important role in identifying the model.
Because of its emphasis on forecasting, the VAR model is less suited to policy analysis.
The biggest practical challenge in VAR modeling is choosing the appropriate lag length. Suppose you have a three-variable VAR model and decide to include eight lags of each variable in each equation. Each equation then has 24 lag coefficients plus a constant term, 25 parameters in total. Unless the sample size is large, estimating that many parameters consumes many degrees of freedom, with all the problems that entails.
Strictly speaking, in an m-variable VAR model, all m variables should be (jointly) stationary. If they are not, the data must be transformed appropriately (for example, by first-differencing). As Harvey points out, the results from transformed data can be unsatisfactory. He further states: "The usual approach adopted by VAR supporters is thus to work in levels, even if some of these time series are non-stationary. In this case, the effect of unit roots on the distribution of estimators is important." To make matters worse, if the model contains a mixture of I(0) and I(1) variables, that is, stationary and non-stationary variables, transforming the data is not easy.
Because the individual coefficients of an estimated VAR model are often hard to interpret, practitioners of this technique often estimate the so-called impulse response function (IRF). The IRF traces out the response of the dependent variables in the VAR system to shocks in the error terms, such as u1 and u2 in (22.9.1) and (22.9.2). Suppose u1 in the M1 equation increases by one standard deviation. Such a shock changes M1 now and in future periods. But since M1 appears in the R regression, the change in u1 also affects R; similarly, a one-standard-deviation shock to u2 in the R equation affects M1. The IRF traces out the effects of such shocks several periods into the future. Although researchers have questioned the usefulness of IRF analysis, it is a centerpiece of VAR analysis.
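In statsmodels, the IRF can be computed directly from the fitted results object; a sketch:

# impulse responses up to 10 quarters ahead
irf = results.irf(10)
irf.plot()  # response of each variable to a shock in each equation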
This completes the simple translation of BE. What follows is written with reference to "Vector autoregression".
A VAR(p) model has the form

$$y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + u_t$$

where $y_{t-i}$ is the i-th lag of $y$, $c$ is a constant vector of dimension k, each $A_i$ is a time-invariant k×k matrix, and $u_t$ is a k-dimensional vector of error terms.
All the variables must be integrated of the same order.
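To make the definition concrete, here is a minimal sketch that simulates a stationary bivariate VAR(1); the constant vector and coefficient matrix are made-up illustrative values:

import numpy as np

# y_t = c + A1 @ y_{t-1} + u_t  (k = 2, p = 1)
rng = np.random.default_rng(0)
c = np.array([0.1, 0.2])
A1 = np.array([[0.5, 0.1],
               [0.2, 0.4]])  # eigenvalues 0.6 and 0.3, inside the unit circle -> stationary
y = np.zeros((200, 2))
for t in range(1, 200):
    y[t] = c + A1 @ y[t-1] + rng.normal(scale=0.1, size=2)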
Let us get longer-run data from FRED and analyze it from a long-term perspective. For the Canadian money supply we use MANMM101CAM189S, and for the interest rate IR3TCP01CAM156N.
start="1979/1"
end="2020/12"
M1_0 = web.DataReader("MANMM101CAM189S", 'fred',start,end)/1000000
R1_0 = web.DataReader("IR3TCP01CAM156N", 'fred',start,end)#IR3TIB01CAM156N
M1=M1_0.resample('Q').last()
R1=R1_0.resample('Q').last()
M1.plot()
R.plot()
Examine the stationarity.
from statsmodels.tsa.stattools import adfuller
tsd=pd.concat([M1,R1],axis=1)
tsd.columns=['M1','R']
index=['ADF Test Statistic','P-Value','# Lags Used','# Observations Used']
# ADF test on the level of M1, with no deterministic terms (regression='nc')
adfTest = adfuller(tsd.M1, autolag='AIC',regression='nc')
dfResults = pd.Series(adfTest[0:4], index)
print('Augmented Dickey-Fuller Test Results:')
print(dfResults)
Augmented Dickey-Fuller Test Results:
ADF Test Statistic -1.117517
P-Value 0.981654
# Lags Used 5.000000
# Observations Used 159.000000
dtype: float64
Not surprisingly, the unit-root null cannot be rejected: M1 behaves like a random walk. The result is the same with regression set to 'c', 'ct', or 'ctt'.
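The claim about the deterministic-term options can be verified with a simple loop; a sketch:

# compare ADF p-values for M1 across deterministic specifications
for reg in ['nc', 'c', 'ct', 'ctt']:
    stat, pval = adfuller(tsd.M1, autolag='AIC', regression=reg)[:2]
    print(reg, round(pval, 4))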
adfTest = adfuller((tsd.R), autolag='AIC',regression='nc')
dfResults = pd.Series(adfTest[0:4], index)
print('Augmented Dickey-Fuller Test Results:')
print(dfResults)
Augmented Dickey-Fuller Test Results:
ADF Test Statistic -4.082977
P-Value 0.006679
# Lags Used 3.000000
# Observations Used 161.000000
dtype: float64
R, by contrast, is a stationary process in levels. Again, the result is the same with regression set to 'c', 'ct', or 'ctt'.
So I took the logarithm of M1.
adfTest = adfuller((np.log(tsd.M1)), autolag='AIC',regression='ct')
dfResults = pd.Series(adfTest[0:4], index)
print('Augmented Dickey-Fuller Test Results:')
print(dfResults)
Augmented Dickey-Fuller Test Results:
ADF Test Statistic -3.838973
P-Value 0.014689
# Lags Used 14.000000
# Observations Used 150.000000
dtype: float64
The logarithm of M1 appears to be trend-stationary (the ADF test with regression='ct' rejects the unit root, p = 0.015).
Let's remove the trend.
# remove a linear time trend from log(M1) by connecting the endpoints of the series
gap=np.linspace(np.log(M1.iloc[0]), np.log(M1.iloc[-1]), len(M1))
lnM1=np.log(M1)
lnM1.plot()
alnM1=lnM1.copy()
alnM1['a']=gap                   # the linear trend
alnM1=alnM1.iloc[:,0]-alnM1.a    # detrended log M1
alnM1.plot()
adfTest = adfuller(alnM1, autolag='AIC',regression='nc')
dfResults = pd.Series(adfTest[0:4], index)
print('Augmented Dickey-Fuller Test Results:')
print(dfResults)
Augmented Dickey-Fuller Test Results:
ADF Test Statistic -1.901991
P-Value 0.054542
# Lags Used 14.000000
# Observations Used 150.000000
dtype: float64
With the trend removed, lnM1 is borderline stationary: the ADF test rejects the unit root at the 10% level (p = 0.055), though not quite at 5%.
First, let us repeat the BE-style analysis on this data.
tsd0=pd.concat([alnM1,R1],axis=1)
tsd0.columns=['alnM1','R']
tsd=pd.concat([lnM1,R1],axis=1)
tsd.columns=['lnM1','R']
model = VAR(tsd.iloc[:36])
results = model.fit(4)
results.summary()
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Thu, 07, May, 2020
Time: 11:57:17
--------------------------------------------------------------------
No. of Equations: 2.00000 BIC: -5.33880
Nobs: 32.0000 HQIC: -5.88999
Log likelihood: 25.8004 FPE: 0.00217196
AIC: -6.16328 Det(Omega_mle): 0.00132308
--------------------------------------------------------------------
Results for equation lnM1
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.358173 0.225376 1.589 0.112
L1.lnM1 1.286462 0.194312 6.621 0.000
L1.R -0.005751 0.001961 -2.933 0.003
L2.lnM1 0.025075 0.298562 0.084 0.933
L2.R 0.001647 0.002730 0.604 0.546
L3.lnM1 -0.278622 0.295859 -0.942 0.346
L3.R 0.006311 0.002814 2.243 0.025
L4.lnM1 -0.062508 0.195688 -0.319 0.749
L4.R -0.004164 0.002222 -1.875 0.061
==========================================================================
Results for equation R
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 38.199790 21.797843 1.752 0.080
L1.lnM1 -15.488358 18.793423 -0.824 0.410
L1.R 0.875018 0.189630 4.614 0.000
L2.lnM1 7.660621 28.876316 0.265 0.791
L2.R -0.345128 0.263996 -1.307 0.191
L3.lnM1 35.719033 28.614886 1.248 0.212
L3.R 0.310248 0.272203 1.140 0.254
L4.lnM1 -31.044707 18.926570 -1.640 0.101
L4.R -0.162658 0.214871 -0.757 0.449
==========================================================================
Correlation matrix of residuals
lnM1 R
lnM1 1.000000 -0.135924
R -0.135924 1.000000
Next, use the data with the trend removed.
model = VAR(tsd0.iloc[:36])
results = model.fit(4)
results.summary()
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Thu, 07, May, 2020
Time: 10:50:42
--------------------------------------------------------------------
No. of Equations: 2.00000 BIC: -5.31179
Nobs: 32.0000 HQIC: -5.86298
Log likelihood: 25.3682 FPE: 0.00223143
AIC: -6.13627 Det(Omega_mle): 0.00135930
--------------------------------------------------------------------
Results for equation alnM1
===========================================================================
coefficient std. error t-stat prob
---------------------------------------------------------------------------
const 0.031290 0.024819 1.261 0.207
L1.alnM1 1.237658 0.189124 6.544 0.000
L1.R -0.005209 0.001840 -2.831 0.005
L2.alnM1 0.035479 0.288928 0.123 0.902
L2.R 0.001341 0.002650 0.506 0.613
L3.alnM1 -0.267898 0.285970 -0.937 0.349
L3.R 0.006273 0.002722 2.304 0.021
L4.alnM1 -0.092060 0.190650 -0.483 0.629
L4.R -0.004456 0.002161 -2.062 0.039
===========================================================================
Results for equation R
===========================================================================
coefficient std. error t-stat prob
---------------------------------------------------------------------------
const 2.626115 2.588966 1.014 0.310
L1.alnM1 -18.059084 19.728553 -0.915 0.360
L1.R 0.945671 0.191924 4.927 0.000
L2.alnM1 7.182544 30.139598 0.238 0.812
L2.R -0.342745 0.276454 -1.240 0.215
L3.alnM1 37.385646 29.831061 1.253 0.210
L3.R 0.319531 0.283972 1.125 0.260
L4.alnM1 -30.462525 19.887663 -1.532 0.126
L4.R -0.141785 0.225455 -0.629 0.529
===========================================================================
Correlation matrix of residuals
alnM1 R
alnM1 1.000000 -0.099908
R -0.099908 1.000000
The results show almost the same characteristics, and the AIC and BIC of the level and detrended specifications are nearly identical (in fact marginally worse for the detrended series by both criteria).
Let's use recent data.
model = VAR(tsd0.iloc[-40:])
results = model.fit(4)
results.summary()
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Thu, 07, May, 2020
Time: 11:06:09
--------------------------------------------------------------------
No. of Equations: 2.00000 BIC: -12.2865
Nobs: 36.0000 HQIC: -12.8019
Log likelihood: 151.245 FPE: 2.13589e-06
AIC: -13.0783 Det(Omega_mle): 1.36697e-06
--------------------------------------------------------------------
Results for equation alnM1
===========================================================================
coefficient std. error t-stat prob
---------------------------------------------------------------------------
const 0.019669 0.012024 1.636 0.102
L1.alnM1 0.706460 0.175974 4.015 0.000
L1.R -0.015862 0.008523 -1.861 0.063
L2.alnM1 -0.046162 0.185186 -0.249 0.803
L2.R -0.020842 0.011837 -1.761 0.078
L3.alnM1 0.568076 0.186205 3.051 0.002
L3.R 0.035471 0.011813 3.003 0.003
L4.alnM1 -0.461882 0.175777 -2.628 0.009
L4.R -0.007579 0.009849 -0.769 0.442
===========================================================================
Results for equation R
===========================================================================
coefficient std. error t-stat prob
---------------------------------------------------------------------------
const -0.053724 0.308494 -0.174 0.862
L1.alnM1 1.054672 4.515026 0.234 0.815
L1.R 0.875299 0.218682 4.003 0.000
L2.alnM1 -5.332917 4.751384 -1.122 0.262
L2.R 0.257259 0.303711 0.847 0.397
L3.alnM1 3.412184 4.777534 0.714 0.475
L3.R -0.263699 0.303088 -0.870 0.384
L4.alnM1 4.872672 4.509976 1.080 0.280
L4.R 0.032439 0.252706 0.128 0.898
===========================================================================
Correlation matrix of residuals
alnM1 R
alnM1 1.000000 -0.168029
R -0.168029 1.000000
For comparison, here is the result for the same recent sample before trend removal, fitted with two lags:
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Thu, 07, May, 2020
Time: 11:07:32
--------------------------------------------------------------------
No. of Equations: 2.00000 BIC: -12.3214
Nobs: 38.0000 HQIC: -12.5991
Log likelihood: 144.456 FPE: 2.90430e-06
AIC: -12.7524 Det(Omega_mle): 2.26815e-06
--------------------------------------------------------------------
Results for equation lnM1
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.079744 0.100134 0.796 0.426
L1.lnM1 0.784308 0.174023 4.507 0.000
L1.R -0.016979 0.009977 -1.702 0.089
L2.lnM1 0.211960 0.174036 1.218 0.223
L2.R 0.012038 0.009846 1.223 0.221
==========================================================================
Results for equation R
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const -1.450824 2.077328 -0.698 0.485
L1.lnM1 0.736725 3.610181 0.204 0.838
L1.R 0.884364 0.206971 4.273 0.000
L2.lnM1 -0.617456 3.610443 -0.171 0.864
L2.R -0.027052 0.204257 -0.132 0.895
==========================================================================
Correlation matrix of residuals
lnM1 R
lnM1 1.000000 -0.260828
R -0.260828 1.000000