Guided tour on Johansen's cointegration analysis

What is cointegration?

Cointegration is the phenomenon that each component Yi,t, i = 1,...,k, of a vector time series process Yt is a unit root process, possibly with drift, but certain linear combinations of the Yi,t's are stationary. Thus

Yt = Yt-1 + m + Vt,

where Vt is a zero-mean k-variate stationary time series process and m is a k-vector of drift parameters, but there exists a k × r matrix b with rank r < k such that b'Yt is (trend) stationary.

To show that this is possible, let us assume that Vt can be written as an infinite-order vector moving average process:

Vt = C(L)et,

where et is i.i.d. k-variate white noise with unit variance matrix and C(L) is a matrix-valued lag polynomial:

C(L) = C0 + C1 L + C2 L^2 + C3 L^3 + ...,

with the Cj's k × k coefficient matrices and L the backshift lag operator (i.e., L et = et-1). Now write C(L) as

C(L) = C(1) + [C(L) - C(1)] = C(1) + (1 - L)D(L),

where

D(L) = [C(L) - C(1)]/(1 - L).

This is always possible because C(L) - C(1) is a zero matrix for L = 1, hence each element of this lag polynomial matrix has root 1 and thus these elements have a common factor 1 - L. Next, denote

Wt = D(L)et.

Under some regularity conditions on the matrices Cj the Wt's are stationary.
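For a lag polynomial of finite order q the decomposition above can be written out explicitly: D(L) = D0 + D1 L + ... + D(q-1) L^(q-1) with Dj = -(C(j+1) + ... + Cq). The following minimal sketch checks this for a second-order 2 × 2 example. The coefficient matrices are made-up illustrative values, and the helper names are my own, not part of EasyReg.

```python
# Check C(L) = C(1) + (1 - L) D(L) for a finite-order 2x2 lag polynomial.
# The coefficient matrices below are hypothetical illustrative values.

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

def mat_neg(a):
    return [[-x for x in row] for row in a]

def decompose(C):
    """Return (C(1), [D_0, ..., D_{q-1}]) with D_j = -(C_{j+1} + ... + C_q)."""
    q = len(C) - 1
    c_one = C[0]
    for Cj in C[1:]:
        c_one = mat_add(c_one, Cj)
    D = []
    for j in range(q):
        tail = C[j + 1]
        for Ci in C[j + 2:]:
            tail = mat_add(tail, Ci)
        D.append(mat_neg(tail))
    return c_one, D

def recompose(c_one, D):
    """Coefficients of C(1) + (1 - L) D(L), indexed by the power of L."""
    q = len(D)
    k = len(c_one)
    zero = [[0.0] * k for _ in range(k)]
    out = []
    for j in range(q + 1):
        d_j = D[j] if j < q else zero
        d_prev = D[j - 1] if j >= 1 else zero
        coef = mat_add(d_j, mat_neg(d_prev))   # D_j - D_{j-1}
        if j == 0:
            coef = mat_add(coef, c_one)        # constant term also carries C(1)
        out.append(coef)
    return out

C = [
    [[1.0, 0.0], [0.0, 1.0]],       # C_0
    [[0.5, 0.25], [-0.25, 0.5]],    # C_1
    [[0.125, 0.0], [0.5, -0.25]],   # C_2
]
c_one, D = decompose(C)
rebuilt = recompose(c_one, D)
print("C(1) =", c_one)
print("rebuilt coefficients equal C_j:", rebuilt == C)
```

In the infinite-order case the same formula applies with infinite sums, which is where the regularity conditions on the Cj's come in.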

We can now write

Yt = Yt-1 + m + C(1)et + Wt - Wt-1,

which by backward substitution can be written as

Yt = Y0 - W0 + mt + C(1)[et + ... + e1] + Wt.

Apart from the term Y0 - W0 (which acts as a vector of intercepts) and the linear time trend mt, the nonstationarity of Yt is due to the term C(1)[et + ... + e1]. But if the matrix C(1) is singular, with rank k-r < k, then there exist r linearly independent linear combinations of the rows of C(1) that are zero row vectors. The coefficients of these linear combinations form the k × r matrix b: b'C(1) = O. Thus

b'Yt = b'(Y0 - W0) + b'mt + b'Wt.

For t → ∞ the random vector b'Wt becomes independent of b'(Y0 - W0), so that for large t the latter may be treated as a vector of constant intercepts. Then b'Yt is trend stationary because b'Wt is zero-mean stationary. The columns of the matrix b are called the cointegrating vectors, and the rank r of b is called the cointegration rank.
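The argument above can be checked numerically. The sketch below simulates a bivariate process Yt = Yt-1 + m + C(1)et + Wt - Wt-1 with a singular C(1) and verifies that b'Yt contains no accumulated shocks. All numerical values (C(1), b, m, the matrix D0 and the starting value) are hypothetical choices of mine, and Wt = D0 et is only the simplest possible stationary Wt.

```python
import random

random.seed(0)

# Hypothetical ingredients: a singular C(1) of rank 1, a vector b with
# b'C(1) = 0, a drift vector m, and W_t = D0 e_t as a simple stationary W_t.
C1 = [[1.0, 2.0], [2.0, 4.0]]
b = [2.0, -1.0]
m = [0.5, 0.25]
D0 = [[0.5, -1.0], [0.25, 0.5]]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

T = 200
Y0 = [1.0, 3.0]
W0 = matvec(D0, [random.gauss(0, 1), random.gauss(0, 1)])
W_prev = W0
Y = Y0
max_gap = 0.0
for t in range(1, T + 1):
    e = [random.gauss(0, 1), random.gauss(0, 1)]
    W = matvec(D0, e)
    shock = matvec(C1, e)
    # Y_t = Y_{t-1} + m + C(1) e_t + W_t - W_{t-1}
    Y = [Y[i] + m[i] + shock[i] + W[i] - W_prev[i] for i in range(2)]
    W_prev = W
    # Claimed representation: b'Y_t = b'(Y_0 - W_0) + b'm t + b'W_t
    claimed = dot(b, [Y0[i] - W0[i] for i in range(2)]) + dot(b, m) * t + dot(b, W)
    max_gap = max(max_gap, abs(dot(b, Y) - claimed))

print("b'C(1) =", [dot(b, [C1[0][j], C1[1][j]]) for j in range(2)])
print("largest deviation from the claimed representation:", max_gap)
```

The deviation is zero up to floating-point rounding: the random walk component C(1)[et + ... + e1] drops out of b'Yt, leaving an intercept, a linear trend and a stationary term.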

The Granger representation theorem

Clive Granger has shown that under some regularity conditions we can write a cointegrated process Yt as a Vector Error Correction Model (VECM):

DYt = p0 + p1t + P1DYt-1 + .... + Pp-1DYt-p+1 + ab'Yt-p + Ut,

where D is the difference operator (i.e., DYt = Yt - Yt-1), the Ut's are i.i.d. (0,S), and

p1 = ag1

for some r-vector g1. This condition is known as the cointegrating restrictions on the trend parameters. It is necessary because otherwise Yt would be a vector unit root process with linear drift (which is rare in practice). Thus

DYt = p0 + P1DYt-1 + .... + Pp-1DYt-p+1 + a[g1t + b'Yt-p] + Ut.

Since DYt is stationary, we must have that g1t + b'Yt-p is stationary, hence b'Yt is trend stationary. Consequently, the time series involved have drift and veer apart, as in the following picture:

TSPLOT window

When do we need a time trend in the VECM?

Most nonstationary macroeconomic time series such as the log of GDP have drift: they display a trending pattern with nonstationary fluctuations around the deterministic time trend. However, a VECM without a time trend,

DYt = p0 + P1DYt-1 + .... + Pp-1DYt-p+1 + ab'Yt-p + Ut,

is also able to generate drift in Yt because p0 acts (in some way) as a vector of drift parameters. Thus drift in the Yt process is no reason to include a time trend in the VECM.

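You should include a time trend in the VECM if you suspect that b'Yt is trend stationary rather than stationary, i.e., that the series not only have drift but also veer apart, as in the picture above.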

Cointegrating restrictions on the intercept parameters of the VECM

Now suppose that in the case of a VECM without time trend,

DYt = p0 + P1DYt-1 + .... + Pp-1DYt-p+1 + ab'Yt-p + Ut,

there exists a vector g0 such that

p0 = ag0.

This condition is called cointegrating restrictions on the intercept parameters. Then the VECM can be written as

DYt = P1DYt-1 + .... + Pp-1DYt-p+1 + a[g0 + b'Yt-p] + Ut.

In this case the components of g0 + b'Yt are zero-mean stationary, hence DYt is zero-mean stationary, and therefore Yt has no drift! Consequently, cointegrating restrictions on the intercept parameters are only appropriate if the time series run approximately parallel without drift, like in the following picture:

TSPLOT window

If the time series run parallel but slope upwards, as in

TSPLOT window

you should not impose these restrictions.
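The difference between the two cases can be made visible with a small deterministic experiment. The sketch below iterates a bivariate VECM with p = 1 and the errors Ut set to zero, once with an unrestricted intercept vector and once with p0 = a g0. The values of a, b, g0 and p0 are hypothetical, chosen only so that the error correction mechanism is stable.

```python
def simulate(pi0, alpha, beta, y0, steps):
    """Iterate dY_t = pi0 + alpha * (beta'Y_{t-1}) with the errors U_t set to zero."""
    y = list(y0)
    path = [list(y)]
    for _ in range(steps):
        ect = sum(bi * yi for bi, yi in zip(beta, y))      # beta'Y_{t-1}
        y = [yi + p + a * ect for yi, p, a in zip(y, pi0, alpha)]
        path.append(list(y))
    return path

def last_step(path):
    """Final increment Y_T - Y_{T-1}, i.e. the drift once transients have died out."""
    return [path[-1][i] - path[-2][i] for i in range(len(path[-1]))]

# Hypothetical parameter values (not estimates from the data in this tour).
alpha = [-0.5, 0.25]      # a
beta = [1.0, -1.0]        # b
gamma0 = 2.0              # g0

unrestricted = simulate([0.02, 0.01], alpha, beta, [0.0, 0.0], 200)
restricted = simulate([a * gamma0 for a in alpha], alpha, beta, [0.0, 0.0], 200)

print("drift with unrestricted intercepts:", last_step(unrestricted))
print("drift with p0 = a*g0:              ", last_step(restricted))
```

With unrestricted intercepts both components end up growing by the same amount each period, so the series run parallel and upwards. With p0 = a g0 the increments die out and the levels settle down, so there is no drift.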

Johansen's cointegration tests

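Soren Johansen's approach is to estimate the VECM by maximum likelihood, under various assumptions about the trend or intercept parameters and the number r of cointegrating vectors, and then conduct likelihood ratio tests. Assuming that the VECM errors Ut are independent Nk[0,S] distributed, and given the cointegrating restrictions on the trend or intercept parameters, the maximum likelihood Lmax(r) is a function of the cointegration rank r. Johansen proposes two types of tests for r:

  1. The lambda-max test, of the null hypothesis that there are r cointegrating vectors against the alternative that there are r + 1 cointegrating vectors.
  2. The trace test, of the null hypothesis that there are at most r cointegrating vectors against the alternative that there are k cointegrating vectors.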

Both tests have non-standard asymptotic null distributions. Moreover, given the cointegration rank r Johansen also derives likelihood ratio tests of the cointegrating restrictions on the intercept or trend parameters.
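Computationally, the maximization over the cointegrating vectors comes down to a generalized eigenvalue problem, and EasyReg prints its ingredients in the output reproduced at the end of this tour (the matrices Skk and Sko[Soo^-1]Sok). As a rough cross-check, the sketch below solves det(Sko[Soo^-1]Sok - lambda Skk) = 0 for the 2 × 2 matrices in that output. The closed-form quadratic only works for k = 2, and the variable names are mine.

```python
import math

# Matrices copied from the EasyReg output reproduced at the end of this tour.
Skk = [[0.9132700666, 0.8989685819],
       [0.8989685819, 0.8854573767]]
M = [[0.0222742845, 0.0232120848],      # Sko[Soo^-1]Sok
     [0.0232120848, 0.0241894866]]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# det(M - lam * Skk) = 0 is the quadratic  qa*lam^2 + qb*lam + qc = 0.
qa = det2(Skk)
qb = -(M[0][0] * Skk[1][1] + M[1][1] * Skk[0][0]
       - M[0][1] * Skk[1][0] - M[1][0] * Skk[0][1])
qc = det2(M)
disc = math.sqrt(qb * qb - 4.0 * qa * qc)
lam_max = (-qb + disc) / (2.0 * qa)
lam_min = qc / (qa * lam_max)   # product of the roots is qc/qa; avoids cancellation

# Eigenvector for lam_max, normalized so that its second component equals 1:
# the first row of (M - lam_max * Skk) v = 0 gives v1.
v1 = -(M[0][1] - lam_max * Skk[0][1]) / (M[0][0] - lam_max * Skk[0][0])

print("eigenvalues:", lam_max, lam_min)
print("eigenvector for the largest eigenvalue:", [v1, 1.0])
```

Up to the rounding of the printed matrices, this should agree with the eigenvalues 0.1557922928 and 0.0000325597 and with the standardized cointegrating vector (-0.9736191737, 1)' in the output.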

For further reading on cointegration, see my lecture notes on cointegration and the references to Johansen's papers at the end of this guided tour.

Cointegration analysis in practice

The data

The data are taken from the EasyReg data base, namely quarterly data on nominal consumption and nominal GDP for the US. The two time series are taken in logs (using the option Menu > Input > Transform variables).

Before you conduct cointegration analysis, you have to test whether the time series involved are unit root processes, using the option Menu > Data analysis > Unit root tests (root 1).

Also, always plot the data in order to determine whether there is drift, and whether you should include a time trend in the VECM. The data is plotted below:

TSPLOT window

Clearly, there is strong evidence of drift. The two series seem to run parallel at a distance, which suggests using a VECM without trend and without cointegrating restrictions imposed on the intercept parameters.
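If you want to see what a unit root pre-test does outside EasyReg, the following is a minimal sketch of the simplest Dickey-Fuller regression, DYt = a + rho Yt-1 + error, applied to simulated data. It is not EasyReg's procedure: it has no lagged differences and no trend term, the two simulated series are hypothetical, and the value -2.86 is the asymptotic 5% critical value for this intercept-only case. For series with drift, such as the ones used here, a trend term and the corresponding critical values would be needed.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic of rho in the OLS regression dy_t = a + rho * y_{t-1} + error."""
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    xbar = sum(x) / n
    dbar = sum(dy) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (di - dbar) for xi, di in zip(x, dy))
    rho = sxy / sxx
    a = dbar - rho * xbar
    rss = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)
    return rho / se

random.seed(1)
stationary = [0.0]   # AR(1) with coefficient 0.5: no unit root
walk = [0.0]         # pure random walk: unit root
for _ in range(500):
    stationary.append(0.5 * stationary[-1] + random.gauss(0, 1))
    walk.append(walk[-1] + random.gauss(0, 1))

t_stationary = dickey_fuller_t(stationary)
t_walk = dickey_fuller_t(walk)
print("AR(1) series:  t =", round(t_stationary, 2), "(reject unit root if below -2.86)")
print("random walk:   t =", round(t_walk, 2))
```

The unit root hypothesis is rejected for large negative values of the t-statistic only. Note that the statistic does not have a standard normal or t distribution under the null.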

Johansen's cointegration analysis in practice

Open Menu > Multiple equation models > Johansen's cointegration analysis, select the two time series, and select the option "Intercepts only, without cointegrating restrictions imposed":

COINTJ window

The two options to include seasonal dummy variables are only available if the time series are quarterly or monthly time series. For all other time series EasyReg will not show these options.

However, the time series involved are seasonally adjusted, so that there is no need for these options.

If you include a time trend in the VECM you can choose to impose the cointegrating restrictions on the time trend parameters or not. However, in both cases it is assumed that these cointegrating restrictions hold. The difference between these options is twofold:

  1. If you impose cointegrating restrictions on the trend parameters then the validity of these restrictions will be tested.
  2. The two options yield different null distributions of the test statistics involved, despite the fact that in both cases the restrictions are assumed to hold.

If you click References then the references to Johansen's papers (see below) will be displayed in the text box, and written to the output file if there is cointegration.

Click Option OK. The next window lets you choose the order p of the VECM. EasyReg can do that automatically using the Hannan-Quinn and/or Schwarz information criteria. Since the Johansen approach is quite sensitive to the choice of p, I recommend using this option. In that case the value of p you specify is only an upper bound of the VECM order.

COINTJ window

The Hannan-Quinn criterion suggests p = 4, and the Schwarz criterion suggests p = 2. I recommend choosing the maximum of the two, p = 4:

COINTJ window
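The selection rule is simply "smallest criterion value wins", applied to each column of the information criteria table in the output at the end of this tour. A minimal sketch, with the table typed in by hand and variable names of my own choosing:

```python
# (p, Hannan-Quinn, Schwarz) rows copied from the output at the end of this tour.
criteria = [
    (1, -18.5644, -18.5034),
    (2, -18.9468, -18.8447),
    (3, -18.9676, -18.8242),
    (4, -18.9808, -18.7958),
    (5, -18.9483, -18.7214),
    (6, -18.9272, -18.6580),
    (7, -18.8742, -18.5625),
    (8, -18.8427, -18.4881),
    (9, -18.8489, -18.4512),
    (10, -18.9132, -18.4719),
    (11, -18.8816, -18.3964),
    (12, -18.9056, -18.3763),
]

p_hq = min(criteria, key=lambda row: row[1])[0]   # order minimizing Hannan-Quinn
p_sc = min(criteria, key=lambda row: row[2])[0]   # order minimizing Schwarz
p_chosen = max(p_hq, p_sc)                        # the rule recommended in this tour

print("Hannan-Quinn:", p_hq, " Schwarz:", p_sc, " chosen:", p_chosen)
```

Taking the maximum errs on the side of including too many lags rather than too few.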

Click p OK:

COINTJ window

Click Continue:

COINTJ window

Here are the results for the lambda-max and trace tests. EasyReg will automatically analyze these results and make a recommendation about the cointegration rank r. However, you should not blindly follow this suggestion, but draw your own conclusion. In this case, however, the suggested value r = 1 looks OK.

If you disagree with EasyReg, you can adjust the value of r.
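To see where the test statistics come from, you can recompute them from the generalized eigenvalues listed in the output at the end of this tour. The sketch below uses the usual form of Johansen's statistics, -n ln(1 - lambda) summed over the relevant eigenvalues. This tour does not spell out the formulas, so treat them as my assumption, checked against the reported values 31.7 and 0.0. The 5% critical values are copied from Table 1 of the output, and the sequential decision rule is only one common way to read the trace test.

```python
import math

n = 187                                      # observations used for the VECM(4)
eigenvalues = [0.1557922928, 0.0000325597]   # largest first, from the output
k = len(eigenvalues)

def lambda_max_stat(r):
    """LR statistic of H0: rank r against H1: rank r + 1."""
    return -n * math.log(1.0 - eigenvalues[r])

def trace_stat(r):
    """LR statistic of H0: rank at most r against H1: rank k."""
    return -n * sum(math.log(1.0 - lam) for lam in eigenvalues[r:])

# 5% critical values from Table 1 of the output (no restriction on intercept).
crit_trace_5 = {0: 15.2, 1: 4.0}

# Sequential reading of the trace test: the first r whose null is not rejected.
r_hat = next((r for r in range(k) if trace_stat(r) <= crit_trace_5[r]), k)

for r in range(k):
    print("r =", r, " lambda-max:", round(lambda_max_stat(r), 1),
          " trace:", round(trace_stat(r), 1))
print("suggested cointegration rank:", r_hat)
```

For r = k - 1 the two statistics coincide, which is why the rows for r = 1 are identical in the lambda-max and trace tables.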

Click Continue:

COINTJ window

Click Continue again for further options:

COINTJ window

If you leave the box "write output to file OUTPUT.TXT" checked then all the estimation and test results will be appended to the output file when you click Done/Other options.

We now have two options. Upon completion of these options you will return to this window.

The first option will be discussed next. Thus click Test parameter restrictions:

Testing parameter restrictions on the cointegrating vectors

COINTJ window

Suppose you want to test whether the cointegrating vector is b = (-1,1)' rather than (-0.9736191737,1)'. To do so, replace the value -0.9736191737 by -1, and click Vector H OK (H in EasyReg is the same as b above).

In general you have to replace all estimated entries (the numbers between -1 and 1) by fixed numbers, except for the 1's themselves because they are due to normalisation. If the matrix H (alias b) has two or more columns, you may erase one or more columns, but of course you should leave at least one column.

COINTJ window

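The hypothesis is firmly rejected at both the 10% and 5% significance levels: the LR test statistic is 22.06, against chi-square(1) critical values of 2.71 and 3.84.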
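The reported p-value and critical values can be verified with the standard library alone, because a chi-square(1) variate is the square of a standard normal variate. A minimal sketch (the function name is mine):

```python
import math

def chi2_1_pvalue(x):
    """Upper tail probability of a chi-square(1) variate: P(Z^2 > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))

lr = 22.06                       # LR test statistic from the output
p = chi2_1_pvalue(lr)

print("p-value of the LR test:", p)
print("tail probability at 2.71:", round(chi2_1_pvalue(2.71), 4))
print("tail probability at 3.84:", round(chi2_1_pvalue(3.84), 4))
```

The p-value is of the order of a few millionths, which EasyReg prints as 0.00000.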

One way of double-checking this hypothesis is to form the linear combinations involved (using Menu > Input > Transform variables) and test whether they are unit root processes. In this case I have made a new variable, LN[nominal GDP]-LN[nominal consumption], and conducted various unit root and trend stationarity tests. It appears that this variable is a unit root process, so that (-1,1)' is not a cointegrating vector.

Click Again:

COINTJ window

Now do not change H, and click "Vector H is OK". Then the VECM will be re-estimated, and the parameter estimates will now be endowed with t and p values, except the parameters in the estimated cointegrating vector b. Since the latter estimates are super-consistent, we may treat them as the true values:

COINTJ window

You can now test the joint significance of the VECM parameters. To show that, click "Test of joint significance":

COINTJ window

Double-click the parameters you want to test, and click "Test joint significance". The procedure is the same as for VAR models, and will therefore not be further explained.

Click "Continue". Then you will jump back to the following window:

COINTJ window

Innovation response analysis

Click "Innovation response analysis". Then you can conduct (non-structural) innovation response analysis, similar to the stationary VAR case:

COINTJ window

I have chosen an innovation response horizon of 20 periods, which corresponds to 5 years.

Click Start. Then the innovation responses are computed:

COINTJ window

The numerical values of the innovation responses are displayed in this window. Usually you will only be interested in the innovation response plots, in which case there is no need to write the numerical innovation responses to the output file and you may leave the box "store output" unchecked.

Click Continue:

COINTJ window

Since some of the maximum likelihood parameter estimates have nonstandard asymptotic distributions, the innovation response plots are not endowed with standard error bands.

Another difference with standard innovation response analysis is that in general the innovation responses do not taper off to zero. Thus, innovation shocks may induce permanent level shifts, as is apparent from the innovation response plot above.
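The permanent level shifts are easy to reproduce in a toy model. The sketch below computes non-orthogonalized responses to unit shocks for a bivariate VECM with p = 1, written in levels as Yt = (I + ab')Yt-1 + Ut. The values of a and b are hypothetical and are not the estimates from this tour.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical adjustment vector a and cointegrating vector b.
alpha = [-0.5, 0.25]
beta = [1.0, -1.0]

# Levels form of dY_t = alpha * beta'Y_{t-1} + U_t:  Y_t = A Y_{t-1} + U_t
A = [[(1.0 if i == j else 0.0) + alpha[i] * beta[j] for j in range(2)] for i in range(2)]

horizon = 20
# responses[h][i][j]: effect on variable i, h periods after a unit shock to variable j
responses = [[[1.0, 0.0], [0.0, 1.0]]]
for _ in range(horizon):
    responses.append(matmul(A, responses[-1]))

for h in (0, 1, 5, 20):
    print("h =", h, responses[h])
```

In this example the responses converge to 1/3 (shock to the first variable) and 2/3 (shock to the second variable) for both variables, rather than to zero. Both variables shift by the same amount in the long run because b = (1,-1)' forces them to move together.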

The output


Johansen's cointegration analysis:

Dependent variables:
Y(1) = LN[nominal consumption]
Y(2) = LN[nominal GDP]

Characteristics:
LN[nominal consumption]
  First observation = 33(=1947.1)
  Last observation  = 223(=1994.3)
  Number of usable observations: 191
  Minimum value: 5.0536948E+000
  Maximum value: 8.4462341E+000
  Sample mean:   6.6648432E+000
LN[nominal GDP]
  First observation = 33(=1947.1)
  Last observation  = 223(=1994.3)
  Number of usable observations: 191
  Minimum value: 5.4236276E+000
  Maximum value: 8.8234566E+000
  Sample mean:   7.0953749E+000


Information criteria:
  p     Hannan-Quinn          Schwarz
  1     -1.85644E+01     -1.85034E+01
  2     -1.89468E+01     -1.88447E+01
  3     -1.89676E+01     -1.88242E+01
  4     -1.89808E+01     -1.87958E+01
  5     -1.89483E+01     -1.87214E+01
  6     -1.89272E+01     -1.86580E+01
  7     -1.88742E+01     -1.85625E+01
  8     -1.88427E+01     -1.84881E+01
  9     -1.88489E+01     -1.84512E+01
 10     -1.89132E+01     -1.84719E+01
 11     -1.88816E+01     -1.83964E+01
 12     -1.89056E+01     -1.83763E+01
p =                4                2
Remark: These estimates of p are only asymptotically correct.

Chosen VAR(p) order: p = 4


VECM(4):
(No cointegrating restrictions on the intercept parameters imposed)

Y(t)-Y(t-1) = A(1)(Y(t-1)-Y(t-2)) + .... + A(3)(Y(t-3)-Y(t-4) + a.b'Y(t-4) + c  + U(t),
where:

1: Y(t) is a 2-vector with components:
   Y(1,t) = LN[nominal consumption](t)
   Y(2,t) = LN[nominal GDP](t)

2: b'Y(t-4) = e(t-4), say, is the r-vector of error correction
terms, with b the 2xr matrix of cointegrating vectors,

3: c is a 2-vector of constants,

4: U(t) is the 2-vector of error terms.

5: a and the A(.)'s are conformable parameter matrices,

6: t = 37(=1948.1),...,223(=1994.3).


Matrix Skk:
0.9132700666 0.8989685819 
0.8989685819 0.8854573767 

Matrix Sko[Soo^-1]Sok:
0.0222742845 0.0232120848 
0.0232120848 0.0241894866 

Generalized Eigenvalues of Sko[Soo^-1]Sok w.r.t. Skk:
0.1557922928 0.0000325597 

Corresponding Eigenvectors:
-0.9736191737 1             
1             -0.9595275184 


LR test (Lambda-max test) of the null hypothesis that there are r 
cointegrated vectors against the alternative that there are r + 1 
cointegrated vectors
Table 1: No restriction on intercept. 
         C.f. Johansen & Juselius (1990), Table A1 
                      critical values     conclusions:
   r test statistic    20%  10%   5%      20%    10%     5%
   0           31.7   10.1 12.1 14.0   reject reject reject
   1            0.0    1.7  2.8  4.0   accept accept accept

LR test (trace test) of the null hypothesis that there are at most r 
cointegrated vectors against the alternative that there are 2 
cointegrated vectors
Table 1: No restriction on intercept. 
         C.f. Johansen & Juselius (1990), Table A1 
                      critical values     conclusions:
   r test statistic    20%  10%   5%      20%    10%     5%
   1            0.0    1.7  2.8  4.0   accept accept accept
   0           31.7   11.2 13.3 15.2   reject reject reject

LR test (Lambda-max test) of the null hypothesis that there are r 
cointegrated vectors against the alternative that there are r + 1 
cointegrated vectors
Table 2:  Restrictions on intercept, but not imposed. 
         C.f. Johansen & Juselius (1990), Table A2 
                      critical values     conclusions:
   r test statistic    20%  10%   5%      20%    10%     5%
   0           31.7   10.7 12.8 14.6   reject reject reject
   1            0.0    4.9  6.7  8.1   accept accept accept

LR test (trace test) of the null hypothesis that there are at most r 
cointegrated vectors against the alternative that there are 2 
cointegrated vectors
Table 2:  Restrictions on intercept, but not imposed. 
         C.f. Johansen & Juselius (1990), Table A2 
                      critical values     conclusions:
   r test statistic    20%  10%   5%      20%    10%     5%
   1            0.0    4.9  6.7  8.1   accept accept accept
   0           31.7   13.0 15.6 17.8   reject reject reject

Conclusion: r =1


Standardized cointegrating vector: 
-0.9736192  LN[nominal consumption]
         1  LN[nominal GDP]

Remark: This result will be used to test restrictions on the cointegrating vector.

VECM(4):
(No cointegrating restrictions on the intercept parameters imposed)

Y(t)-Y(t-1) = A(1)(Y(t-1)-Y(t-2)) + .... + A(3)(Y(t-3)-Y(t-4) + a.b'Y(t-4) + c  + U(t),
where:

1: Y(t) is a 2-vector with components:
   Y(1,t) = LN[nominal consumption](t)
   Y(2,t) = LN[nominal GDP](t)

2: b'Y(t-4) = e(t-4), say, is the 1-vector of error correction
terms, with b the 2x1 matrix of cointegrating vectors: b =
-0.9736191737 
1             

3: c is a 2-vector of constants,

4: U(t) is the 2-vector of error terms.

5: a and the A(.)'s are conformable parameter matrices,

6: t = 37(=1948.1),...,223(=1994.3).


ML estimation results for the VECM:
Parameter names:
Elements of the matrix A(k): A(i,j,k), k=1,..,3
Components of the vector a: a(i)
Components of the vector c: c(i)


Equation 1: DIF1[LN[nominal consumption]]
Parameter ML estimate
A(1,1,1)    -0.262023
A(1,2,1)     0.288099
A(1,1,2)     0.258230
A(1,2,2)     0.045008
A(1,1,3)     0.173523
A(1,2,3)    -0.207201
a(1)         0.071253
c(1)        -0.030769

s.e.:      8.75480E-03
R-Square:       0.2766
n:                 187

Equation 2: DIF1[LN[nominal GDP]]
Parameter ML estimate
A(2,1,1)     0.362645
A(2,2,1)     0.173270
A(2,1,2)     0.358585
A(2,2,2)    -0.034751
A(2,1,3)     0.158358
A(2,2,3)    -0.232077
a(2)        -0.048440
c(2)         0.033124

s.e.:      1.06084E-02
R-Square:       0.2739
n:                 187

ML estimate of the variance matrix of U(t):
0.0000733676 0.0000612372 
0.0000612372 0.0001077236 

Error correction term = LN[nominal GDP](-4)
                         -0.973619*LN[nominal consumption](-4)


Variables:
LN[nominal consumption]
LN[nominal GDP]

Null hypothesis: The vector H is a cointegrating vector, where H is a
nonzero 2 x 1 vector. The initial value of H is H = b. Replace the
components of H with fixed numbers. 

If you do not change the dimension of H, and the test accepts the null
hypothesis at the 10% significance level, the model will be re-estimated
with the parameters endowed with t and p values, except the cointegrating
vectors. 
If you do not change H at all, the test will not be conducted, of course,
but still the model will be re-estimated with the parameters endowed with
t and p values, except the cointegrating vectors. 

H:
-1 
1  

Null hypothesis: H is also a cointegrating vector.
LR test: Test statistic = 22.06.  Null distr.: Chi-square(1)
  Significance levels:        10%         5%
  Critical values:           2.71       3.84
  Conclusions:             reject     reject
  p-value = 0.00000

Variables:
LN[nominal consumption]
LN[nominal GDP]

Null hypothesis: The vector H is a cointegrating vector, where H is a
nonzero 2 x 1 vector. The initial value of H is H = b. Replace the
components of H with fixed numbers. 

If you do not change the dimension of H, and the test accepts the null
hypothesis at the 1% significance level, the model will be re-estimated
with the parameters endowed with t and p values, except the cointegrating
vectors. 
If you do not change H at all, the test will not be conducted, of course,
but still the model will be re-estimated with the parameters endowed with
t and p values, except the cointegrating vectors. 

H:
-0.9736191737 
1             

Null hypothesis: H is also a cointegrating vector.

However, since you have not changed H, testing the null hypothesis makes no sense!

Re-estimation of the VECM 
VECM(4):
(No cointegrating restrictions on the intercept parameters imposed)

Y(t)-Y(t-1) = A(1)(Y(t-1)-Y(t-2)) + .... + A(3)(Y(t-3)-Y(t-4) + a.b'Y(t-4) + c  + U(t),
where:

1: Y(t) is a 2-vector with components:
   Y(1,t) = LN[nominal consumption](t)
   Y(2,t) = LN[nominal GDP](t)

2: b'Y(t-4) = e(t-4), say, is the 1-vector of error correction
terms, with b the 2x1 matrix of cointegrating vectors: b =
-0.9736191737 
1             

3: c is a 2-vector of constants,

4: U(t) is the 2-vector of error terms.

5: a and the A(.)'s are conformable parameter matrices,

6: t = 37(=1948.1),...,223(=1994.3).


ML estimation results for the VECM:
Parameter names:
Elements of the matrix A(k): A(i,j,k), k=1,..,3
Components of the vector a: a(i)
Components of the vector c: c(i)


Equation 1: DIF1[LN[nominal consumption]]
Parameter ML estimate t-value [p-value]
A(1,1,1)    -0.262023   -2.68 [0.00730]
A(1,2,1)     0.288099    3.47 [0.00051]
A(1,1,2)     0.258230    2.39 [0.01663]
A(1,2,2)     0.045008    0.54 [0.59159]
A(1,1,3)     0.173523    1.73 [0.08433]
A(1,2,3)    -0.207201   -2.81 [0.00503]
a(1)         0.071253    2.95 [0.00321]
c(1)        -0.030769   -2.22 [0.02616]

s.e.:      8.75480E-03
R-Square:       0.2766
n:                 187

Equation 2: DIF1[LN[nominal GDP]]
Parameter ML estimate t-value [p-value]
A(2,1,1)     0.362645    3.06 [0.00218]
A(2,2,1)     0.173270    1.72 [0.08473]
A(2,1,2)     0.358585    2.74 [0.00606]
A(2,2,2)    -0.034751   -0.34 [0.73244]
A(2,1,3)     0.158358    1.30 [0.19361]
A(2,2,3)    -0.232077   -2.59 [0.00951]
a(2)        -0.048440   -1.65 [0.09821]
c(2)         0.033124    1.98 [0.04819]

s.e.:      1.06084E-02
R-Square:       0.2739
n:                 187

[The p-values are two-sided and based on the normal approximation]

ML estimate of the variance matrix of U(t):
0.0000733676 0.0000612372 
0.0000612372 0.0001077236 

Error correction term = LN[nominal GDP](-4)
                         -0.973619*LN[nominal consumption](-4)

References

Johansen, S. (1988), "Statistical Analysis of Cointegrating Vectors", Journal of Economic Dynamics and Control 12, 231-254.

Johansen, S. (1991), "Estimation and Hypothesis Testing of Cointegrating Vectors in Gaussian Vector Autoregressive Models", Econometrica 59, 1551-1580.

Johansen, S. (1994), "The Role of the Constant and Linear Terms in Cointegration Analysis of Nonstationary Variables", Econometric Reviews 13(2).

Johansen, S. and K. Juselius (1990), "Maximum Likelihood Estimation and Inference on Cointegration, with Applications to the Demand for Money", Oxford Bulletin of Economics and Statistics 52, 169-210.

This is the end of the guided tour on Johansen's cointegration analysis