The nonparametric kernel regression estimator used here is briefly described in the PDF file KERNELREG.PDF. Please read that file before you use the kernel regression module (KERNELREG).
Note that nonparametric kernel regression estimation is an advanced feature of EasyReg International. If you are a novice econometrician, you should not use it.
In order to demonstrate how kernel regression works, I have generated n = 500 independent
standard normally distributed random variables X1, X2, and U, and combined them into
a dependent variable
This module does not allow you to select more than two explanatory variables, because only univariate and bivariate regression functions can be plotted.
I will demonstrate the bivariate case first.
Open "Menu > Single equation models > Nonparametric kernel regression", and select
Y, X1 and X2 as the data in the usual way, with Y the dependent variable,
and X1 and X2 the independent variables. Then the first kernel regression window is:
As noted in KERNELREG.PDF, the constant c of the bandwidth can be determined by cross-validation. Here I have specified the lower bound of c as 0.5 and the upper bound as 2:
Click "Bound OK". Then the following window appears.
I have chosen 4 grid points. Click "Grid OK". Then the following window appears.
Click "Continue". Then the mean square error will be minimized over these grid points.
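The cross-validation step can be sketched outside EasyReg as follows. This is only an illustration under stated assumptions, not the module's actual code: it assumes a Nadaraya-Watson estimator with a Gaussian product kernel, a bandwidth of the form h_j = c * sd(X_j) * n^(-1/(k+4)), a leave-one-out mean square error criterion, and an evenly spaced grid for c that includes both bounds. KERNELREG.PDF defines the estimator that EasyReg really uses. The regression function in the data-generating step is hypothetical as well.

```python
import numpy as np

def nw_loo_mse(X, y, c):
    """Leave-one-out mean square error of a Nadaraya-Watson estimator."""
    n, k = X.shape
    # Assumed bandwidth rule: h_j = c * sd(X_j) * n^(-1/(k+4)).
    h = c * X.std(axis=0, ddof=1) * n ** (-1.0 / (k + 4))
    # Scaled pairwise differences between observations, shape (n, n, k).
    z = (X[:, None, :] - X[None, :, :]) / h
    # Gaussian product kernel weights, shape (n, n).
    w = np.exp(-0.5 * (z ** 2).sum(axis=2))
    np.fill_diagonal(w, 0.0)  # leave observation i out of its own fit
    fitted = w @ y / w.sum(axis=1)
    return float(np.mean((y - fitted) ** 2))

def cross_validate_c(X, y, lower, upper, n_grid):
    """Evaluate the criterion on an even grid for c and return the minimizer."""
    grid = np.linspace(lower, upper, n_grid)
    scores = np.array([nw_loo_mse(X, y, c) for c in grid])
    return grid, scores, float(grid[scores.argmin()])

rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 2))   # columns play the role of X1 and X2
U = rng.standard_normal(n)
# Hypothetical regression function, NOT the one used in this guided tour.
y = X[:, 0] ** 2 + X[:, 1] ** 2 + U

grid, scores, c_opt = cross_validate_c(X, y, lower=0.5, upper=2.0, n_grid=4)
for c, s in zip(grid, scores):
    print(f"c = {c:.1f}  leave-one-out MSE = {s:.4f}")
print("selected c:", c_opt)
```

With bounds 0.5 and 2 and 4 grid points, this sketch searches over c = 0.5, 1, 1.5 and 2. The value it selects depends on the simulated data and on the hypothetical regression function, so it need not agree with the EasyReg result.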
The optimal c is 1:
Click "Make plot data".
If a nonparametric regression estimator is computed for values of the X variables
for which the density is close to zero, the estimate will be unreliable. Therefore,
the plot range of the X variables should not be too wide. Since X1 and X2 are standard normally
distributed, I have chosen the plot range [-2,2] for both X1 and X2.
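The reason for restricting the plot range can be made concrete by looking at how much kernel weight the sample puts on a given point. The sketch below sums Gaussian product kernel weights at the center of the data, at the corner (2, 2) of the chosen plot range, and at a point far out in the tail. The kernel and the bandwidth rule (with c = 1) are assumptions made for illustration; they are not taken from KERNELREG.PDF.

```python
import numpy as np

def kernel_mass(X, x0, h):
    """Sum of Gaussian product kernel weights that the sample puts on x0."""
    z = (X - np.asarray(x0, dtype=float)) / h
    return float(np.exp(-0.5 * (z ** 2).sum(axis=1)).sum())

rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 2))   # columns play the role of X1 and X2
# Assumed bandwidth rule with c = 1: h_j = sd(X_j) * n^(-1/(k+4)), k = 2.
h = X.std(axis=0, ddof=1) * n ** (-1.0 / 6.0)

points = [(0.0, 0.0), (2.0, 2.0), (3.5, 3.5)]
mass = {x0: kernel_mass(X, x0, h) for x0 in points}
for x0, m in mass.items():
    print(f"kernel weight sum at {x0}: {m:.4f}")
```

The kernel regression estimate at a point is a weighted average of the Y values with these weights. When the weight sum is small, the estimate rests on very few observations and becomes erratic.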
The number of grid points is the number of points at which the 3-dimensional plot is evaluated in the direction of each X variable. The default value 29 usually gives the best picture.
Once the plot range and grid points have been specified, the plot data are computed, which takes
a few minutes in this case. When the computation is done, the module PLOT3DIM is activated. This module
opens with a blank picture window.
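Computing the plot data amounts to evaluating the estimated regression function at every combination of grid points. A minimal sketch, assuming a Nadaraya-Watson estimator with a Gaussian product kernel, the bandwidth rule h_j = c * sd(X_j) * n^(-1/(k+4)) with c = 1, and a hypothetical regression function (none of these are taken from KERNELREG.PDF):

```python
import numpy as np

def nw_estimate(X, y, points, h):
    """Nadaraya-Watson estimates at the rows of `points`."""
    z = (points[:, None, :] - X[None, :, :]) / h   # shape (m, n, k)
    w = np.exp(-0.5 * (z ** 2).sum(axis=2))        # Gaussian product kernel
    return w @ y / w.sum(axis=1)

rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 2))
U = rng.standard_normal(n)
# Hypothetical regression function, NOT the one used in this guided tour.
y = X[:, 0] ** 2 + X[:, 1] ** 2 + U

c = 1.0
h = c * X.std(axis=0, ddof=1) * n ** (-1.0 / 6.0)  # assumed rule, k = 2

n_grid = 29                                # default number of grid points
axis = np.linspace(-2.0, 2.0, n_grid)      # plot range [-2, 2]
g1, g2 = np.meshgrid(axis, axis, indexing="ij")
points = np.column_stack([g1.ravel(), g2.ravel()])

# surface[i, j] is the estimate at X1 = axis[i], X2 = axis[j].
surface = nw_estimate(X, y, points, h).reshape(n_grid, n_grid)
print(surface.shape)
```

The array `surface` is what a 3-dimensional plotting routine would draw over the 29 by 29 grid.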
Once you click the "Start" button, the picture is displayed:
Note that at the edges of the plot area
In this example the plot area can easily be determined from the design, but in general you do not know the actual distribution of the X variables. In that case I recommend opening "Menu > Data analysis > Summary statistics", selecting the X variables involved, and then using the 10% and 90% quantile values as the lower and upper bounds of the plot range. In our case we have
10% quantile X1 = -1.36670    90% quantile X1 = 1.30098
10% quantile X2 = -1.17452    90% quantile X2 = 1.32714
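The same quantile rule is easy to reproduce outside EasyReg. The sketch below computes the 10% and 90% sample quantiles of two simulated standard normal variables with NumPy. The data are freshly simulated, so the numbers will differ from those above, and NumPy's default quantile interpolation may not coincide with the definition used by EasyReg.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))   # columns play the role of X1 and X2

# Column-wise 10% and 90% sample quantiles.
lower = np.quantile(X, 0.10, axis=0)
upper = np.quantile(X, 0.90, axis=0)

for j in range(X.shape[1]):
    print(f"X{j + 1}: plot range [{lower[j]:.5f}, {upper[j]:.5f}]")
```

For standard normal data the population values are roughly -1.28 and 1.28, so sample quantiles near those values are to be expected.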
If we choose these quantiles as the plot range, the result indeed looks much better:
Finally, just as a warning, let me show you what happens if you do not adjust the plot ranges,
but just accept the minimum and maximum values:
Close to the borders of the plot area there is hardly any data to support the kernel regression function estimator, which yields spurious results.
Next, consider the univariate case. Select Y and X1 as the data in the usual way, with Y the dependent variable
and X1 the independent variable.
Now proceed in the same way as before. The cross-validated c is again 1. For the plot range, choose the 10% and 90% quantiles of X1 found above, [-1.36670, 1.30098]:
You now have the option to compare the kernel regression curve with the linear regression line, but I will not choose this option. Then the plot result is:
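To give an idea of what the comparison option shows, the sketch below computes a univariate kernel regression curve and an ordinary least squares line on the same grid. It assumes a Nadaraya-Watson estimator with a Gaussian kernel, the bandwidth rule h = c * sd(X) * n^(-1/5) with c = 1, and a hypothetical regression function. None of these are taken from KERNELREG.PDF.

```python
import numpy as np

def nw_estimate_1d(x, y, points, h):
    """Univariate Nadaraya-Watson estimates with a Gaussian kernel."""
    z = (points[:, None] - x[None, :]) / h   # shape (m, n)
    w = np.exp(-0.5 * z ** 2)
    return w @ y / w.sum(axis=1)

rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal(n)
u = rng.standard_normal(n)
# Hypothetical regression function, NOT the one used in this guided tour.
y = x ** 2 + u

h = x.std(ddof=1) * n ** (-1.0 / 5.0)        # assumed rule with c = 1, k = 1

# Plot range from the 10% and 90% sample quantiles, 29 grid points.
points = np.linspace(np.quantile(x, 0.10), np.quantile(x, 0.90), 29)

kernel_curve = nw_estimate_1d(x, y, points, h)

# Least squares line; np.polyfit returns the slope first for degree 1.
slope, intercept = np.polyfit(x, y, 1)
linear_line = intercept + slope * points

for p, k, l in zip(points[::7], kernel_curve[::7], linear_line[::7]):
    print(f"x = {p: .3f}  kernel = {k: .3f}  linear = {l: .3f}")
```

With a nonlinear regression function such as the hypothetical one used here, the kernel curve bends while the fitted line cannot, which is the kind of discrepancy the comparison is meant to reveal.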
Recall that the true regression function is