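Discrete dependent variable models are models for which the dependent variable Y takes discrete values only. There are three types of discrete dependent variable models: count data models, ordered models (with the Logit and Probit models as the dichotomous special case), and multinomial models.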
As the term "count" indicates, in count data models the dependent variable Y is a count of something. For example, let Y be the number of accidents during rush hour at a particular intersection.
In EasyReg International you can choose from three count data models, depending on whether the dependent variable Y has an explicit finite upper bound or not.
These models are explained in LIMDEP.PDF.
Now suppose that the values of Y represent an ordering of items. For example, let Y be the outcome of a taste test, coded like
In this case Y is not a quantity, but nevertheless a larger value of Y means more, or better. There exists a known smallest natural number $m$ such that
$$P[Y \in \{0, 1, \ldots, m\}] = 1,$$
similar to the Binomial Logit model, but now the values of Y represent a ranking rather than a count. The upper bound $m$ is determined by EasyReg itself.
This type of data is usually modeled via a latent variable model:
$$Y^* = \alpha + \beta'X + e,$$
where the error term $e$ is independent of $X$ and has distribution function $F(\cdot)$. The latent dependent variable $Y^*$ is not observable, but is related to the observed $Y$ in the following way:
$$P[Y = 0 \mid X] = P[Y^* \in (-\infty, 0] \mid X] = P[e \in (-\infty, -\alpha - \beta'X] \mid X] = F(-\alpha - \beta'X)$$
$$P[Y \le 1 \mid X] = P[Y^* \in (-\infty, \mu_1] \mid X] = P[e \in (-\infty, \mu_1 - \alpha - \beta'X] \mid X] = F(\mu_1 - \alpha - \beta'X)$$
$$\vdots$$
$$P[Y \le m-1 \mid X] = P[Y^* \in (-\infty, \mu_{m-1}] \mid X] = P[e \in (-\infty, \mu_{m-1} - \alpha - \beta'X] \mid X] = F(\mu_{m-1} - \alpha - \beta'X)$$
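where
$$0 < \mu_1 < \mu_2 < \cdots < \mu_{m-1}.$$
The latter conditions can be easily enforced by reparametrizing the $\mu$'s as follows:
$$\mu_j = \sum_{i=1}^{j} \exp(\gamma_i), \quad j = 1, \ldots, m-1.$$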
Thus the model for $Y$ now becomes:
$$P[Y = 0 \mid X] = F(-\alpha - \beta'X)$$
$$P[Y = 1 \mid X] = F(\exp(\gamma_1) - \alpha - \beta'X) - F(-\alpha - \beta'X)$$
$$P[Y = j \mid X] = F\Big(\sum_{i=1}^{j} \exp(\gamma_i) - \alpha - \beta'X\Big) - F\Big(\sum_{i=1}^{j-1} \exp(\gamma_i) - \alpha - \beta'X\Big), \quad j = 2, \ldots, m-1$$
$$P[Y = m \mid X] = 1 - F\Big(\sum_{i=1}^{m-1} \exp(\gamma_i) - \alpha - \beta'X\Big)$$
In EasyReg you have only one option for $F$, namely the Logistic distribution function
$$F(x) = \frac{1}{1 + \exp(-x)},$$
which yields the ordered logit model.
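The ordered model probabilities can be sketched in a few lines of Python. This is only an illustration of the formulas in this section, not EasyReg code: the function names and all parameter values are hypothetical, and the cut points are built as cumulative sums of exponentials so that they are increasing by construction.

```python
import math

def logistic_cdf(x):
    """Logistic distribution function F(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logit_probs(alpha, beta, gammas, x):
    """Return [P(Y=0|X), ..., P(Y=m|X)], where m = len(gammas) + 1."""
    index = alpha + sum(b * xi for b, xi in zip(beta, x))
    # Cut points: 0 for the event Y = 0, then cumulative sums of exp(gamma_i).
    cuts = [0.0]
    for g in gammas:
        cuts.append(cuts[-1] + math.exp(g))
    # Cumulative probabilities P(Y <= j | X) for j = 0..m-1, and 1 for j = m.
    cumulative = [logistic_cdf(c - index) for c in cuts] + [1.0]
    # Category probabilities are differences of the cumulative probabilities.
    return [cumulative[0]] + [
        cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))
    ]

# Hypothetical parameter values and regressors, for illustration only.
probs = ordered_logit_probs(alpha=0.5, beta=[1.0, -0.3],
                            gammas=[0.2, -1.0], x=[1.2, 0.7])
print(probs)
```

Because each exp(gamma_i) is positive, every category receives a strictly positive probability, whatever the values of the gammas.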
If the dependent variable $Y$ is dichotomous, i.e., $P[Y \in \{0,1\}] = 1$, then it follows from the above discussion of the ordered logit model and the symmetry of the distribution function $F$ that
$$P[Y = 0 \mid X] = F(-\alpha - \beta'X) = 1 - F(\alpha + \beta'X),$$
hence
$$P[Y = 1 \mid X] = F(\alpha + \beta'X).$$
In EasyReg you have two options for $F$, namely the Logistic distribution function $F(x) = 1/[1 + \exp(-x)]$ (the Logit model) and the distribution function of the standard normal distribution (the Probit model). Read PROBIT_LOGIT.PDF for a comparison of Probit and Logit analysis.
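As a small illustration, the two choices of the distribution function can be written down directly in Python. The function names and the parameter values are hypothetical; the standard normal distribution function is computed from the error function in the standard library.

```python
import math

def logit_cdf(x):
    """Logistic distribution function (Logit model)."""
    return 1.0 / (1.0 + math.exp(-x))

def probit_cdf(x):
    """Standard normal distribution function (Probit model)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_y_equals_1(alpha, beta, x, cdf):
    """P(Y = 1 | X) = F(alpha + beta'X) for the chosen distribution function."""
    return cdf(alpha + sum(b * xi for b, xi in zip(beta, x)))

# Hypothetical parameters and regressors, for illustration only.
alpha, beta, x = -1.0, [0.8, 0.5], [1.5, -0.2]
p_logit = prob_y_equals_1(alpha, beta, x, logit_cdf)
p_probit = prob_y_equals_1(alpha, beta, x, probit_cdf)
print(p_logit, p_probit)
```

Both functions are symmetric in the sense that F(-x) = 1 - F(x), which is the property used in the derivation of P[Y = 1 | X].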
The case $Y = 1$ usually stands for an attribute you are interested in. For example, let $Y = 1$ mean that "an applicant for a mortgage loan will default on the loan", and let $X$ be the vector of characteristics of the applicant, such as income, family situation, credit rating, etcetera. Given a data set of people who have gotten a mortgage loan, together with their payment history and their characteristics $X$, you can estimate the parameters $\alpha$ and $\beta$, and then use the estimated probability $F(\alpha + \beta'X)$ to assess the default risk of a new applicant.
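Remark: The Logit model in EasyReg is defined as
$$P[Y = 1 \mid X] = \frac{1}{1 + \exp(-\alpha - \beta'X)}. \qquad (1)$$
However, some statistical packages, in particular SAS, define the logit model as
$$P[Y = 1 \mid X] = \frac{1}{1 + \exp(\gamma + \delta'X)}. \qquad (2)$$
Both are equivalent, of course, as $\gamma = -\alpha$ and $\delta = -\beta$, but (1) makes more sense than (2) because (1) is monotonic increasing in $\alpha + \beta'X$.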
Now suppose that you have a vector of dependent dummy variables. For example, suppose you hold a survey in which you ask the respondents which brand of dishwasher detergent they use:
The dependent dummy variables involved can be recoded into a single variable Y, for example, let:
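The standard model for this case is the multinomial logit model. For $j = 1, \ldots, m$,
$$P[Y = j \mid X] = \exp(\alpha_j + \beta_j'X)\,P[Y = 0 \mid X],$$
where
$$P[Y = 0 \mid X] = \frac{1}{1 + \sum_{i=1}^{m} \exp(\alpha_i + \beta_i'X)}.$$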
In the case m = 1 this model reduces to the Logit model.
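A minimal Python sketch of the multinomial logit probabilities, with Y = 0 as the base category, may help to see the reduction to the Logit model. The function name and all parameter values are hypothetical.

```python
import math

def multinomial_logit_probs(alphas, betas, x):
    """Return [P(Y=0|X), P(Y=1|X), ..., P(Y=m|X)] with Y = 0 as base category."""
    scores = [
        math.exp(a + sum(b * xi for b, xi in zip(beta_j, x)))
        for a, beta_j in zip(alphas, betas)
    ]
    p0 = 1.0 / (1.0 + sum(scores))
    return [p0] + [s * p0 for s in scores]

# Hypothetical values with m = 3 alternatives besides the base category.
x = [0.8, 1.1]
probs = multinomial_logit_probs([0.2, -0.5, 0.1],
                                [[1.0, -0.4], [1.5, 0.3], [-0.7, 0.2]], x)

# With m = 1 the model reduces to the Logit model P(Y=1|X) = F(alpha + beta'X).
binary = multinomial_logit_probs([0.2], [[1.0, -0.4]], x)
logit = 1.0 / (1.0 + math.exp(-(0.2 + 1.0 * 0.8 - 0.4 * 1.1)))
print(probs, binary[1], logit)
```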
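Note that
$$\frac{\partial P[Y = j \mid X]}{\partial X} = \exp(\alpha_j + \beta_j'X)\,P[Y = 0 \mid X]\,\beta_j - \exp(\alpha_j + \beta_j'X)\,(P[Y = 0 \mid X])^2 \Big[\sum_{i=1}^{m} \exp(\alpha_i + \beta_i'X)\,\beta_i\Big],$$
so that the direction of the effect of $X$ on $P[Y = j \mid X]$ depends on all the parameters. On the other hand,
$$X'\frac{\partial P[Y = j \mid X]}{\partial \beta_j} = \exp(\alpha_j + \beta_j'X)\,P[Y = 0 \mid X]\,X'X - \big(\exp(\alpha_j + \beta_j'X)\,P[Y = 0 \mid X]\big)^2 X'X$$
$$= \{P[Y = j \mid X] - (P[Y = j \mid X])^2\}\,X'X > 0.$$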
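The derivative of P[Y = j | X] with respect to X can be checked numerically. The sketch below compares the analytic expression with central finite differences; the function names, the step size and all parameter values are hypothetical choices made for this illustration.

```python
import math

def base_prob_and_scores(alphas, betas, x):
    """Return P(Y=0|X) and the list of exp(alpha_i + beta_i'X), i = 1..m."""
    scores = [
        math.exp(a + sum(b * xi for b, xi in zip(beta_i, x)))
        for a, beta_i in zip(alphas, betas)
    ]
    return 1.0 / (1.0 + sum(scores)), scores

def grad_wrt_x(alphas, betas, x, j):
    """Analytic gradient of P(Y=j|X) with respect to X, for j >= 1."""
    p0, scores = base_prob_and_scores(alphas, betas, x)
    e_j = scores[j - 1]
    grad = []
    for c in range(len(x)):
        weighted = sum(s * beta_i[c] for s, beta_i in zip(scores, betas))
        grad.append(e_j * p0 * betas[j - 1][c] - e_j * p0 ** 2 * weighted)
    return grad

def numeric_grad_wrt_x(alphas, betas, x, j, h=1e-6):
    """Central finite-difference approximation of the same gradient."""
    grad = []
    for c in range(len(x)):
        up, down = list(x), list(x)
        up[c] += h
        down[c] -= h
        p0_up, s_up = base_prob_and_scores(alphas, betas, up)
        p0_dn, s_dn = base_prob_and_scores(alphas, betas, down)
        grad.append((s_up[j - 1] * p0_up - s_dn[j - 1] * p0_dn) / (2.0 * h))
    return grad

# Hypothetical parameter values, for illustration only.
alphas = [0.2, -0.5]
betas = [[1.0, -0.4], [1.5, 0.3]]
x = [0.8, 1.1]
analytic = grad_wrt_x(alphas, betas, x, j=1)
numeric = numeric_grad_wrt_x(alphas, betas, x, j=1)
print(analytic, numeric)
```

With these particular values the first component of the gradient for j = 1 is negative although the corresponding coefficient in the first parameter vector is positive, which illustrates that the direction of the effect depends on all the parameters.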
All the models discussed above are estimated by maximum likelihood. Moreover, in all cases the log-likelihood function is unimodal in the parameters. Therefore, the log-likelihood is maximized by using the Newton iteration.
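To make the Newton iteration concrete, here is a minimal sketch for the simplest case, a Logit model with an intercept and one regressor. It is not EasyReg's implementation: the data are made up, the function names and the stopping rule are assumptions, and the 2 x 2 linear system is solved by hand to keep the example free of dependencies.

```python
import math

def score_and_information(a, b, x, y):
    """Score vector and information matrix (minus the Hessian) of the logit log-likelihood."""
    s0 = s1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
        w = p * (1.0 - p)
        s0 += yi - p
        s1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (s0, s1), (h00, h01, h11)

def fit_logit_newton(x, y, tol=1e-10, max_iter=50):
    """Maximize the logit log-likelihood by Newton iteration, starting from zero."""
    a, b = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        (s0, s1), (h00, h01, h11) = score_and_information(a, b, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: solve (information matrix) * step = score.
        step_a = (h11 * s0 - h01 * s1) / det
        step_b = (h00 * s1 - h01 * s0) / det
        a += step_a
        b += step_b
        if max(abs(step_a), abs(step_b)) < tol:
            return a, b, iteration
    raise RuntimeError("Newton iteration did not converge")

# Made-up data in which the outcomes are not perfectly separated by x.
x_data = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y_data = [0, 0, 1, 0, 1, 0, 1, 1]
a_hat, b_hat, n_iter = fit_logit_newton(x_data, y_data)
print(a_hat, b_hat, n_iter)
```

At the maximum the score vector is zero, so recomputing it at the returned estimates is a simple way to verify convergence.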
Download the following variables from the EasyReg database:
Open "Menu > Single equation models > Discrete dependent variable models", choose these variables, choose "Function level" as the dependent variable, and include an intercept.
EasyReg checks whether the dependent variable is discrete or not. If not, you will not be able to continue.
EasyReg will automatically recode the dependent variable as Y = Function level - 1.
Since the dependent variable represents an ordering, an ordered logit model is appropriate.
Click "Continue". Then the "What to do next?" window appears.
where X is the vector of explanatory variables, hence
The residuals are the estimates of U for each observation in the sample.
If you use the option "Write residuals to the input file", the estimates of the conditional expectation g(X) can be constructed via Menu > Input > Transform variables, and the linear combination
The output involved (without the asymptotic variance matrix) is given below.
Ordered Logit model:
Dependent variable:
Y = function level (1-9)
Characteristics:
function level (1-9)
First observation = 1
Last observation = 2000
Number of usable observations: 2000
Minimum value: 1.0000000E+000
Maximum value: 9.0000000E+000
Sample mean: 3.7775000E+000
This variable is integer valued.
A discrete dependent variable model is suitable.
X variables:
X(1) = education level (1-7)
X(2) = male/female (1/2)
X(3) = age in years
X(4) = experience in years
X(5) = 1
Model:
P(Y-1=0|x) = F(-b'x)
P(Y-1=1|x) = F(-b'x+exp(b(6)))- F(-b'x)
For j=2,..,m-1,
P(Y-1=j|x) = F(-b'x+exp(b(6))+..+exp(b(5+j)))
- F(-b'x+exp(b(6))+..+exp(b(4+j)))
and
P(Y-1=m|x) = 1 - F(-b'x+exp(b(6))+..+exp(b(4+m)))
where m =8, b'x = b(1)x(1)+..+b(5)x(5), and
F(u) = 1/[1+EXP(-u)].
Newton iteration succesfully completed after 15 iterations
Last absolute parameter change = 0.0000
Last percentage change of the likelihood = -0.1018
Maximum likelihood estimation results:
Par.    ML estimate  s.e.      t-value  [p-value]  Variable
b(1)     1.015007    0.034594   29.34   [0.00000]  education level (1-7)
b(2)    -0.315749    0.115353   -2.74   [0.00620]  male/female (1/2)
b(3)     0.028351    0.004702    6.03   [0.00000]  age in years
b(4)     0.024434    0.006309    3.87   [0.00011]  experience in years
b(5)    -0.373515    0.267649   -1.40   [0.16285]  1
b(6)     0.964437    0.041202   23.41   [0.00000]
b(7)     0.825036    0.034042   24.24   [0.00000]
b(8)    -2.233505    0.187036  -11.94   [0.00000]
b(9)    -2.169395    0.184080  -11.79   [0.00000]
b(10)    0.383856    0.055109    6.97   [0.00000]
b(11)   -0.140443    0.088979   -1.58   [0.11448]
b(12)    0.587651    0.082432    7.13   [0.00000]
[The two-sided p-values are based on the normal approximation]
Log likelihood: -1.27523282560E+003
Sample size (n): 2000
Information criteria:
Akaike: 1.287232826
Hannan-Quinn: 1.299572029
Schwarz: 1.320838240
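The p-values and information criteria in this output can be reproduced with a few lines of Python. The formulas below are assumptions rather than documented EasyReg definitions: the two-sided p-value is taken from the standard normal distribution, and each criterion is scaled by the sample size. Under these assumptions the computed values agree with the reported ones up to rounding.

```python
import math

def two_sided_p_value(t_value):
    """Two-sided p-value under the normal approximation: 2 * (1 - Phi(|t|))."""
    return math.erfc(abs(t_value) / math.sqrt(2.0))

def information_criteria(log_lik, n_params, n_obs):
    """Akaike, Hannan-Quinn and Schwarz criteria, each divided by n (assumed scaling)."""
    akaike = (-2.0 * log_lik + 2.0 * n_params) / n_obs
    hannan_quinn = (-2.0 * log_lik
                    + 2.0 * n_params * math.log(math.log(n_obs))) / n_obs
    schwarz = (-2.0 * log_lik + n_params * math.log(n_obs)) / n_obs
    return akaike, hannan_quinn, schwarz

# Values copied from the output above: log-likelihood, 12 parameters, n = 2000.
aic, hq, sc = information_criteria(-1.27523282560e+003, 12, 2000)
print(aic, hq, sc)  # reported: 1.287232826, 1.299572029, 1.320838240

# p-value for b(2), using its ML estimate and standard error.
p_b2 = two_sided_p_value(-0.315749 / 0.115353)
print(p_b2)  # reported: 0.00620
```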
The parameters
where f is the density corresponding to F. Therefore, the interpretation of the effect of X on
Now estimate a multinomial logit model using the same data. Of course, the multinomial logit model does not take the ordering of Y into account, so that this model is not appropriate. Nevertheless, in doing so something interesting will happen:
Click "Yes":
The full text in this window is listed below.
What to do if you get the message that some X variables are constant
for some values of Y in the multinomial logit model?
As an example, try to estimate the following multinomial
logit model, using the cross-section data of Dutch wage earners in
the EasyReg database:
Model variables:
y = function level (1-9)
x(1) = education level (1-7)
x(2) = male/female (1/2)
x(3) = age in years
x(4) = experience in years
x(5) = 1
Available observations: t = 1 -> 2000
= Chosen.
Model:
P(y-1=0|x) = 1/[1+exp(b(1)'x)+ ..+ exp(b(m)'x)]
P(y-1=j|x) = exp(b(j)'x)P(y=0|x), j=1,..,m, where m = 8
Then you will get the message:
X(2) and X(5) are constant for Y = 5
Program aborted!
The problem is that there are no females in function level 5, so that
X(2) = 1 for all workers in function level 5. The variable X(2) is
therefore perfectly multicollinear with the intercept X(5)!
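The kind of check that triggers this message can be sketched as follows. This is not EasyReg's code: the function name and the toy data are hypothetical, with the first column playing the role of a 1/2 coded dummy and the second column the intercept.

```python
def constant_regressors_by_category(y, rows):
    """Map each value of y to the column indices that are constant within that category."""
    rows_by_category = {}
    for yi, row in zip(y, rows):
        rows_by_category.setdefault(yi, []).append(row)
    result = {}
    for category, cat_rows in rows_by_category.items():
        n_cols = len(cat_rows[0])
        result[category] = [
            c for c in range(n_cols) if len({r[c] for r in cat_rows}) == 1
        ]
    return result

# Toy data: in category 5 the dummy in column 0 never varies.
y = [4, 4, 5, 5, 6, 6]
rows = [(1, 1), (2, 1), (1, 1), (1, 1), (2, 1), (1, 1)]
constant = constant_regressors_by_category(y, rows)
print(constant)

# A category is problematic if more than one column (the intercept) is constant.
problem_categories = [cat for cat, cols in constant.items() if len(cols) > 1]
print(problem_categories)
```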
The solution is to merge function level 5 with other function
levels, for example with function levels 4 and 6, yielding a new
dependent variable with values ranging from 1 to 7. In order to do
this, we have to transform the function levels to dummy variables.
Create the following dummy variables, using the I(x=a) option
in the transformation menu:
I[function level (1-9)=1]
I[function level (1-9)=2]
I[function level (1-9)=3]
I[function level (1-9)=4]
I[function level (1-9)=5]
I[function level (1-9)=6]
I[function level (1-9)=7]
I[function level (1-9)=8]
I[function level (1-9)=9]
where I(.) is the indicator function: I(true) = 1, I(false) = 0.
Next, create the following new dependent variable, using
the linear combination option in the transformation menu:
New function level (1-7) =
1 x I[function level = 1]
+2 x I[function level = 2]
+3 x I[function level = 3]
+4 x I[function level = 4]
+4 x I[function level = 5]
+4 x I[function level = 6]
+5 x I[function level = 7]
+6 x I[function level = 8]
+7 x I[function level = 9]
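Outside EasyReg, the same recoding can be written as a weighted sum of indicator values. The sketch below mirrors the linear combination above; the names MERGE_WEIGHTS and new_function_level are hypothetical.

```python
# Weight attached to each indicator I[function level = k], as in the list above.
MERGE_WEIGHTS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 4, 6: 4, 7: 5, 8: 6, 9: 7}

def indicator(condition):
    """Indicator function: 1 if the condition is true, 0 otherwise."""
    return 1 if condition else 0

def new_function_level(level):
    """Linear combination of the nine dummy variables for one observation."""
    return sum(weight * indicator(level == k)
               for k, weight in MERGE_WEIGHTS.items())

recoded = [new_function_level(level) for level in range(1, 10)]
print(recoded)
```

Function levels 4, 5 and 6 are mapped to the same new value, so the new dependent variable takes values from 1 to 7.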
Using this new dependent variable, the multinomial logit
estimation results are:
Model variables:
y = New function level (1-7)
x(1) = education level (1-7)
x(2) = male/female (1/2)
x(3) = age in years
x(4) = experience in years
x(5) = 1
Available observations: t = 1 -> 2000
= Chosen.
Model:
P(y-1=0|x) = 1/[1+exp(b(1)'x)+ ..+ exp(b(m)'x)]
P(y-1=j|x) = exp(b(j)'x)P(y=0|x), j=1,..,m, where m = 6
Maximum likelihood estimation results:
Variable ML estimate of b(.) (t-value)
x(1)=education level (1-7) b(1,1) = 0.6617117 (5.46)
[p-value = 0.00000]
x(2)=male/female (1/2) b(1,2) = -0.2669086 (-1.02)
[p-value = 0.30564]
x(3)=age in years b(1,3) = -0.0161398 (-1.41)
[p-value = 0.15763]
x(4)=experience in years b(1,4) = 0.0489719 (2.35)
[p-value = 0.01897]
x(5)=1 b(1,5) = 0.8470765 (1.37)
[p-value = 0.17029]
x(1)=education level (1-7) b(2,1) = 1.2491677 (10.15)
[p-value = 0.00000]
x(2)=male/female (1/2) b(2,2) = -0.2660622 (-1.00)
[p-value = 0.31854]
x(3)=age in years b(2,3) = -0.0132317 (-1.13)
[p-value = 0.25919]
x(4)=experience in years b(2,4) = 0.0968024 (4.62)
[p-value = 0.00000]
x(5)=1 b(2,5) = -0.9266000 (-1.45)
[p-value = 0.14587]
x(1)=education level (1-7) b(3,1) = 1.9351154 (14.47)
[p-value = 0.00000]
x(2)=male/female (1/2) b(3,2) = -0.8985736 (-2.80)
[p-value = 0.00510]
x(3)=age in years b(3,3) = 0.0201374 (1.52)
[p-value = 0.12764]
x(4)=experience in years b(3,4) = 0.0994764 (4.48)
[p-value = 0.00001]
x(5)=1 b(3,5) = -4.5914640 (-6.18)
[p-value = 0.00000]
x(1)=education level (1-7) b(4,1) = 2.5490455 (16.76)
[p-value = 0.00000]
x(2)=male/female (1/2) b(4,2) = -1.3678589 (-2.75)
[p-value = 0.00598]
x(3)=age in years b(4,3) = 0.0453044 (2.60)
[p-value = 0.00939]
x(4)=experience in years b(4,4) = 0.0929477 (3.57)
[p-value = 0.00035]
x(5)=1 b(4,5) = -8.9553810 (-8.57)
[p-value = 0.00000]
x(1)=education level (1-7) b(5,1) = 2.8232526 (18.12)
[p-value = 0.00000]
x(2)=male/female (1/2) b(5,2) = -0.8957252 (-1.93)
[p-value = 0.05404]
x(3)=age in years b(5,3) = 0.0847234 (4.87)
[p-value = 0.00000]
x(4)=experience in years b(5,4) = 0.0526355 (1.98)
[p-value = 0.04734]
x(5)=1 b(5,5) = -12.0151756 (-11.05)
[p-value = 0.00000]
x(1)=education level (1-7) b(6,1) = 2.4593404 (14.13)
[p-value = 0.00000]
x(2)=male/female (1/2) b(6,2) = -2.3332418 (-2.20)
[p-value = 0.02763]
x(3)=age in years b(6,3) = 0.0763263 (3.54)
[p-value = 0.00041]
x(4)=experience in years b(6,4) = 0.0456722 (1.45)
[p-value = 0.14763]
x(5)=1 b(6,5) = -9.2689909 (-5.73)
[p-value = 0.00000]
[The two-sided p-values are based on the normal approximation]
Log likelihood: -2.54992886753E+003
Sample size (n): 2000
Note that the parameter