ECON0019: Quantitative Economics & Econometrics
Notes for Term 1
Dennis Kristensen
University College London
November 2, 2021
These notes are meant to accompany Wooldridge's "Introductory Econometrics" by providing
more details on the mathematical results of the book.
Part I
Simple Linear Regression (SLR)
We are interested in estimating the relationship between x and y in a given population. Suppose
the following two assumptions are satisfied:
SLR.1 In the population, the following relationship holds between x and y:
$$y = \beta_0 + \beta_1 x + u, \qquad E[u \mid x] = 0, \qquad (0.1)$$
where $\beta_0$ and $\beta_1$ are unknown parameters and u is an unobserved error term.
The error term u captures other factors, in addition to x, that influence/generate y. In
order to be able to disentangle the impact of x from these other factors, we will require u to be
mean-independent of x:
SLR.4 $E[u \mid x] = 0$.
That is, conditional on any value of x, the expected value of u in the population is 0; in other
words, x conveys no information about u on average. It is helpful here to remind ourselves what
conditional expectations are:
1 Refresher on conditional distributions
Consider two random variables, y and x. These do not have to satisfy the SLR (or any other
model). Suppose for simplicity that both are discrete-valued (all subsequent arguments and results
easily generalise to the case where they are continuously distributed). Let
$\{x_{(1)}, \ldots, x_{(K)}\}$, for some $K \geq 1$, be the possible values that x can take and
$\{y_{(1)}, \ldots, y_{(L)}\}$, for some $L \geq 1$, be the possible values of y. Now, let
$$p_{X,Y}(x_{(i)}, y_{(j)}) = \Pr(x = x_{(i)}, y = y_{(j)}), \qquad i = 1, \ldots, K; \; j = 1, \ldots, L,$$
be the joint probability function. From this we can, for example, compute the marginal distributions
of x and y,
$$p_X(x) = \sum_{j=1}^{L} p_{X,Y}(x, y_{(j)}), \qquad p_Y(y) = \sum_{i=1}^{K} p_{X,Y}(x_{(i)}, y),$$
for any given $x \in \{x_{(1)}, \ldots, x_{(K)}\}$ and $y \in \{y_{(1)}, \ldots, y_{(L)}\}$.
The conditional distribution of $y \mid x$ is defined as
$$p_{Y \mid X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)}, \qquad (1.1)$$
for any values of $(y, x)$ with $p_X(x) > 0$. The conditional distribution slices the distribution of y
up according to the value of x: $p_{Y \mid X}(y_{(j)} \mid x_{(i)})$ is the probability of observing
$y = y_{(j)}$ in the subpopulation for which $x = x_{(i)}$. If x is informative about y (that is, they
are dependent), then $p_{Y \mid X}(y \mid x) \neq p_Y(y)$ for at least some values of $(y, x)$.
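As a minimal sketch of eq. (1.1), the following Python snippet builds the marginal and conditional distributions from a small joint probability table. The joint probabilities, the supports and the function names are all made up for illustration; none of them come from the notes.

```python
from fractions import Fraction as F

# Hypothetical joint pmf p_{X,Y}(x, y) with x in {0, 1} and y in {1, 2, 3}.
# The numbers are invented for illustration and sum to one.
joint = {
    (0, 1): F(1, 10), (0, 2): F(2, 10), (0, 3): F(2, 10),
    (1, 1): F(2, 10), (1, 2): F(2, 10), (1, 3): F(1, 10),
}

x_values = sorted({x for x, _ in joint})
y_values = sorted({y for _, y in joint})


def marginal_x(x):
    """p_X(x): sum the joint pmf over all possible values of y."""
    return sum(joint[(x, y)] for y in y_values)


def marginal_y(y):
    """p_Y(y): sum the joint pmf over all possible values of x."""
    return sum(joint[(x, y)] for x in x_values)


def conditional_y_given_x(y, x):
    """p_{Y|X}(y|x) = p_{X,Y}(x, y) / p_X(x), as in eq. (1.1)."""
    return joint[(x, y)] / marginal_x(x)


for x in x_values:
    row = [conditional_y_given_x(y, x) for y in y_values]
    print(f"x = {x}: p(y|x) = {[str(p) for p in row]}, sum = {sum(row)}")

print("p_Y:", [str(marginal_y(y)) for y in y_values])
```

In this made-up table each conditional distribution sums to one, and $p_{Y \mid X}(1 \mid 0) = 1/5$ differs from $p_Y(1) = 3/10$, so x is informative about y.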
In this course we are generally interested in modelling conditional distributions because we are
interested in causal effects. So very often we will write down a model for $p_{Y \mid X}(y \mid x)$. Any
given model of $p_{Y \mid X}(y \mid x)$ can then be used to, for example, make statements about the
marginal distribution of y since
$$p_Y(y) = \sum_{i=1}^{K} p_{Y \mid X}(y \mid x_{(i)}) \, p_X(x_{(i)}). \qquad (1.2)$$
Example: Male and Female wages. Consider the population of UK adults. Let $y \in \{0, 1, 2, \ldots, L\}$
be the log weekly earnings in pounds of a UK adult (binned so that it is discrete, with the
number of bins L being some very large number) in a given year, and let $x \in \{0, 1\}$ be a
dummy variable indicating the gender of that same individual: x = 0 if the individual is male,
while x = 1 if the individual is female.
In this case, $p_{Y \mid X}(y \mid x = 0)$ is the UK log-wage distribution for men and
$p_{Y \mid X}(y \mid x = 1)$ is the log-wage distribution for women. If men and women have different
earnings distributions, then $p_{Y \mid X}(y \mid x = 0) \neq p_{Y \mid X}(y \mid x = 1)$ for at least some values of y.
The overall UK wage distribution is
$$p_Y(y) = p_X(x = 0) \, p_{Y \mid X}(y \mid x = 0) + p_X(x = 1) \, p_{Y \mid X}(y \mid x = 1),$$
where $p_X(x = 0)$ and $p_X(x = 1)$ are the proportions of men and women in the UK, respectively.
In 2018, the UK population was 66.78 million, with 33.82 million females and 32.98 million
males. Thus, if the year of interest is 2018,
$$p_X(x = 0) = \frac{32.98}{66.78} \approx 0.49, \qquad p_X(x = 1) = \frac{33.82}{66.78} \approx 0.51. \qquad (1.3)$$
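To see how eq. (1.2) combines the two conditional wage distributions, here is a small Python sketch. It uses the rounded population shares 0.49 and 0.51 from eq. (1.3); the three earnings bins and the conditional probabilities are hypothetical numbers chosen purely for illustration, not actual UK wage data.

```python
from fractions import Fraction as F

# Population shares from eq. (1.3): x = 0 (men), x = 1 (women).
p_x = {0: F(49, 100), 1: F(51, 100)}

# Hypothetical conditional pmfs p_{Y|X}(y|x) over three earnings bins.
# These numbers are invented for illustration only.
p_y_given_x = {
    0: {1: F(2, 10), 2: F(3, 10), 3: F(5, 10)},
    1: {1: F(3, 10), 2: F(4, 10), 3: F(3, 10)},
}


def marginal_y(y):
    """p_Y(y) = sum over x of p_{Y|X}(y|x) * p_X(x), as in eq. (1.2)."""
    return sum(p_y_given_x[x][y] * p_x[x] for x in p_x)


p_y = {y: marginal_y(y) for y in (1, 2, 3)}
for y, p in p_y.items():
    print(f"p_Y({y}) = {float(p):.3f}")
print("total:", sum(p_y.values()))
```

The overall distribution is a weighted average of the two group distributions, with weights given by the population shares, so it again sums to one.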
For any given value of x, $p_{Y \mid X}(y \mid x)$ is a probability distribution. In particular, we can
compute means and variances, etc. For example, the conditional mean is defined as
$$E[y \mid x] = \sum_{j=1}^{L} y_{(j)} \, p_{Y \mid X}(y_{(j)} \mid x).$$
More generally, for any function $f(y)$,
$$E[f(y) \mid x] = \sum_{j=1}^{L} f(y_{(j)}) \, p_{Y \mid X}(y_{(j)} \mid x).$$
For example, the conditional variance is given by
$$\mathrm{Var}(y \mid x) = E\left[ (y - E[y \mid x])^2 \mid x \right] = E[y^2 \mid x] - E[y \mid x]^2.$$
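The following Python sketch computes a conditional mean and checks that the two expressions for the conditional variance agree. The conditional probabilities and the helper names are hypothetical and serve only to illustrate the formulas above.

```python
from fractions import Fraction as F

# Hypothetical conditional pmfs p_{Y|X}(y|x) for x in {0, 1}, y in {1, 2, 3}.
p_y_given_x = {
    0: {1: F(1, 5), 2: F(3, 10), 3: F(1, 2)},
    1: {1: F(3, 10), 2: F(2, 5), 3: F(3, 10)},
}


def cond_expectation(f, x):
    """E[f(y)|x]: sum of f(y_j) * p_{Y|X}(y_j|x) over all values y_j."""
    return sum(f(y) * p for y, p in p_y_given_x[x].items())


def cond_mean(x):
    return cond_expectation(lambda y: y, x)


def cond_var_definition(x):
    """Var(y|x) = E[(y - E[y|x])^2 | x]."""
    m = cond_mean(x)
    return cond_expectation(lambda y: (y - m) ** 2, x)


def cond_var_shortcut(x):
    """Var(y|x) = E[y^2|x] - E[y|x]^2."""
    return cond_expectation(lambda y: y ** 2, x) - cond_mean(x) ** 2


for x in p_y_given_x:
    print(f"x = {x}: mean = {cond_mean(x)}, "
          f"var (definition) = {cond_var_definition(x)}, "
          f"var (shortcut) = {cond_var_shortcut(x)}")
```

Exact fractions are used so that the two variance expressions can be compared without floating-point rounding.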
Example: Male and Female wages. In the above wage example,
$$E[y \mid x = 0] = \text{the average log-wage for men},$$
$$E[y \mid x = 1] = \text{the average log-wage for women},$$
and
$$\mathrm{Var}(y \mid x = 0) = \text{the spread/variance of log-wages for men},$$
$$\mathrm{Var}(y \mid x = 1) = \text{the spread/variance of log-wages for women}.$$
1.1 Some useful rules for computing conditional expectations
Recall that for any two constants a and b,
$$E[ay + b] = aE[y] + b.$$
When we compute expectations conditional on x, we can treat x as a constant (we keep it fixed at
the particular value). Thus, the following rule applies: For any two functions $a(x)$ and $b(x)$,
$$E[a(x) y + b(x) \mid x] = a(x) E[y \mid x] + b(x).$$
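A quick numerical check of this rule can be sketched in Python. The conditional probabilities and the functions a(x) = 2 + x and b(x) = 10x are arbitrary choices made for illustration; they are not taken from the notes.

```python
from fractions import Fraction as F

# Hypothetical conditional pmfs p_{Y|X}(y|x) for x in {0, 1}, y in {1, 2, 3}.
p_y_given_x = {
    0: {1: F(1, 5), 2: F(3, 10), 3: F(1, 2)},
    1: {1: F(3, 10), 2: F(2, 5), 3: F(3, 10)},
}


def cond_expectation(g, x):
    """E[g(y, x)|x]: x is held fixed while we average over y."""
    return sum(g(y, x) * p for y, p in p_y_given_x[x].items())


# Arbitrary illustrative functions of x.
def a(x):
    return 2 + x


def b(x):
    return 10 * x


results = {}
for x in p_y_given_x:
    # Left-hand side: average a(x) * y + b(x) directly over y.
    lhs = cond_expectation(lambda y, x: a(x) * y + b(x), x)
    # Right-hand side: pull a(x) and b(x) outside the expectation.
    rhs = a(x) * cond_expectation(lambda y, x: y, x) + b(x)
    results[x] = (lhs, rhs)
    print(f"x = {x}: lhs = {lhs}, rhs = {rhs}")
```

Both sides coincide for every value of x because, once x is fixed, a(x) and b(x) are just numbers.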
Example: Male and Female wages. Suppose $(y, x)$ satisfies SLR.1-SLR.4 with $\beta_0 = 3$ and
$\beta_1 = -0.25$. What are the average log-wages for men and women, respectively? Taking
conditional expectations on both sides of eq. (0.1) and then using the above rules, we obtain
$$E[y \mid x] = E[3 - 0.25x + u \mid x] = 3 - 0.25x + E[u \mid x] = 3 - 0.25x,$$
where the last equality uses SLR.4. That is, $E[u \mid x = 0] = E[u \mid x = 1] = 0$.[1] Thus,
$$E[y \mid x = 0] = 3 - 0.25 \times 0 = 3,$$
$$E[y \mid x = 1] = 3 - 0.25 \times 1 = 2.75.$$
So in this example average log-earnings for men are higher than those for women.
[1] Is it reasonable that SLR.4 holds for this regression? To answer this, you should first determine which factors
affect wages besides gender. Next, you should contemplate whether these factors are mean-independent of gender.
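The calculation in this example can be reproduced with a short Python sketch. The coefficient values 3 and -0.25 come from the example; the error distributions are hypothetical and were chosen only so that the conditional mean of u is zero in both groups, even though the two distributions differ.

```python
from fractions import Fraction as F

beta0, beta1 = F(3), F(-1, 4)  # values from the example: 3 and -0.25

# Hypothetical error distributions p(u|x), keyed by x, mapping u -> probability.
# They differ across groups, but both have conditional mean zero.
p_u_given_x = {
    0: {F(-1): F(1, 2), F(1): F(1, 2)},
    1: {F(-2): F(1, 3), F(1): F(2, 3)},
}


def cond_mean_u(x):
    """E[u|x]: should be zero for every x if mean-independence holds."""
    return sum(u * p for u, p in p_u_given_x[x].items())


def cond_mean_y(x):
    """E[y|x] computed directly from y = beta0 + beta1 * x + u."""
    return sum((beta0 + beta1 * x + u) * p for u, p in p_u_given_x[x].items())


for x, label in ((0, "men"), (1, "women")):
    print(f"{label}: E[u|x={x}] = {cond_mean_u(x)}, "
          f"E[y|x={x}] = {float(cond_mean_y(x))}")
```

The sketch illustrates that mean-independence restricts only the conditional mean of u: the error may have a different distribution for each value of x as long as its average is zero in each subpopulation.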
Recall eq. (1.2), which relates the marginal distribution of y to its conditional distribution.
Similarly, we can link the unconditional mean of y, $E[y]$, to the conditional ones, $E[y \mid x]$:
Theorem 1.1 (Law of iterated expectations) For any two random variables y and x, the
following hold: