Intermediate Econometrics: Assignment 1
Simple Regression Model
National School of Development
March 12, 2024
1 Theoretical Deduction
Consider the simple linear regression model:
y = β0 + β1x + u, (1)
where y is the dependent variable, x is the independent variable, and u is the error term, which collects the unobservable factors. β0 is the intercept parameter and β1 is the slope parameter.
Q1. What is the Chinese translation of the concept error term?
We want to estimate the parameters β0 and β1 in this model.
1.1 Method of Moments
Let’s obtain the explicit form of the estimators of these two parameters. The main assumption we need is E(u|x) = 0.
Q2. Let u and x be random variables in (1). Show that:
E(u|x) = 0 ⇒ E(u) = 0. (2)
Q3. True or False: If Cov(x, u) = 0, then E(u|x) = 0 for every x.
Using condition (2), we can obtain another condition.
Q4. Show that under condition (2), we have:
Cov(x, u) = E(xu) = 0. (3)
Equation (1) is the population version of the regression. Let’s rewrite it in sample version:
yi = β0 + β1xi + ui, (4)
where {(xi, yi) : i = 1, 2, ..., n} is a random sample of size n from the population.
Rewrite condition (2) and the second equality of (3) into sample form:
(1/n) Σ ui = 0, (5)
(1/n) Σ xiui = 0, (6)
where Σ denotes summation over i = 1, ..., n throughout. We call equations (2) and (3) the population moment conditions, and equations (5) and (6) the sample moment conditions. From (4), the error term can be written as
ui = yi − β0 − β1xi. (7)
Plugging this into (6), we obtain
(1/n) Σ xi(yi − β0 − β1xi) = 0. (8)
The explicit form of the estimator is some algebraic expression in the observed data tuples (xi, yi), i = 1, 2, ..., n. We use symbols with a hat on top to denote estimators, i.e., β̂1 is the estimator of β1, β̂0 is the estimator of β0, etc.
Q5. Define ȳ = (1/n) Σ yi and x̄ = (1/n) Σ xi. Use the sample moment conditions to show that:
β̂0 = ȳ − β̂1x̄. (9)
Q6. Use the sample moment conditions and equation (9) to show that:
β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)². (10)
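As a sanity check, the estimators (9) and (10) can be computed on a small made-up sample and verified against the sample moment conditions (the residuals sum to zero and are orthogonal to xi). A minimal Python sketch, with arbitrary illustrative data:

```python
# Made-up data for illustration only; any sample works since the
# moment conditions hold by construction of the estimators.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Equation (10): slope estimator
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
# Equation (9): intercept estimator
b0 = ybar - b1 * xbar

# Residuals u_i = y_i - b0 - b1 * x_i
u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Both sample moment conditions hold (up to floating-point error)
print(abs(sum(u)) < 1e-9)                                # -> True
print(abs(sum(xi * ui for xi, ui in zip(x, u))) < 1e-9)  # -> True
```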
1.2 Ordinary Least Squares
Now forget about equations (9) and (10). Rather than MoM, we want to use another method, OLS, to recalculate the explicit form of the estimators of the parameters β0 and β1. Define the fitted value of yi as
ŷi = β̂0 + β̂1xi. (11)
Note that we do not yet know how to express β̂0 and β̂1.
Define the residual of regression (1) to be
ûi = yi − ŷi. (12)
Q7. What is the Chinese translation of the concept residual?
Q8. Can you draw a scatter plot with a fitted line, and depict ŷi and ûi in it?
Define the Residual Sum of Squares (SSR) of the regression to be
SSR = Σ ûi². (13)
So we have
SSR = Σ ûi² = Σ (yi − ŷi)²  [by (12)]
            = Σ (yi − β̂0 − β̂1xi)².  [by (11)]  (14)
Define the Ordinary Least Squares method to be choosing β̂0 and β̂1 to minimize SSR. This is an optimization problem, which we can solve by differentiation. Assume the function to be optimized, SSR(b0, b1), is well-defined so that we can use the first order conditions (F.O.C.s) to obtain the explicit estimators.
The optimization problem can be written as
min over (b0, b1) of Σ (yi − b0 − b1xi)². (15)
Q9. Write out the first order conditions of this optimization problem. (For simplicity, the teaching assistants have listed them for you here! But you should know how to derive them by differentiation.) Solve for the OLS estimators.
Soln. The F.O.C.s of this problem are
(w.r.t. b0)  Σ [−2(yi − b0 − b1xi)] = 0, (16)
(w.r.t. b1)  Σ [−2(yi − b0 − b1xi)xi] = 0, (17)
where "w.r.t." stands for "with respect to". Therefore, the OLS estimators are ...
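For reference, one way to finish the algebra from the F.O.C.s is sketched below in LaTeX notation (a sketch, not the required submission; it should reproduce equations (9) and (10)):

```latex
% Divide the first F.O.C. by -2n:
%   \bar{y} - b_0 - b_1\bar{x} = 0
%   \implies \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}   (equation (9)).
% Substitute b_0 = \bar{y} - b_1\bar{x} into the second F.O.C.:
\sum_{i=1}^{n} x_i \bigl( y_i - \bar{y} - b_1 (x_i - \bar{x}) \bigr) = 0
\implies \hat{\beta}_1
  = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.
% The last step uses the identity
%   \sum_i x_i (y_i - \bar{y}) = \sum_i (x_i - \bar{x})(y_i - \bar{y}),
% and likewise for the denominator.
```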
Q10. Compare your answer with equations (9) and (10). Are they the same? Did you use the assumption E(u|x) = 0 to obtain OLS estimators?
Q11. 1. Define x′i = axi + b and regress yi on x′i by yi = β′0 + β′1x′i + u′i. What is the estimate of β′1?
2. Define y′i = ayi + b and regress y′i on xi by y′i = β′0 + β′1xi + u′i. What is the estimate of β′1?
3. Replace yi with ln(yi). Assume yi > 0 for i = 1, 2, ..., n. Regress ln(yi) on xi by ln(yi) = β′0 + β′1xi + u′i. What is the economic interpretation of β′1?
4. Replace yi with ln(yi) and replace xi with ln(xi). Assume yi > 0 and xi > 0 for i = 1, 2, ..., n. Regress ln(yi) on ln(xi) by ln(yi) = β′0 + β′1 ln(xi) + u′i. What is the economic interpretation of β′1?
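A quick numerical check of parts 1 and 2 (an illustration only; the values a = 2, b = 3 and the data are arbitrary, and deriving the results algebraically is the exercise):

```python
# Rescaling the regressor divides the slope by a; rescaling the
# dependent variable multiplies it by a.  Data are made up.
def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
           sum((xi - xbar) ** 2 for xi in x)

x = [1.0, 2.0, 3.0, 4.0]
y = [1.5, 3.1, 4.4, 6.2]
a, b = 2.0, 3.0

b1      = ols_slope(x, y)
b1_xnew = ols_slope([a * xi + b for xi in x], y)  # regress y on x' = a*x + b
b1_ynew = ols_slope(x, [a * yi + b for yi in y])  # regress y' = a*y + b on x

print(abs(b1_xnew - b1 / a) < 1e-12)  # -> True: slope becomes beta1 / a
print(abs(b1_ynew - a * b1) < 1e-12)  # -> True: slope becomes a * beta1
```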
1.3 Some algebraic properties of simple linear regression
Now, define zero conditional mean condition to be
E(u|x) = 0.
Recall the OLS estimators of simple linear regression. Use data {(xi, yi) : i = 1, ..., n} to fit the model, and obtain estimates
β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²,
β̂0 = ȳ − β̂1x̄.
Recall that, when deriving the OLS estimators, we did not assume zero conditional mean (on the contrary, at the start of deriving the MoM estimators, we made this assumption). The assumptions we needed for deriving the OLS estimators were trivial:
(i) The function to be optimized in the optimization problem (15) should be well-defined;
(ii) Σ(xi − x̄)² ≠ 0, i.e., the xi are not all identical.
Also, while deriving the algebraic properties of the OLS estimators, we do not need the zero conditional mean condition either.
Q12. Using the definition of the OLS residual ûi, show that:
Σ ûi = 0. (18)
Q13. Define the sample covariance estimator between the regressor xi and the OLS residual ûi to be
Ĉov(xi, ûi) ≡ (1/n) Σ xiûi − ((1/n) Σ xi)((1/n) Σ ûi). (19)
Using the F.O.C. w.r.t. b1 of the optimization problem (15), together with conclusion (18) obtained just above, show that:
Ĉov(xi, ûi) = 0 and Σ xiûi = 0. (20)
Equations (18) and (20) are important properties and will be needed later.
Q15. Define ŷ̄ = (1/n) Σ ŷi, the sample mean of the fitted values. Show that:
ŷ̄ = ȳ. (21)
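A quick numerical illustration of the properties in Q12, Q13 and Q15 (this is only a sanity check on made-up data, not a proof):

```python
# Made-up data; any sample works, since (18), (20) and (21) hold
# algebraically for OLS fits.
x = [0.5, 1.0, 2.0, 3.5, 4.0]
y = [1.2, 1.9, 3.1, 5.0, 5.4]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]             # fitted values
uhat = [yi - fi for yi, fi in zip(y, yhat)]   # residuals

print(abs(sum(uhat)) < 1e-9)                                # (18) -> True
print(abs(sum(xi * ui for xi, ui in zip(x, uhat))) < 1e-9)  # (20) -> True
print(abs(sum(yhat) / n - ybar) < 1e-9)                     # (21) -> True
```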
Define
lxx = Σ(xi − x̄)², lyy = Σ(yi − ȳ)², lxy = Σ(xi − x̄)(yi − ȳ).
So we have
β̂1 = lxy / lxx. (22)
Q16. Show that
lxy = Σ(xi − x̄)yi = Σ xi(yi − ȳ).
Q17. Define the (Pearson) correlation coefficient
Corr(xi, yi) = Cov(xi, yi) / (√Var(xi) · √Var(yi)).
Define the sample correlation coefficient
Ĉorr(xi, yi) = lxy / √(lxx · lyy).
Define R-squared to be
R² = Σ(ŷi − ȳ)² / Σ(yi − ȳ)².
Show that:
R² = (Ĉorr(xi, yi))². (23)
Sketchy Hints. 1. Because ŷi is a linear function of xi, the squared sample correlation can be written using the fitted values:
(Ĉorr(xi, yi))² = [Σ(yi − ȳ)(ŷi − ŷ̄)]² / [Σ(yi − ȳ)² · Σ(ŷi − ŷ̄)²].
2. Focus on the numerator part:
[Σ(yi − ȳ)(ŷi − ŷ̄)]².
Do the identical transformation
[Σ(yi − ŷi + ŷi − ȳ)(ŷi − ŷ̄)]².
3. Use conclusions (21), (18), and (20) to show that the numerator can be collapsed into
[Σ(ŷi − ȳ)²]².
Q18. 1. Fit another linear regression
xi = δ0 + δ1yi + ei. (24)
Show that δ̂1 = lxy / lyy, where δ̂1 is the OLS estimate of δ1.
2. Define the R-squared of regression (24) to be R²′. R² is the R-squared in (23). Show that:
R² = R²′ = β̂1δ̂1.
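A numerical sketch of the identities in Q17 and Q18 (arbitrary made-up data; not a substitute for the proofs):

```python
# Verify numerically that R-squared equals the squared sample
# correlation, and that it also equals the product of the two slopes
# from the regressions y-on-x and x-on-y.  Data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.3, 2.8, 4.5, 4.9, 6.4]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
lxx = sum((xi - xbar) ** 2 for xi in x)
lyy = sum((yi - ybar) ** 2 for yi in y)
lxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = lxy / lxx              # slope of y on x
d1 = lxy / lyy              # slope of x on y, as in Q18
r2 = lxy ** 2 / (lxx * lyy) # squared sample correlation

yhat = [ybar + b1 * (xi - xbar) for xi in x]
r2_def = sum((fi - ybar) ** 2 for fi in yhat) / lyy  # R^2 from its definition

print(abs(r2 - r2_def) < 1e-12)  # -> True: R^2 = (corr)^2, eq. (23)
print(abs(r2 - b1 * d1) < 1e-12) # -> True: R^2 = beta1_hat * delta1_hat
```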
1.4 Is the OLS estimator appropriate?
Now we have obtained the explicit estimators for the simple linear regression. Note that β̂0 and β̂1 are just two numbers so far. They are obtained by some "+ − × ÷" relationships between observable data points, and have no statistical implications for the real world.
To make use of the estimators, we need to derive some of their properties. We should not treat β̂0 and β̂1 simply as numbers in this section. We will regard them as functions of random variables.
We need to know how appropriate the estimators are. Three properties are used to define the appropriateness:
1. Unbiasedness
2. Consistency
3. Efficiency
You may find the definitions of unbiasedness in Slides Chapter 3 and efficiency (denoted as ”best”) in Slides Chapter 4.
Q19. Under the assumption E(ui|xi) = 0, show that the OLS estimator β̂1 is unbiased. That is, show that
E(β̂1) = β1.
(Hint: You may refer to Slides Chapter 3, p. 68.)
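A Monte Carlo illustration of unbiasedness (the simulation design below, with β0 = 1, β1 = 2, n = 50 and 2000 replications, is our own made-up example, not part of the assignment):

```python
# Draw many samples from a model where E(u|x) = 0 holds, estimate
# beta1 by OLS in each, and check that the average estimate is
# close to the true beta1.
import random

random.seed(0)
beta0, beta1, n, reps = 1.0, 2.0, 50, 2000
estimates = []
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    u = [random.gauss(0, 1) for _ in range(n)]  # independent of x
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    estimates.append(b1)

# The mean of the 2000 estimates should be close to beta1 = 2
print(abs(sum(estimates) / reps - beta1) < 0.05)
```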
2 Stata Exercises
1. Inverse regression.
(1) Run the following code in Stata.

set obs 100
gen z = rnormal()
gen u1 = rnormal()
gen u2 = rnormal()
gen x = z + 0.4*u1
gen y = z + 0.4*u2
scatter y x
Now you have generated 100 pairs. You can see that y and x are distributed near the line y = x.
(2) Regress y on x.

reg y x

Interpret the results. (If x increases by one unit, how much of an increase do you observe in y?)
(3) Regress x on y.

reg x y

Interpret the results. (If y increases by one unit, how much of an increase do you observe in x?)
(4) Compare these two slope coefficients. Are these results consistent? How do you interpret the results? You can change the sample size and run the regressions again to confirm your claim.
2. The following questions aim to provide a taste of the estimation and inference of a multiple regression research design. You may leave this part blank at submission and come back after March 19, 2024, when we will learn multiple regression (inference, part 1). No points will be deducted for missing answers.
(1) Adventure (male) and Angel (female) started a business which helps others to illegally cross the border and sneak into BD Island. They took advantage of loopholes in the access system of BD Island, obtained multiple fake identity cards, and planned to smuggle a group of wanderers to BD Island this Sunday. The security department of BD Island learned about this information in advance and decided to strengthen their vigilance. Now, the A-A team is deciding whether to continue this smuggling operation. Please define relevant variables and use a regression model to describe the A-A team's action strategy.
(Hints: What is the dependent variable? What are potential independent variables?)
(2) Now re-consider the previous setting from a provincial perspective. Please define relevant variables and use a regression model to describe potential factors contributing to provincial economic crime rates.
(Hints: The dependent variable now is the provincial economic crime rate. What are potential independent variables? What are the expected signs of the coefficients?)