Date: 2019-10-30 10:21

1 Homework 6: Multivariate Regression

1.1 Purpose

Homework 6 is meant to give you some practice on understanding what can go

wrong with multivariate regression.

1.2 What needs to be returned?

• Please upload a typed out solution for the following questions to CourseWorks

before class starts.

1.3 Math to Code

1.3.1 Q1

Define a random vector with 3 random variables:

• X ∼ Normal(0, 10)

• Y ∼ Exp(λ = 0.1)

• Z = Y + 2 ∗ X + ε, where ε ∼ Unif[−5, 5]

Please assume that X, Y, and ε are all independent of one another.

Please calculate the theoretical values of Cov((X, Y, Z)), i.e. the 3 by 3 covariance matrix of the random vector (X, Y, Z).
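
A minimal sketch of how the entries could be assembled from standard variance formulas. It assumes that Normal(0, 10) means variance 10 (not standard deviation 10) and that λ is the rate of the exponential; neither convention is stated in the text, and the variable names are illustrative.

```python
# ASSUMPTION: Normal(0, 10) is read as mean 0, variance 10.
var_x = 10.0                    # use 100.0 if 10 is meant as the standard deviation
# ASSUMPTION: lambda is the rate, so Var(Y) = 1 / lambda^2.
lam = 0.1
var_y = 1.0 / lam**2
a, b = -5.0, 5.0
var_eps = (b - a) ** 2 / 12.0   # variance of Unif[a, b]

# Z = Y + 2X + eps with X, Y, eps independent, so by bilinearity of covariance:
cov_xy = 0.0
cov_xz = 2.0 * var_x            # Cov(X, Y + 2X + eps) = 2 Var(X)
cov_yz = var_y                  # Cov(Y, Y + 2X + eps) = Var(Y)
var_z = var_y + 4.0 * var_x + var_eps

sigma = [
    [var_x, cov_xy, cov_xz],
    [cov_xy, var_y, cov_yz],
    [cov_xz, cov_yz, var_z],
]
for row in sigma:
    print([round(v, 3) for v in row])
```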

1.3.2 Q2, numerically approximating covariances

Please test out 2 sample sizes, 100 and 10000 to numerically approximate the 3

by 3 covariance matrix from Q1 via simulation.

You should use the sample covariance to approximate the theoretical covariance

matrix, i.e. Σˆn ≈ Σ. Σˆn is the sample covariance based on sample size n.

Create 200 simulations for each sample size to approximate the covariance

matrix above (i.e. you would have 200 covariance estimates for each sample size).

If the theoretical covariance matrix is Σ and the estimate is Σˆ, define the error

as $\|\Sigma - \hat{\Sigma}\|_2 = \|D\|_2 = \sqrt{\sum_{i=1}^{3} \sum_{j=1}^{3} D_{i,j}^2}$, where D = Σ − Σˆ. This is called the Frobenius norm

for the matrix D. Please report the 2.5 to 97.5 percentile values of the Frobenius

norm, across the 200 simulations for each sample size. Please comment on the

sample size’s effect on the accuracy of the numerical approximation of Σ.
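
One possible way to set up this simulation with NumPy is sketched below. The function names are made up for illustration, and the code assumes Normal(0, 10) means variance 10 and that λ is the exponential rate.

```python
import numpy as np

# ASSUMPTIONS: Normal(0, 10) has variance 10; lambda = 0.1 is the rate of Exp.
VAR_X, LAM, HALF_WIDTH = 10.0, 0.1, 5.0

def theoretical_cov():
    var_y = 1.0 / LAM**2
    var_eps = (2 * HALF_WIDTH) ** 2 / 12.0
    return np.array([
        [VAR_X, 0.0, 2 * VAR_X],
        [0.0, var_y, var_y],
        [2 * VAR_X, var_y, var_y + 4 * VAR_X + var_eps],
    ])

def sample_cov(n, rng):
    x = rng.normal(0.0, np.sqrt(VAR_X), size=n)
    y = rng.exponential(scale=1.0 / LAM, size=n)   # NumPy takes scale = 1 / rate
    eps = rng.uniform(-HALF_WIDTH, HALF_WIDTH, size=n)
    z = y + 2 * x + eps
    return np.cov(np.vstack([x, y, z]))            # rows are variables -> 3 x 3

def frobenius_errors(n, n_sims=200, seed=0):
    rng = np.random.default_rng(seed)
    sigma = theoretical_cov()
    return np.array([
        np.linalg.norm(sample_cov(n, rng) - sigma, ord="fro")
        for _ in range(n_sims)
    ])

for n in (100, 10000):
    lo, hi = np.percentile(frobenius_errors(n), [2.5, 97.5])
    print(f"n={n}: 2.5th percentile={lo:.2f}, 97.5th percentile={hi:.2f}")
```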

1.3.3 Q3, a common abuse of the word "sample size"

In this question, the word "sample size" is used in 2 different ways, a practice that is

common and confusing.

In the regression setting, we often describe Y = Xβ + ε, where Cov(ε|X) = σ²I

is an n × n matrix. This σ²I is the theoretical covariance.

In the case where n = 20, i.e. the sample size of the regression is 20, please

write the code that would numerically approximate the covariance

matrix σ²I, using the sample size with the smaller error from Q2, by

simulating different ε vectors. Please set σ² = 4. I’m intentionally not

prescribing the distribution of ε; choose your favorite distribution. :)
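
A sketch of the two "sample sizes" at play: each simulated ε vector has length 20 (the regression sample size), while the number of simulated vectors plays the role of the sample size from Q2. The code assumes that 10000 is the size with the smaller error and picks Normal errors purely as an example; the constant names are illustrative.

```python
import numpy as np

N_REG = 20        # regression sample size: length of each epsilon vector
N_DRAWS = 10000   # ASSUMPTION: the Q2 sample size with the smaller error
SIGMA2 = 4.0

rng = np.random.default_rng(0)
# ASSUMPTION: Normal errors; any mean-zero distribution with variance 4 would do.
eps = rng.normal(0.0, np.sqrt(SIGMA2), size=(N_DRAWS, N_REG))

# Each row is one simulated epsilon vector, each column one of its 20 coordinates.
cov_hat = np.cov(eps, rowvar=False)                       # 20 x 20 estimate
error = np.linalg.norm(cov_hat - SIGMA2 * np.eye(N_REG), ord="fro")
print(cov_hat.shape, round(float(error), 3))
```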

1.3.4 Q4

Let there be 20 samples. Let X1 ∼ Bernoulli(0.3), let X2 ∼ Unif[−10, 10], let

X3 ∼ Normal(100, 10), let X4 = 2 ∗ X1 − X2 + 0.3X3, let X5 = 1 − X1, and

finally define X0 as the constant feature of 1’s. Please examine the eigenvalues

for the matrix XᵀX with the following definitions of X and report whether

(XᵀX)⁻¹ exists.

Note, the notation below indicates combining the vectors by columns
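
The specific definitions of X do not appear in this text, so the sketch below uses three hypothetical column combinations only to illustrate the eigenvalue check. It also assumes Normal(100, 10) means variance 10. The relative tolerance used to decide that an eigenvalue is numerically zero is a judgment call.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x0 = np.ones(n)
x1 = rng.binomial(1, 0.3, size=n).astype(float)
x2 = rng.uniform(-10, 10, size=n)
x3 = rng.normal(100, np.sqrt(10), size=n)   # ASSUMPTION: 10 is the variance
x4 = 2 * x1 - x2 + 0.3 * x3
x5 = 1 - x1

def gram_eigenvalues(*cols):
    """Eigenvalues of X^T X (ascending) for X built by stacking columns."""
    X = np.column_stack(cols)
    return np.linalg.eigvalsh(X.T @ X)

def is_invertible(eigs, rel_tol=1e-10):
    """Treat X^T X as invertible if its smallest eigenvalue is not ~0."""
    return bool(eigs[0] > rel_tol * eigs[-1])

# HYPOTHETICAL designs; the assignment's own definitions are not shown here.
designs = {
    "[X0, X1, X2, X3]": (x0, x1, x2, x3),
    "[X0, X1, X2, X3, X4]": (x0, x1, x2, x3, x4),   # X4 is a combination of X1..X3
    "[X0, X1, X5]": (x0, x1, x5),                   # X5 = X0 - X1
}
results = {}
for name, cols in designs.items():
    eigs = gram_eigenvalues(*cols)
    results[name] = is_invertible(eigs)
    print(name, "smallest eigenvalue:", eigs[0], "invertible:", results[name])
```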

1.4 Simultaneous Inference

1.4.1 Q5

Imagine Z ∼ Binomial(n = 100, p = 0.05). Please report a 95% prediction

interval for Z. Note that, by convention, prediction intervals are centered on the expected

value, but this is not technically required.

A 95% prediction interval is any interval that, when predicting the

value of Z, has a 95% chance of containing Z.
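
One way to obtain such an interval is to cut off at most 2.5% of probability in each tail of the exact binomial distribution. The sketch below does that with the standard library only; the function names are illustrative. Because Z is discrete, the coverage will generally exceed 95% rather than equal it, and this equal-tailed choice is only one of many valid intervals.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def equal_tailed_interval(n, p, level=0.95):
    """Return (lo, hi, coverage) with P(lo <= Z <= hi) >= level."""
    alpha = 1.0 - level
    cdf, lo, hi = 0.0, None, None
    for k in range(n + 1):
        cdf += binom_pmf(k, n, p)
        if lo is None and cdf >= alpha / 2:
            lo = k                      # P(Z < lo) is below alpha / 2
        if cdf >= 1 - alpha / 2:
            hi = k                      # P(Z > hi) is at most alpha / 2
            break
    coverage = sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))
    return lo, hi, coverage

lo, hi, coverage = equal_tailed_interval(100, 0.05)
print(f"interval: [{lo}, {hi}], coverage: {coverage:.4f}")
```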

1.4.2 Q6

Let the sample size be 1000 and Y ∼ Normal(0, 10), then create 99 random features

that are completely uncorrelated with Y . Please regress Y on these features and

report the number of significant features using pointwise hypothesis

tests, i.e. $|\hat{\beta}_i / SE_{\hat{\sigma}^2}(\hat{\beta}_i)| \geq t(n - p, 97.5)$ would identify a significant feature.

Recall that $SE_{\hat{\sigma}^2}(\hat{\beta}_i) = \sqrt{[Cov(\hat{\beta}|X)]_{i,i}}$, where we use $\hat{\sigma}^2$ to approximate

$\sigma^2$. Since you have the intercept, there should be a total of 100 tests being

performed.
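
A sketch of the pointwise tests using NumPy. Several choices here are assumptions rather than part of the assignment: Normal(0, 10) is read as variance 10, the random features are independent standard normals, and the t critical value is approximated by the standard normal quantile (with n − p = 900 degrees of freedom the two are close; `scipy.stats.t.ppf(0.975, 900)` would give the exact value).

```python
import numpy as np
from statistics import NormalDist

def ols_t_stats(X, y):
    """Return beta_hat_i / SE(beta_hat_i) for every column of X."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)            # estimate of sigma^2
    se = np.sqrt(sigma2_hat * np.diag(xtx_inv))     # sqrt of diag of Cov(beta_hat | X)
    return beta_hat / se

rng = np.random.default_rng(0)
n, n_features = 1000, 99
y = rng.normal(0.0, np.sqrt(10), size=n)            # ASSUMPTION: variance 10
features = rng.normal(size=(n, n_features))         # ASSUMPTION: independent N(0, 1) noise
X = np.column_stack([np.ones(n), features])         # intercept + 99 features = 100 tests

# ASSUMPTION: normal quantile used in place of t(n - p, 97.5).
crit = NormalDist().inv_cdf(0.975)
t_stats = ols_t_stats(X, y)
n_significant = int(np.sum(np.abs(t_stats) >= crit))
print("significant at the pointwise 5% level:", n_significant, "of", X.shape[1])
```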

1.4.3 Q7

Continuing from Q6, let us adjust the problem by using the Bonferroni correction

to perform simultaneous inference. Please write the code that would

numerically show that the false positive rate from Bonferroni is at most 5%

over 1000 simulations. In other words, the probability of calling at least one

feature significant when all coefficients are 0, is upper bounded by 5%.
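
A sketch of the Bonferroni simulation under the same assumptions as a pointwise version would need (variance 10 for Y, independent standard normal features, a normal quantile standing in for the t quantile). The normal approximation is slightly liberal, so the estimated rate can land a little above 5% from that and from simulation noise. The function names are illustrative.

```python
import numpy as np
from statistics import NormalDist

def any_bonferroni_rejection(rng, n=1000, n_features=99, alpha=0.05):
    """Simulate one null data set; True if any of the p tests rejects."""
    y = rng.normal(0.0, np.sqrt(10), size=n)                  # ASSUMPTION: variance 10
    X = np.column_stack([np.ones(n), rng.normal(size=(n, n_features))])
    p = X.shape[1]
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    se = np.sqrt(resid @ resid / (n - p) * np.diag(xtx_inv))
    # Bonferroni: each two-sided test runs at level alpha / p.
    # ASSUMPTION: normal quantile approximates the t(n - p) quantile.
    crit = NormalDist().inv_cdf(1 - alpha / (2 * p))
    return bool(np.any(np.abs(beta_hat / se) >= crit))

def familywise_error_rate(n_sims=1000, seed=0):
    rng = np.random.default_rng(seed)
    return float(np.mean([any_bonferroni_rejection(rng) for _ in range(n_sims)]))

fwer = familywise_error_rate()
print("estimated probability of at least one false positive:", fwer)
```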

1.5 Interpreting your model

1.5.1 Q8

Let the sample size be 200. Define X ∼ Normal(10, 10), let Z ∼ Bernoulli(0.4),

and let Y = 5 − X + 2 ∗ Z + 3 ∗ X ∗ Z + ε where ε ∼ Unif[−3, 3]. Please run

the regression using all the data that includes the interaction effect and report

the coefficients, let’s call these βˆ. You should have 4 coefficients.

Side note, this is an example of how you can imagine data to be generated

from different groups that have different intercepts and slopes.
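
A minimal sketch of the interaction regression using `numpy.linalg.lstsq`, assuming Normal(10, 10) means variance 10. The column order of the design matrix (intercept, X, Z, X ∗ Z) is a choice made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(10.0, np.sqrt(10), size=n)       # ASSUMPTION: variance 10
z = rng.binomial(1, 0.4, size=n).astype(float)
eps = rng.uniform(-3, 3, size=n)
y = 5 - x + 2 * z + 3 * x * z + eps

# Design matrix columns: intercept, X, Z, and the interaction X * Z.
design = np.column_stack([np.ones(n), x, z, x * z])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

for name, value in zip(["intercept", "X", "Z", "X*Z"], beta_hat):
    print(f"{name}: {value:.3f}")
```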

1.5.2 Q9

Please regress Y on X only using the values where Z = 0. Repeat this regression

only using the values where Z = 1. Report those coefficients, let’s call them βˆ0

and βˆ1 respectively.
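
The subgroup regressions can be sketched as follows, again assuming variance 10 for X. The snippet regenerates the data so it runs on its own, and prints the interaction fit next to the two subgroup fits so the coefficients can be compared by eye. The helper `fit_ols` is a name introduced here.

```python
import numpy as np

def fit_ols(design, y):
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 200
x = rng.normal(10.0, np.sqrt(10), size=n)       # ASSUMPTION: variance 10
z = rng.binomial(1, 0.4, size=n).astype(float)
y = 5 - x + 2 * z + 3 * x * z + rng.uniform(-3, 3, size=n)

ones = np.ones(n)
beta = fit_ols(np.column_stack([ones, x, z, x * z]), y)        # interaction fit
g0, g1 = z == 0, z == 1                                        # boolean group masks
beta0 = fit_ols(np.column_stack([ones[g0], x[g0]]), y[g0])     # Z = 0 only
beta1 = fit_ols(np.column_stack([ones[g1], x[g1]]), y[g1])     # Z = 1 only

print("interaction fit (intercept, X, Z, X*Z):", np.round(beta, 3))
print("Z = 0 fit (intercept, X):", np.round(beta0, 3))
print("Z = 1 fit (intercept, X):", np.round(beta1, 3))
```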

1.5.3 Q10

Assume the parameters below refer to the coefficients from βˆ, βˆ0 and βˆ1. Please

answer the following:

• The intercept for βˆ equals which other parameter?

• The intercept for βˆ1 is the sum of which 2 parameters?

• The slope for βˆ0 is the same as which other parameter?

• The slope for βˆ1 equals a linear combination of which other parameters?

1.5.4 Q11

Q8-Q10 show a case where we can obtain identical regression estimates by

regressing with interactions or by training 2 separate regression models. Are the

standard errors for these estimates the same, yes/no?

A thought you should have: "which method would you choose if someone

asked you to choose?" (No need to answer this question for Q11).
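
To explore the question numerically, the sketch below computes standard errors from the interaction model and from the Z = 0 subgroup model on the same simulated data. It assumes variance 10 for X and uses the usual estimate of σ² with n − p degrees of freedom; the helper `ols_with_se` is hypothetical. The two fits estimate σ² from different residuals, which is the thing to look at when comparing the printed values.

```python
import numpy as np

def ols_with_se(design, y):
    """Return (beta_hat, standard errors, sigma2_hat) for an OLS fit."""
    n, p = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)
    return beta, np.sqrt(sigma2 * np.diag(xtx_inv)), sigma2

rng = np.random.default_rng(0)
n = 200
x = rng.normal(10.0, np.sqrt(10), size=n)       # ASSUMPTION: variance 10
z = rng.binomial(1, 0.4, size=n).astype(float)
y = 5 - x + 2 * z + 3 * x * z + rng.uniform(-3, 3, size=n)

ones = np.ones(n)
_, se_full, s2_full = ols_with_se(np.column_stack([ones, x, z, x * z]), y)
g0 = z == 0
_, se_g0, s2_g0 = ols_with_se(np.column_stack([ones[g0], x[g0]]), y[g0])

print("interaction model SE (intercept, X):", np.round(se_full[:2], 4))
print("Z = 0 only model SE (intercept, X):", np.round(se_g0, 4))
print("sigma^2 estimates (interaction, Z = 0 only):",
      round(float(s2_full), 4), round(float(s2_g0), 4))
```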
