
Date: 2019-08-23 10:50

Final Exam for Statistical Learning

Summer 2019

Name____________ Student ID#_____________

Part I. (10 pts) True/False? In the following problems, determine whether the statement is true or false. If it is false, please correct it.

1. In the ANOVA table for a linear regression model, the F statistic checks the significance of the model. The F statistic follows the F distribution with degrees of freedom K and n-K-1, where n is the sample size and K equals the number of independent variables. _______
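The degrees of freedom reported for the overall F test can be inspected directly in R. The following is a minimal sketch on simulated data; the sample size, the number of predictors, and the coefficient values are assumptions chosen only for illustration.

```r
# Simulated data; n, K and the coefficients are illustrative assumptions.
set.seed(1)
n <- 50
K <- 3
X <- matrix(rnorm(n * K), n, K)
y <- drop(1 + X %*% c(0.5, -0.3, 0) + rnorm(n))

fit <- lm(y ~ X)

# summary.lm stores the overall F test as a named vector: value, numdf, dendf
fstat <- summary(fit)$fstatistic
df_num <- unname(fstat["numdf"])
df_den <- unname(fstat["dendf"])

# Upper-tail p-value of the overall F test
p_value <- unname(pf(fstat["value"], df_num, df_den, lower.tail = FALSE))
```

Here `df_num` and `df_den` can be compared against `K` and `n - K - 1`.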

2. An application of the linear regression model with an intercept and 9 independent variables generated the following results involving the F test of the overall regression model (in the ANOVA table): p-value = .03, R² = .67, s = .076. Thus, the null hypothesis, H0: β1 = β2 = ... = β9 = 0, should be rejected at the .05 level of significance. ______

Part II. (10 pts) Multiple choice questions. There is only one best answer among the alternatives. You must choose the best answer to each of the questions and circle it.

1. In a multiple regression model with a large sample of size 200 and 4

independent variables, [formula missing], one would like to investigate whether [formula missing] and [formula missing] are useful for explaining the remaining variation of the response after taking into account the effect of [formula missing] and [formula missing]. One uses the F statistic to test [formula missing] against [formula missing]. For this testing problem, what are the degrees of freedom of the test statistic? ____

a. 2, 196 c. 2, 195

b. 4, 196 d. 4, 195

c. 2, 200 e. 1, 195
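A partial F test of this kind can be sketched in R by comparing a reduced and a full model with `anova()`. The data below are simulated, and the variable names `x1` to `x4`, as well as the choice of `x3` and `x4` as the pair under test, are assumptions made only for illustration.

```r
# Simulated data with n = 200 and 4 predictors; all names are hypothetical.
set.seed(2)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

reduced <- lm(y ~ x1 + x2, data = dat)            # model under the null
full    <- lm(y ~ x1 + x2 + x3 + x4, data = dat)  # model under the alternative

cmp <- anova(reduced, full)
df_num <- cmp$Df[2]      # number of coefficients being tested
df_den <- cmp$Res.Df[2]  # residual degrees of freedom of the full model

# The same statistic computed by hand from the residual sums of squares
f_manual <- ((deviance(reduced) - deviance(full)) / df_num) /
  (deviance(full) / df_den)
```

The numerator degrees of freedom equal the number of coefficients set to zero under the null, and the denominator degrees of freedom are the residual degrees of freedom of the full model.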

2. For comparison of two linear regression models (Model 1 and Model 2), the following regression outputs are obtained:

(i) For Model 1: [formula missing], F = 13.5

(ii) For Model 2: [formula missing]

Then which of the following is true? _______

a. Model 1 is more flexible than Model 2.

b. Model 2 is more flexible than Model 1.

c. There is not enough evidence to decide which model is more flexible.

Part III. (80 pts)

2. Bootstrap (25 pts). Given an iid random sample from the regression model [formula missing], one uses KNN with K=5 to estimate the mean function [formula missing]. Let [formula missing] be the resulting estimator. How can the bootstrap method be used to get a 95% confidence interval of [formula missing]? Write down the algorithm for calculating the interval estimate.
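One possible way to carry out such a procedure is a pairs bootstrap with a percentile interval, sketched below in base R. This is only an illustration under stated assumptions: a scalar predictor, a single evaluation point `x0`, B = 1000 resamples, and simulated data. The helper names `knn_mean` and `boot_knn_ci` are hypothetical, and other bootstrap intervals (for example, basic or studentized) would be equally valid answers.

```r
# KNN regression estimate at a single point x0 (one-dimensional x).
knn_mean <- function(x, y, x0, k = 5) {
  nearest <- order(abs(x - x0))[seq_len(k)]  # indices of the k closest x values
  mean(y[nearest])
}

# Percentile bootstrap interval for the mean function at x0.
boot_knn_ci <- function(x, y, x0, k = 5, B = 1000, level = 0.95) {
  n <- length(y)
  est <- knn_mean(x, y, x0, k)
  boot_est <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)  # resample (x, y) pairs
    knn_mean(x[idx], y[idx], x0, k)          # re-estimate on the resample
  })
  alpha <- 1 - level
  ci <- quantile(boot_est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
  list(estimate = est, lower = ci[1], upper = ci[2])
}

# Usage on simulated data (the sine mean function is an assumption)
set.seed(3)
x <- runif(200, 0, 2 * pi)
y <- sin(x) + rnorm(200, sd = 0.3)
res <- boot_knn_ci(x, y, x0 = pi / 2)
```

Resampling pairs can place duplicated observations among the nearest neighbours, which is one reason the resulting interval should be read as approximate.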

3. (25 pts) Solve the following problems:

4. Explain each of the following R codes (30 pts)

    # Simulate n observations of p predictors
    set.seed(1)
    p = 20
    n = 1000
    x = matrix(rnorm(n * p), n, p)

    # True coefficients, five of them set to zero
    B = rnorm(p)
    B[3] = 0
    B[4] = 0
    B[9] = 0
    B[19] = 0
    B[10] = 0
    eps = rnorm(n)  # one noise term per observation
    y = x %*% B + eps

    # Split into 100 training and 900 test observations
    train = sample(seq(1000), 100, replace = FALSE)
    y.train = y[train, ]
    y.test = y[-train, ]
    x.train = x[train, ]
    x.test = x[-train, ]

    # Best subset selection on the training set, model sizes 1 to p
    library(leaps)
    regfit.full = regsubsets(y ~ ., data = data.frame(x = x.train, y = y.train),
                             nvmax = p)

    # Training MSE of the best model of each size
    val.errors = rep(NA, p)
    x_cols = colnames(x, do.NULL = FALSE, prefix = "x.")
    for (i in 1:p) {
        coefi = coef(regfit.full, id = i)
        # Prediction from the selected predictors (intercept not included)
        pred = as.matrix(x.train[, x_cols %in% names(coefi)]) %*%
            coefi[names(coefi) %in% x_cols]
        val.errors[i] = mean((y.train - pred)^2)
    }
    plot(val.errors, ylab = "Training MSE", pch = 19, type = "b")

(10 pts)

