
STATISTICS 360: Advanced R

December 18, 2023

FINAL PROJECT

POLICY

1. This project is to be completed independently. You may use whatever class materials you wish in completing this assignment. BUT DO NOT DISCUSS QUESTIONS OR RESULTS WITH ANYONE ELSE OUTSIDE OF YOUR GROUP, WITHIN OR OUTSIDE OF THE CLASS. Failure to follow this directive will result in a failing grade.

2. Late projects will be accepted at a penalty of 2 points per hour (the project is worth 100 points).

3. You may ask for clarification of the question requirements, but do not show or discuss your answers with the TA or instructor, or seek comments on them.

4. You will form a group of no more than 3 persons. You can also opt to do the project by yourself and earn bonus points for independence.

5. Regarding the R package, you are allowed to receive limited guided support with the implementation. The only venue and time for receiving project-specific guidance will be the STAT 361 lab session.

6. In the STAT 361 lab, we will show you how to submit your R package via GitHub, along with your written report (if any).

DELIVERABLES

You will produce an R package that implements the Multivariate Adaptive Regression Splines (MARS) algorithm; the R package should contain proper documentation, and test cases. More details on the requirements will be shared later. In addition, you may write a report if you choose to pursue some of the bonus sub-projects.

1. (60 points) Produce an R package that implements the Multivariate Adaptive Regression Splines (MARS) algorithm, with proper documentation. The main fitting function will be developed in the class and lab exercises throughout the first half of the course.

You will need to add “methods”, tests and documentation. A grading rubric will be circulated around reading break.

2. (+5 points) Bonus: Complete the project on your own.

3. (+2.5 points) Bonus: Complete the project in a group of 2 persons.

4. (+10 points) Bonus: Write a 2 to 5-page explanatory note on how the fwd_stepwise() function in mars.R works. For example, you can explain what each R variable (e.g., Bfuncs) means, show a few example cases, and trace the algorithm step by step in a table.

5. (+5 points) Bonus: Write a 2 to 5-page explanatory note on how the backward elimination procedure, Algorithm 3 in the paper (Friedman, 1991), works. For example, you can discuss the meaning of the symbols, show examples of how the algorithm evolves, and add a few illustrations (as I did in the explanatory videos on Algorithms 1 and 2).

6. (+15 points) Bonus: Implement the backward selection algorithm as bwd_stepwise() in R.

7. (+7.5 points) Bonus: Use Rcpp to implement the h() function in C++, and incorporate it into the R package.
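For orientation on item 7, the hinge function from which MARS basis functions are built can be sketched in R as follows. This is a minimal sketch under assumptions: the argument names x, s and t and the form max(0, s * (x - t)) follow the hinge described in Friedman (1991), and may not match the signature of h() in mars.R, so check the class code before porting anything to C++.

```r
# Hinge function sketch (assumed form, following Friedman 1991).
#   x: numeric vector of predictor values
#   s: sign, +1 or -1 (which side of the knot is "switched on")
#   t: knot location
h <- function(x, s, t) {
  pmax(0, s * (x - t))  # element-wise maximum with zero
}

x <- c(-1, 0, 0.5, 2)
right <- h(x, s = +1, t = 0.5)  # nonzero only where x > 0.5
left  <- h(x, s = -1, t = 0.5)  # nonzero only where x < 0.5
print(rbind(x, right, left))
```

Whatever the C++ version looks like, it would need to return the same values as the R version for the same inputs, which makes a comparison like the one above a natural test case.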

REQUIREMENTS

1. The package should have a mars(formula, data, control) interface for fitting the model. It should behave similarly to lm().

2. You should not change anything that is already in mars.R, e.g., function names, function arguments, etc.

3. The R package should have proper documentation (to be detailed in the grading rubric).

4. There is some leeway in how your output figures and printed messages look in methods such as plot, print, and summary. You don’t have to follow exactly what I did in the project illustration video.
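To illustrate what "behave similarly to lm()" in requirement 1 can mean in practice, here is a minimal sketch of a formula front end. Everything in it is hypothetical: the names mars_frontend, fit_fn and the class "mars_sketch" are placeholders invented for this example, not part of mars.R, and the fitting step is left as a stub. It also assumes the formula includes an intercept, whose column is dropped from the predictor matrix.

```r
# Hypothetical lm()-style front end; fit_fn stands in for the real fitting
# code and is NOT part of the course materials.
mars_frontend <- function(formula, data, control = list(), fit_fn = NULL) {
  cl <- match.call()                  # record the call, as lm() does
  mf <- model.frame(formula, data)    # evaluate the formula's variables in data
  y  <- model.response(mf)            # response vector
  mt <- attr(mf, "terms")             # terms object describing the formula
  # Predictor matrix; assumes an intercept column in position 1 and drops it.
  x  <- model.matrix(mt, mf)[, -1, drop = FALSE]
  fit <- if (is.null(fit_fn)) list() else fit_fn(x, y, control)
  out <- c(list(call = cl, formula = formula, y = y, x = x), fit)
  class(out) <- "mars_sketch"         # S3 class so methods can dispatch on it
  out
}

# A print method, in the same spirit as print.lm().
print.mars_sketch <- function(x, ...) {
  cat("Call:\n")
  print(x$call)
  invisible(x)
}

fit <- mars_frontend(mpg ~ wt + hp, data = mtcars)
print(fit)
dim(fit$x)
```

Storing the call and attaching an S3 class is what lets methods such as print, summary, and plot be written separately from the fitting code.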

GUIDANCE

Here is a step-by-step guide on where to start.

1. Have a read of the book chapter (MARS-ESL12.pdf).

2. Watch the project illustration video.

3. Watch the lecture videos on the project overview and algorithm, ideally more than once, for a complete understanding of the algorithm.

4. Have a brief read of the paper (MARS-paper.pdf). Don’t panic if you find yourself unable to understand everything (you are not expected to anyway).

5. You will learn how to build an R package in weeks 4 and 5. If you prefer, you can also start early and learn from the manual on “How to create R extension”. See the course Canvas page for details.

6. The lab sessions will also focus largely on R functions used in the project. Attending them is extremely helpful.

HINT

1. The plot() method should include a 3D plot for any basis function that involves exactly 2 variables. You can produce the 3D plot using graphics::persp().

2. The trace argument (TRUE or FALSE) in mars.control() enables or disables progress output while the algorithm runs. When trace = TRUE, I prefer to print values such as the basis index and the lof value at each step of the algorithm. Setting trace = FALSE prints nothing during the fitting of MARS. You don’t have to follow exactly what the project illustration video did; basic output should suffice.
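As a rough illustration of hint 1, the sketch below draws the surface of a made-up two-variable basis function with graphics::persp(). The hinge form, the knots (0.5 and 0.3), the signs, and the grid size are all assumptions chosen for the example; in the package, these would come from the fitted model.

```r
# Assumed hinge form (Friedman 1991); may differ from h() in mars.R.
h <- function(x, s, t) pmax(0, s * (x - t))

# Grid of values for the two variables involved in the basis function.
x1 <- seq(0, 1, length.out = 30)
x2 <- seq(0, 1, length.out = 30)

# z[i, j] is the basis function evaluated at (x1[i], x2[j]):
# a product of two hinges with illustrative knots and signs.
z <- outer(x1, x2, function(a, b) h(a, +1, 0.5) * h(b, -1, 0.3))

graphics::persp(x1, x2, z,
                theta = 30, phi = 25,   # viewing angles
                xlab = "x1", ylab = "x2", zlab = "basis function",
                ticktype = "detailed")
```

persp() expects x and y as increasing grids and z as a matrix with one row per x value and one column per y value, which is the shape outer() produces.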

GRADES

A grading rubric will be distributed during the reading week. The project grade accounts for 50% of your final grade.





