
Date: 2019-11-12 10:53

STAT 3675Q - STATISTICAL COMPUTING UCONN

Fall 2019 Marcos Prates

1. Objective

The IMDB Movies Dataset (file imdb.csv) contains information about over 10,000 movies. The names of the first twelve columns are self-explanatory (the duration is in seconds). The rest of the variables (Action, Adult, Adventure, ...) are dummy variables (0/1) indicating whether the movie has the given genre.
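A minimal sketch of how the data might be loaded and checked in R. The handout names only the genre columns, so the column names `title` and `duration` are assumptions, and a tiny in-memory data frame stands in for imdb.csv so that the snippet runs on its own.

```r
# Stand-in for imdb.csv so the sketch is self-contained; with the real file,
# a call such as movies <- read.csv("imdb.csv") would take its place.
# The column names `title` and `duration` are assumptions, not from the handout.
movies <- data.frame(
  title     = c("A", "B", "C", "D"),
  duration  = c(5400, 7200, 6000, 9000),  # seconds, as stated in the handout
  Action    = c(1, 0, 1, 0),
  Adult     = c(0, 0, 0, 0),
  Adventure = c(1, 1, 0, 0)
)

# Convert the duration from seconds to minutes for easier reading.
movies$duration_min <- movies$duration / 60

# Check that every genre column really is a 0/1 dummy variable.
genre_cols <- c("Action", "Adult", "Adventure")
is_dummy <- sapply(movies[genre_cols], function(x) all(x %in% c(0, 1)))
print(is_dummy)

# Count how many movies carry each genre.
genre_counts <- colSums(movies[genre_cols])
print(genre_counts)
```

With the real file, `genre_cols` would need to list every genre column rather than the three shown here.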

In this project, you will apply a number of statistical methods that have been covered during the course using R.

• Projects are to be completed individually or with one partner.

• The project is worth 25% of the final grade.

Directions. You are asked to write a preliminary report and a report. Please follow the guidelines below carefully.

2. Preliminary report [30 points]

• Provide a single file with the format name_3675_prelim.pdf (or name1_name2_3675_prelim.pdf if you work with someone), where name is your full name.

• The preliminary report is due on Sunday, November 22, 2019 at 11:59 PM. Submit it via HuskyCT. The PDF must be generated using R Markdown.

Your preliminary report must contain the following elements.

(a) A preliminary exploratory analysis including summary statistics and basic graphs (4 pages max)

(b) Pose scientific questions that are interesting to you and indicate what statistical methods may help answer those questions (1 page)

(c) Include the R code and all outputs.
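The exploratory step in (a) might look roughly like the following sketch. The data are simulated so that the code runs without imdb.csv, and the `rating` column is a hypothetical name chosen for illustration.

```r
set.seed(1)  # make the simulated data reproducible
n <- 200

# Simulated stand-in for the real data; `rating` is a hypothetical column.
movies <- data.frame(
  duration = round(rnorm(n, mean = 6000, sd = 1200)),  # seconds
  rating   = round(runif(n, min = 1, max = 10), 1),
  Action   = rbinom(n, size = 1, prob = 0.3)           # 0/1 genre dummy
)

# Summary statistics (min, quartiles, mean, max) for every column.
print(summary(movies))

# Label the dummy variable so that tables and plots are readable.
movies$ActionLabel <- factor(movies$Action, levels = c(0, 1),
                             labels = c("Not action", "Action"))

# Mean rating within each group.
rating_by_action <- tapply(movies$rating, movies$ActionLabel, mean)
print(rating_by_action)

# Basic graphs: a histogram of duration and a boxplot of rating by genre flag.
hist(movies$duration / 60, main = "Movie duration",
     xlab = "Duration (minutes)")
boxplot(rating ~ ActionLabel, data = movies, xlab = "", ylab = "Rating")
```

In an R Markdown document, the printed summaries and both plots appear directly below the chunk that produces them.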

3. Report [70 points]

For the report, provide a single file with the format name_3675_report.pdf (or name1_name2_3675_report.pdf), where name is your full name. The PDF must be generated using R Markdown.

• The report must be at least 10 pages long, without exceeding 30 pages (including the code and the graphs).

• The report is due on Sunday, December 8, 2019 at 11:59 PM. Submit it via HuskyCT.


(a) Include the preliminary report

(b) Include at least one regression method

(c) Include at least one ANOVA analysis

(d) Include at least one classification method
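As a rough sketch of what items (b) to (d) could look like in R, the code below fits one model of each kind on simulated data. The variables `duration_min`, `decade`, and `rating` are hypothetical, and logistic regression is only one possible choice of classification method; the handout does not prescribe any of these.

```r
set.seed(42)  # make the simulated data reproducible
n <- 300

# Simulated stand-in for the real data; all column names are hypothetical
# except the genre dummy `Action`.
movies <- data.frame(
  duration_min = rnorm(n, mean = 100, sd = 20),
  decade       = factor(sample(c("1980s", "1990s", "2000s"), n, replace = TRUE)),
  Action       = rbinom(n, size = 1, prob = 0.3)
)
# The simulated rating depends weakly on duration, plus noise.
movies$rating <- 5 + 0.01 * movies$duration_min + rnorm(n)

# (b) Regression: simple linear regression of rating on duration.
fit_lm <- lm(rating ~ duration_min, data = movies)
print(summary(fit_lm))

# (c) ANOVA: one-way comparison of mean rating across decades.
fit_aov <- aov(rating ~ decade, data = movies)
print(summary(fit_aov))

# (d) Classification: logistic regression predicting the Action flag.
fit_glm <- glm(Action ~ duration_min + rating, data = movies,
               family = binomial)
pred <- as.integer(predict(fit_glm, type = "response") > 0.5)

# Confusion matrix of observed against predicted labels.
conf_mat <- table(observed = movies$Action, predicted = pred)
print(conf_mat)
```

The 0.5 cut-off used for the predicted class is a conventional default rather than a requirement; a different threshold may suit an unbalanced genre better.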

For each method,

• Express all statistical models using mathematical formulae, and clearly state the meaning of the notation and the assumptions.

• Insert R code and necessary comments. Your output must contain the R code (do not use the echo=FALSE option).

• Interpret extensively all outputs and graphs that you include.
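To illustrate the level of detail a model statement might have, here is one way a simple linear regression could be written in LaTeX inside an R Markdown file. The choice of rating as the response and duration as the predictor is hypothetical, not something the handout prescribes.

```latex
\[
  Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n,
\]
where $Y_i$ is the rating of movie $i$, $x_i$ is its duration in minutes,
$\beta_0$ is the intercept, $\beta_1$ is the expected change in rating per
additional minute, and $n$ is the number of movies.

Assumptions: the errors $\varepsilon_1, \dots, \varepsilon_n$ are independent
with $\varepsilon_i \sim N(0, \sigma^2)$, that is, the mean of $Y_i$ is linear
in $x_i$, and the errors are normal with constant variance $\sigma^2$.
```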

4. Important dates

• November 22, 2019: Preliminary report is due

• December 8, 2019: Report is due


