
Date: 2019-10-22 11:15

Group project and presentation

Marks

Report – 15%

Presentation – 15%

Project Task Marks

1 The BostonHousing.csv data set contains these variables:

In the data set, the outcome variable is MEDV (the median value of owner-occupied homes in $1000s). MEDV is assumed to be approximately normally distributed, with a mean that depends on the covariates.

As a data analyst for a real estate company, you are given the task of modelling the variables that are related to MEDV. The findings will help your company understand the factors that may influence the median value of homes. In the long run, the company plans to use the understanding gained from your model to select the houses that can bring in the maximum profit.
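As a rough illustration of the modelling task, the sketch below fits a linear regression of MEDV on a few covariates using base R. The data are simulated so that the block runs on its own: the covariate names `rooms` and `crime`, and the coefficients used to generate the outcome, are made up for the example and are not taken from BostonHousing.csv. With the real file, one would presumably load the data with `read.csv()` instead of simulating it.

```r
# Simulated stand-in for BostonHousing.csv (hypothetical columns, not the real data)
set.seed(1)
n <- 200
housing <- data.frame(
  rooms = rnorm(n, mean = 6, sd = 0.7),  # hypothetical covariate
  crime = rexp(n, rate = 0.3)            # hypothetical covariate
)
# Outcome generated from a known linear relationship plus normal noise
housing$MEDV <- 5 + 4 * housing$rooms - 0.5 * housing$crime + rnorm(n, sd = 2)

# Descriptive summary of every column before modelling
print(summary(housing))

# Linear regression of MEDV on all other columns
fit_medv <- lm(MEDV ~ ., data = housing)

# Coefficient table: estimate, standard error, t value, p value
coef_table <- summary(fit_medv)$coefficients
print(round(coef_table, 3))

# 95% confidence intervals for the coefficients
print(confint(fit_medv))

# Residual diagnostics for the normality and constant-variance assumptions
# plot(fit_medv)  # uncomment in an interactive session
```

Each slope estimates the change in mean MEDV (in $1000s) for a one-unit increase in that covariate, holding the others fixed, which is the kind of statement the report's interpretation section would need.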

Write and present a report that includes the objectives of the analysis, the methods for analysis, the results from the analysis, the interpretation and the conclusion from the analysis.

Refer to attachment 1 for the format of report and presentation.

Marks: 8%

2

A major book store collected data to understand the factors that may influence the purchase of a new book, ‘The Art History of Florence’. The dataset is named CharlesBookClub.csv. It contains these variables:

Variables R, F and M refer to:

R = recency, time since last purchase

F = frequency, number of previous purchases from the company over a period

M = monetary, amount of money spent on the company’s products over a period

The Recency, Frequency and Monetary variables are coded as follows:

Recency:

0–2 months (Rcode = 1)

3–6 months (Rcode = 2)

7–12 months (Rcode = 3)

13 months and up (Rcode = 4)

Frequency:

1 book (Fcode = 1)

2 books (Fcode = 2)

3 books and up (Fcode = 3)

Monetary:

$0–$25 (Mcode = 1)

$26–$50 (Mcode = 2)

$51–$100 (Mcode = 3)

$101–$200 (Mcode = 4)

$201 and up (Mcode = 5)
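The coding scheme above amounts to binning each raw value into ordered categories. A minimal sketch with base R's `cut()` follows, assuming recency is recorded in whole months, frequency in whole books and monetary value in whole dollars. The helper names and the sample values are made up for illustration, and the dataset may already contain the coded columns.

```r
# Hypothetical helpers that reproduce the coding scheme; the names are not from the dataset.
# cut() uses right-closed intervals by default, e.g. (2, 6] maps to code 2.
code_recency <- function(months) {
  cut(months, breaks = c(-Inf, 2, 6, 12, Inf), labels = FALSE)
}
code_frequency <- function(books) {
  cut(books, breaks = c(-Inf, 1, 2, Inf), labels = FALSE)
}
code_monetary <- function(dollars) {
  cut(dollars, breaks = c(-Inf, 25, 50, 100, 200, Inf), labels = FALSE)
}

# Made-up example values
rcode <- code_recency(c(1, 4, 9, 20))
fcode <- code_frequency(c(1, 2, 5))
mcode <- code_monetary(c(25, 26, 100, 150, 201))
print(rcode)
print(fcode)
print(mcode)
```

Setting `labels = FALSE` makes `cut()` return integer codes rather than a factor; wrapping the result in `factor()` would treat the codes as categories in a model instead of as a numeric score.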

Marks: 7%

The outcome variable is Florence, coded as 1 if The Art History of Florence was bought and 0 if it was not.

Note: You need to convert this variable to a factor variable before analysis in R.

Use the dataset to run three separate regression models:

a. The full set of predictors in the dataset (exclude Seq and ID)

b. A subset of predictors that you judge to be the best

c. Only the R, F, and M variables
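The three models can be sketched as follows in base R, assuming logistic regression is intended (the template code for Project 2 uses `glm` with `family = binomial`). The data are simulated so that the block runs on its own: `Gender` is a hypothetical extra predictor, the generating coefficients are invented, and the subset in model b is chosen arbitrarily rather than by any real judgement about the predictors.

```r
# Simulated stand-in for CharlesBookClub.csv (all values are synthetic)
set.seed(42)
n <- 500
books <- data.frame(
  Seq = seq_len(n),
  ID = 1000 + seq_len(n),
  Gender = rbinom(n, 1, 0.5),           # hypothetical extra predictor
  R = sample(1:36, n, replace = TRUE),  # months since last purchase
  F = rpois(n, 3) + 1,                  # number of previous purchases
  M = round(runif(n, 15, 450))          # dollars spent
)
# Synthetic outcome: purchase made more likely for recent, frequent buyers
p <- plogis(-2 - 0.05 * books$R + 0.3 * books$F)
books$Florence <- factor(rbinom(n, 1, p))  # factor outcome, as the note requires

# Drop the identifier columns before modelling
model_data <- books[, !(names(books) %in% c("Seq", "ID"))]

# a. full set of predictors
fit_full <- glm(Florence ~ ., family = binomial, data = model_data)
# b. a subset of predictors (arbitrary here; justify the choice in the report)
fit_subset <- glm(Florence ~ R + F, family = binomial, data = model_data)
# c. only R, F and M
fit_rfm <- glm(Florence ~ R + F + M, family = binomial, data = model_data)

# Compare the three fits by AIC (lower is better)
aic_table <- AIC(fit_full, fit_subset, fit_rfm)
print(aic_table)

# Odds ratios for the RFM model
odds_ratios <- exp(coef(fit_rfm))
print(round(odds_ratios, 3))
```

Exponentiating a logistic coefficient gives an odds ratio, which is usually easier to interpret in a report than the raw log-odds estimate. AIC is only one possible basis for comparing the three models.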

Write and present a report that includes the objectives of the analysis, the methods for analysis, the results from the analysis, the interpretation and the conclusion from the analysis.

Refer to attachment 1 for the format of report and presentation.

Attachment 1:

Instruction for the writing of the report:

1. Write a short report for each project.

2. The report should follow the IMRaD format, with these headings: Introduction, Methods, Results and Discussion.

3. Introduction

a. Briefly write the introduction for the project

b. State the rationale of doing the analysis

c. List the appropriate study objectives

4. Methods

a. Describe the plan for data wrangling and data visualization

b. Describe the plan for statistical analysis

5. Results

a. Present the results of the analysis

b. Write the interpretation of the analysis

6. Discussion

a. Discuss the application of your results

b. Summarize your findings

7. The report should not be longer than 4 pages of A4 paper (max 2 pages for project 1 and max 2 pages for project 2).

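R code (expand it or use other relevant code as required)

# tidyverse supplies %>% and mutate(); broom supplies tidy()
library(tidyverse)
library(broom)

Project 1:

# yourdata, outcome and covariates are placeholders for your own data frame and column names
summary(yourdata)
mymod <- lm(outcome ~ covariates, data = yourdata)
tidy(mymod)

Project 2:

# convert the 0/1 outcome to a factor before fitting the logistic regression
yourdata <- yourdata %>% mutate(outcome = factor(outcome))
mymod2 <- glm(outcome ~ covariate, family = binomial, data = yourdata)
tidy(mymod2)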

