Date: 2020-03-06 08:39

GGR376

Assignment 2: Regression

44 Marks

Regression: Modelling the relationship between a response (or dependent variable) and one or more explanatory variables (or independent variables). Linear regression is a linear approach to modelling the relationship.

Before completing the assignment, review the example R markdown tutorial and the videos.

NOTE: Join the spatial data at the beginning; joining it at the end causes issues.

Research Problem:

Produce an explanatory regression model for the variation in housing costs by census tract in the City of Hamilton, Ontario, Canada.

Data:

Hamilton Census Tract boundaries, which include the average house price and the unique identifier: CTUID.

You can access the data with the following command and URL:

library(rgdal)

rgdal::readOGR("https://raw.githubusercontent.com/gisUTM/GGR376/master/Lab_1/houseValues.geojson")

You will need to obtain 10 potential explanatory variables from the 2016 Census Data, available from CHASS: http://dc2.chass.utoronto.ca.myaccess.library.utoronto.ca/census/
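
The join between the tract data and the census variables relies on CTUID matching exactly in both tables. The following is a minimal sketch of that join in base R, not the lab tutorial's method. It assumes CTUID is stored as text; the CTUID values, the house_value column, and the census column names are all made up for illustration, and the census table is inlined as text instead of being read from a file.

```r
# Hypothetical stand-in for the tract attribute table (the real one comes from
# houseValues.geojson). CTUID and house_value entries are made up.
tract_attributes <- data.frame(
  CTUID = c("5370001.01", "5370001.02", "5370002.00"),
  house_value = c(310000, 285000, 402000),
  stringsAsFactors = FALSE
)

# Hypothetical census export, inlined so the sketch is self-contained.
census_csv <- "CTUID,median_income,pct_owner_occupied
5370001.01,61000,55.2
5370001.02,58000,48.9
5370002.00,83000,71.4"

# Read CTUID as character: read as a number, "5370002.00" would become 5370002
# and would no longer match the text key in the tract table.
census_variables <- read.csv(
  text = census_csv,
  colClasses = c(CTUID = "character")
)

# Left join: keep every tract, attach census variables where CTUID matches.
tracts_with_census <- merge(
  tract_attributes, census_variables,
  by = "CTUID", all.x = TRUE
)

# Count tracts that found no census row; anything above 0 needs investigating.
unmatched_tracts <- sum(is.na(tracts_with_census$median_income))
```

Checking unmatched_tracts straight after the merge is a cheap way to catch key-format problems before any modelling starts. Joining onto the spatial object itself may need a different call than the plain data frame merge shown here.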

Assignment Format:

The assignment submission will be composed of three files.

1. An R script of your code produced during the project, with the .R file extension.

2. A CSV file of the additional input data you utilized in your model (one table).

3. Answers to the questions listed below in a PDF file.

All three files must be submitted online.

Assignment Requirements:

• Ensure all procedures from the lab tutorial are replicated in your work.

• Fit and test 10 linear regression models.

o Example model names: model_1, model_2, etc.

o All models should remain in the code.

o Rename your final model: final_model

• The final model must meet all assumptions, with one possible exception:

o Independent errors, due to spatial autocorrelation.

▪ Validate the independent errors assumption in your model with spatial autoregressive modelling.
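
The requirements above can be sketched in base R as follows. This is an illustration on synthetic data, not the lab tutorial's procedure: the variable names are hypothetical, and the particular checks (Shapiro-Wilk for normal residuals, a hand-computed studentized Breusch-Pagan statistic for constant variance, a hand-computed variance inflation factor for multicollinearity) are assumptions about which tests one might use.

```r
set.seed(376)
n_tracts <- 120

# Synthetic stand-in data; variable names and values are hypothetical.
tract_data <- data.frame(
  median_income = rnorm(n_tracts, mean = 70, sd = 15),
  pct_university = runif(n_tracts, min = 5, max = 60)
)
tract_data$log_house_value <- 11 + 0.01 * tract_data$median_income +
  0.005 * tract_data$pct_university + rnorm(n_tracts, sd = 0.1)

# Every candidate model stays in the script under its own name.
model_1 <- lm(log_house_value ~ median_income, data = tract_data)
model_2 <- lm(log_house_value ~ median_income + pct_university, data = tract_data)

# Normality of residuals: a small p-value suggests non-normal residuals.
normality_check <- shapiro.test(residuals(model_2))

# Constant variance: studentized Breusch-Pagan statistic, computed as n * R^2
# from regressing the squared residuals on the model's predictors.
tract_data$squared_residuals <- residuals(model_2)^2
auxiliary_fit <- lm(squared_residuals ~ median_income + pct_university,
                    data = tract_data)
bp_statistic <- n_tracts * summary(auxiliary_fit)$r.squared
bp_p_value <- pchisq(bp_statistic, df = 2, lower.tail = FALSE)

# Multicollinearity: variance inflation factor of median_income, from the R^2
# of regressing it on the other predictor.
income_on_others <- lm(median_income ~ pct_university, data = tract_data)
vif_income <- 1 / (1 - summary(income_on_others)$r.squared)

# The model that passes the checks is kept under the required name.
final_model <- model_2
```

Storing each check in a named object keeps the script readable and makes it easy to report which assumption a rejected model violated.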

GRADING

R Script: 10 Marks

The script you submit should be fully reproducible, which means the TA should be able to run your script without modification. The only allowable modification would be the file path for the CSV file of your additional input variables. Review the R Script grading scale below.

The general structure of your R script should follow:

1. Data Munging:

a. Reading Data

b. Merging Data

2. Graphical Analysis Pre-Check

3. Data Transformations

4. Correlation Assessment

5. Model Fitting and model assumption assessment (10 models)

a. If one assumption is broken you can continue to the next model.

i. No need to test every assumption in that case

6. Spatial Autocorrelation Assessment

7. Spatial Autoregressive Modelling
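
Steps 3 and 4 of this structure can be sketched numerically in base R. The example below uses synthetic data with hypothetical variable names; the moment-based skewness measure and the 0.8 correlation cutoff are illustrative assumptions, not values prescribed by the assignment. The graphical counterparts for step 2 would be base R's hist() and pairs().

```r
set.seed(376)
n_tracts <- 150

# Synthetic stand-in data; all variable names and values are hypothetical.
tract_data <- data.frame(
  house_value = rlnorm(n_tracts, meanlog = 12.8, sdlog = 0.4),
  median_income = rnorm(n_tracts, mean = 70, sd = 15)
)
# Built to be nearly collinear with median_income so the check below fires.
tract_data$average_income <- 1.1 * tract_data$median_income +
  rnorm(n_tracts, sd = 2)
tract_data$pct_renters <- runif(n_tracts, min = 10, max = 80)

# Moment-based sample skewness: close to 0 for a symmetric distribution.
sample_skewness <- function(values) {
  centred <- values - mean(values)
  mean(centred^3) / mean(centred^2)^1.5
}

# Step 3: compare skewness before and after a log transformation.
skewness_raw <- sample_skewness(tract_data$house_value)
tract_data$log_house_value <- log(tract_data$house_value)
skewness_log <- sample_skewness(tract_data$log_house_value)

# Step 4: pairwise correlations among the candidate predictors.
predictor_names <- c("median_income", "average_income", "pct_renters")
predictor_correlations <- cor(tract_data[, predictor_names])

# Flag each strongly correlated pair once (upper triangle only).
correlation_cutoff <- 0.8  # illustrative threshold, an assumption
is_high <- abs(predictor_correlations) > correlation_cutoff &
  upper.tri(predictor_correlations)
high_pairs <- which(is_high, arr.ind = TRUE)
```

Each row of high_pairs gives the row and column index of a predictor pair that probably should not enter the same model together.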

R Script Grading:

10 / 10: The code is properly documented with comments and detailed variable names. No issues are present in the code. A person versed in R should be able to read through the code in one attempt.

9 / 10: The code is well documented. A single error, inconsistency, or instance of poor variable naming or documentation is present. A reviewer may need to make a single check of previous code to interpret it.

8 / 10: The code is documented. A couple of errors, inconsistencies, or instances of poor variable naming or documentation are present. A reviewer may need to make multiple checks of previous code to interpret it.

7 / 10: The code is documented. A few errors, inconsistencies, or instances of poor variable naming or documentation are present. A reviewer needs to make multiple checks of previous code to interpret it but can understand all sections of the code.

6 / 10: The code is partially documented. Errors, inconsistencies, and poor variable names are present. A reviewer needs to make multiple checks of previous code to interpret it and may not completely understand all sections of the code.

5 / 10: The code is sparsely documented. Many errors, inconsistencies, and poor variable names are present. A reviewer needs to make multiple checks of previous code to interpret it and does not completely understand all sections of the code.

4 or below: Many inconsistencies are present in the code. It could not be reproduced by another researcher without many questions directed to the original author.

Missing assignment requirements in the code will also reduce your mark.

• Too few linear models in the code (-1 for each missing model)

• The final model is not renamed final_model (-1)

• Model Assumptions not tested (-1 for each assumption)

• Moran’s I not tested correctly (-2)

• Code will not run when tested (-3)

• Other errors will be penalized as appropriate.

To achieve a mark above 8, you will likely need to rewrite your code after you have worked through the assignment, to ensure clarity.

CSV File: 2 Marks

The CSV file should contain all the variables that you obtained from the Census for testing in your model. It must contain 10 variables.

Questions (32 Marks)

All figures must include a figure caption.

1. Complete the following table. (1 Mark)

Variable Name in CSV File | Min | Max | Mean | Variable Description

2. Complete the following table. (2 Marks)

Variable Name in CSV File | Reason why you selected the variable

3. Produce a publication quality histogram of the dependent variable (transformed if you did a transformation). (3 Marks)

4. Write 50 words on why you did or did not transform your dependent variable based on the assumptions of the linear regression model. (2 Marks)

5. Describe in 200 words your process of model fitting. Address the selection of variables, how you decided to remove or add variables, and the way you assessed each assumption. (4 Marks)

6. Complete the following table. (2 Marks)

Model Name | R² | p < 0.05 (Y/N) | Assumption(s) violated, or all assumptions met?

7. For your final linear regression model, produce a figure from the 4 plots generated by plot(linear_model). (2 Marks)

8. Produce a publication quality figure of residuals vs fitted values for your final linear regression model. (3 Marks)

9. Calculate Moran’s I for your residuals. Report, in 50 words, your value of Moran’s I and how you interpret it. (3 Marks)

10. Write 150 words interpreting your final linear regression model. (4 Marks)

11. Would you require a spatial autoregressive model? Explain how you would have chosen the model to use. (3 Marks)

12. Produce a map of a spatial autoregressive model’s residuals. (3 Marks)
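
For Question 9, it can help to see what Moran’s I measures before calling a packaged function. The sketch below computes it by hand in base R for the residuals of a toy model on a synthetic 6 x 6 grid. The grid, the rook-contiguity weights, the variable names, and the permutation p-value are all assumptions made for illustration; real census tracts need neighbours derived from their polygons, and packages such as spdep provide tested implementations (for example moran.test and lm.morantest). A plain permutation of regression residuals is only a rough check, since residuals are not strictly exchangeable.

```r
# Row-standardised rook-contiguity weights for a regular grid (illustrative).
rook_weights <- function(n_rows, n_cols) {
  cells <- expand.grid(row = seq_len(n_rows), col = seq_len(n_cols))
  # A Manhattan distance of exactly 1 means two cells share an edge.
  distance <- as.matrix(dist(cells, method = "manhattan"))
  adjacency <- (distance == 1) * 1
  adjacency / rowSums(adjacency)
}

# Moran's I = (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,
# where d is the deviation from the mean and S0 is the sum of all weights.
moran_i <- function(values, weights) {
  deviations <- values - mean(values)
  cross_products <- sum(weights * outer(deviations, deviations))
  (length(values) / sum(weights)) * cross_products / sum(deviations^2)
}

set.seed(376)
grid_rows <- 6
grid_cols <- 6
n_cells <- grid_rows * grid_cols
cells <- expand.grid(row = seq_len(grid_rows), col = seq_len(grid_cols))
weights <- rook_weights(grid_rows, grid_cols)

# Synthetic data: the response has a north-south trend that the model omits,
# so the trend ends up in the residuals.
predictor <- rnorm(n_cells)
response <- 2 * predictor + 0.5 * cells$row + rnorm(n_cells, sd = 0.5)
toy_model <- lm(response ~ predictor)
toy_residuals <- residuals(toy_model)

observed_i <- moran_i(toy_residuals, weights)

# One-sided permutation p-value: how often does shuffling the residuals across
# cells give a Moran's I at least as large as the observed one?
n_permutations <- 999
permuted_i <- replicate(n_permutations, moran_i(sample(toy_residuals), weights))
p_value <- (sum(permuted_i >= observed_i) + 1) / (n_permutations + 1)
```

A clearly positive observed_i with a small p_value would indicate that neighbouring residuals resemble each other, which is the situation where the independent errors assumption is in doubt.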

