Date: 2018-04-26 12:47

Final Report – World Bank Health Data

Small Group Effort - 200 points

Instructions:

The final report is a professional team report on country-level fertility rates and factors that influence fertility rates.

  • The report should be written in Word, with key figures and tables placed in the document to illustrate your narrative. Print/save in PDF format for submission. You are the data analysis team, submitting a report to a policy expert. Pay close attention to formatting, utilizing highlight boxes, bullets, headings, etc. appropriately.

  • Though the report will include analyses similar to homework assignments, the emphasis in this report is on presentation and interpretation.

  • The report should only contain your conclusions, discussion and the supporting figures. Don’t put any code or script output into the report, as it will be looked at by non-tech people. You will submit all that as supplemental files instead (see below).

  • This will be a group report, with individual contributions evaluated via anonymous peer feedback. Team members can receive from 0-100% of the graded group report points depending upon the extent of their contributions.

1. Familiarize yourself with the World Bank Health data “wbh.csv”, using the provided descriptions of variables. Then:

(a) Subset the data for year 2010 only.

(b) Remove NA values from the data. First drop the columns whose NA rate is above 15%, then remove rows with any NA values.

2. Address the possibility that bias was introduced through the refinement steps needed to create the dataset for 2010.

(a) Is the subset of countries included in this dataset representative?

(b) Is 2010 a representative year?

3. Reduce the number of predictors in this dataset based upon an understanding of the structure of this data.

(a) Use exploratory data analysis and unsupervised learning techniques to study the structure of this dataset. Discuss in some depth.

(b) Select a subset of predictors (~10-15) that capture most of the information relevant for the study of country-level fertility rates. Justify your decisions.

CPT_S/Stat 115 Oles/Ye Spring, 2018

(c) Construct a new dataset with the variables from 3(b) (numeric, integer or categorical, as appropriate), and rename the variables with easily interpreted descriptors.

4. Construct, evaluate and interpret supervised learning models on the data subset from 3c.

(a) Ordinary least squares regression

(b) Decision trees

(c) Random forest

(d) Compare these three approaches for accuracy.

5. What did you learn?

(a) Based upon your analyses, what are the primary and secondary factors controlling country-level fertility rates? Comment on the magnitude/direction of relationships.

(b) Mention any countries that are outliers, and discuss the possible reasons.
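
The order of the cleaning steps in question 1(b) matters, and a small sketch may make that concrete. The following is only an illustration in plain Python, not the course's required workflow (the analyses themselves are expected in an .Rmd file). The column name `year`, the NA encodings (`""` and `"NA"`), and the helper names `is_na`, `clean_year` and `load_rows` are all assumptions, since the layout of “wbh.csv” is not shown here.

```python
import csv

NA_TOKENS = {"", "NA"}  # assumed encodings of a missing value


def is_na(value):
    """Return True when a cell holds one of the assumed NA encodings."""
    return value is None or value.strip() in NA_TOKENS


def clean_year(rows, year="2010", year_col="year", max_na_rate=0.15):
    """Keep one year, drop sparse columns, then drop incomplete rows."""
    subset = [r for r in rows if r[year_col] == year]
    if not subset:
        return []
    # NA rate per column, measured on the one-year subset only.
    kept_cols = [
        col for col in subset[0]
        if sum(is_na(r[col]) for r in subset) / len(subset) <= max_na_rate
    ]
    # Column filter first, row filter second, as in question 1(b).
    trimmed = [{c: r[c] for c in kept_cols} for r in subset]
    return [r for r in trimmed if not any(is_na(v) for v in r.values())]


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by header name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Toy data (invented for illustration): ten 2010 rows plus one 2009 row.
demo = [
    {"country": f"C{i}", "year": "2010", "fertility": "2.0", "sparse": "1"}
    for i in range(10)
]
demo[0]["fertility"] = "NA"                   # 1 of 10 missing: column kept
demo[1]["sparse"] = demo[2]["sparse"] = "NA"  # 2 of 10 missing: column dropped
demo.append({"country": "C0", "year": "2009", "fertility": "3.0", "sparse": "1"})

cleaned = clean_year(demo)
```

In the toy data, dropping the sparse column first means only one country is lost to the row filter. Filtering rows first would have discarded three countries, which is relevant to the representativeness question in 2(a). Under the same assumptions, the real file could be processed with `clean_year(load_rows("wbh.csv"))`.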

Submission details

Besides the report in PDF format, submit two supplementary files, Appendix A and Appendix B (see below), and the .Rmd file used to produce Appendix A. There are 4 files to submit in total.

Appendix A - HTML file showing all of your analyses from a knitted .Rmd file. Limit output so the report is readable (e.g. no long glimpse outputs). Make sure there is no extraneous material left over from the lectures or your prior homework assignments.

Appendix B - Detailed contributions. For each member of your team, provide a detailed description of their contributions, specifying questions and subquestions as needed.

Final Report - Grading Rubric

Component                                     Excellent   Acceptable   Needs Improvement
Question 1                                    36-40       31-35        0-30
Question 2                                    26-30       21-25        0-20
Question 3                                    36-40       31-35        0-30
Question 4                                    36-40       31-35        0-30
Question 5                                    23-25       20-22        0-19
Quality of writing and report organization    23-25       20-22        0-19

Being a good team member:

  • Give everyone a chance to participate.

  • Don’t rush ahead and do everything yourself.

  • Respond to your team’s emails and meeting requests.

  • Do your share of the work, including discussion, analysis and writing.

Be respectful at all times! I don’t expect all contributions to be the same. Everyone has strengths and weaknesses. But don’t let one person do all the writing, one do all the analysis and a third do all the coding. Everyone should contribute to all aspects of the report!

