Date: 2022-04-27 09:20

MAS61006 Assessed Project

This project counts for 40% of the assessment for MAS61006.

1 Aim

The aim of this project is to assess the skills in Bayesian modelling via computational methods that you have learned on this module. Exploration and choice of an appropriate modelling approach, as well as how you disseminate your Bayesian inference to a general audience, are key elements of this assessment.

2 Background

You are a statistician working with a pharmaceutical company that markets a birth control drug. Your involvement with the client is to assist them in better understanding their potential customer base by investigating a range of demographic variables that may be linked to the uptake of birth control by a woman. Primary interest is in:

• Identifying the key demographic variables that have an effect on birth control use,
• Quantifying any such demographic variable effect, and
• Predicting the chance of certain demographic groups purchasing birth control in the future (to know their key marketing groups).

There is a single deliverable for this project, in the form of a written report.

3 Data

The data made available to you by the client are the results of a market research investigation. The file is called birth-control-data.csv and is available on the course Blackboard page. It contains information on 1,934 women regarding the following variables:

• birthControl: a binary (0/1) response for whether the subject uses birth control (1 encodes use of birth control and 0 non-use),
• region: a factor variable describing the primary care region that the subject belongs to. Note that this market research involved 60 care regions, which does not cover the full range of the company’s target market,
• homeStyle: a factor variable indicating whether the subject lives in a rural or urban area (0 encodes rural and 1 urban),
• children: the number of children the subject has. Note that the average number of children an individual has in this market research study is 2.65,
• age: the age of the subject. Note that this variable has been standardised, so that the average age in this study is 0,
• wealth: the financial wealth of the subject. Note that this variable has been standardised, so that the average wealth measure in this study is 0.

You can import this data to R using the read_csv() function in the usual way.

4 Scope of analysis

The requirements of this analysis are:

1. Explore the results of the market research data, focusing on the outcome variable birthControl with respect to the remaining explanatory variables. Produce appropriate figures/tables to summarise this.

2. Carry out a Bayesian regression-based analysis for the outcome variable birthControl. You should justify your chosen regression model and the form of the linear predictor involved in this model. You may implement ‘improper’/uninformative priors for your parameters of interest. Your Bayesian inference should be implemented in Stan, and you are expected to write your Stan model yourself (i.e. you should not use the brms package).

3. Check the convergence diagnostics of your inference approach.

4. Present your inference findings (potentially in graphical or tabular form), and disseminate this information to a general reader. The primary focus of the client is quantifying the effects of the demographic variables on the use of birth control.
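
As an illustration of what the form of the linear predictor in requirement 2 might look like, the following is a minimal sketch of one candidate model: a logistic regression with a region-level intercept. It is not a prescribed answer. The symbols $y_i$, $p_i$, $\alpha$, $\beta_k$, $u_j$ and $\sigma_u$ are notation assumed here and do not appear in the brief, and the choice of covariates and of any region effect still needs to be justified from your own exploration.

```latex
\begin{align*}
y_i \mid p_i &\sim \text{Bernoulli}(p_i), \\
\operatorname{logit}(p_i) &= \alpha + u_{j[i]}
  + \beta_1\,\text{homeStyle}_i + \beta_2\,\text{children}_i
  + \beta_3\,\text{age}_i + \beta_4\,\text{wealth}_i, \\
u_j \mid \sigma_u &\sim \text{N}(0, \sigma_u^2), \qquad j = 1, \dots, 60.
\end{align*}
```

Here $y_i$ is birthControl for subject $i$ and $j[i]$ is that subject's care region. Under a model of this form, $\exp(\beta_k)$ is the multiplicative change in the odds of birth control use per unit increase in the corresponding covariate, which is one way to express effect sizes for a general reader.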

To obtain a distinction your analysis should include (in addition to the above):

5. A ‘proper’ prior distribution, along with a brief check of the effect of this prior information on your posterior inferences. Note that you should not do two separate regression analyses: do a single regression analysis with your chosen prior.

6. Posterior predictions of the birth control use by an average woman who lives in an urban setting but in an unknown care region.

7. Out of the 60 primary care regions included in the marketing study, the two with the largest population sizes are regions 1 and 14. The marketing department are interested in which of these two largest regions would be best to focus their marketing efforts on in the future. Compare the posterior predictions of the birth control use by an average woman who lives in an urban setting in each of the regions 1 and 14.
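
The predictions in requirements 6 and 7 can be sketched as follows. Assume, purely for illustration, a logistic model with intercept $\alpha$, coefficients $\beta_1$ (homeStyle) and $\beta_2$ (children), region intercepts $u_j \sim \text{N}(0, \sigma_u^2)$, and $S$ posterior draws indexed by $s$. None of this notation comes from the brief. Reading "average woman" as having age and wealth equal to 0 (both are standardised) and children equal to the study mean of 2.65 is also an assumption about how to interpret the requirement.

```latex
\begin{align*}
\eta^{(s)}_{\text{new}} &= \alpha^{(s)} + \beta_1^{(s)} + 2.65\,\beta_2^{(s)} + u^{(s)}_{\text{new}},
  & u^{(s)}_{\text{new}} &\sim \text{N}\!\left(0, \big(\sigma_u^{(s)}\big)^2\right), \\
\eta^{(s)}_{j} &= \alpha^{(s)} + \beta_1^{(s)} + 2.65\,\beta_2^{(s)} + u^{(s)}_{j},
  & j &\in \{1, 14\}, \\
p^{(s)} &= \frac{1}{1 + \exp\!\big(-\eta^{(s)}\big)}, \\
\Pr(p_1 > p_{14} \mid \text{data}) &\approx \frac{1}{S} \sum_{s=1}^{S}
  \mathbf{1}\!\left[\, p_1^{(s)} > p_{14}^{(s)} \,\right].
\end{align*}
```

For the unknown region, a fresh region effect is drawn from its population distribution in each posterior draw, so the prediction carries between-region uncertainty. For regions 1 and 14, the fitted region effects are used directly.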

5 The written report

Your report must be prepared using R Markdown, using the template (template.Rmd) provided. Do not modify the YAML, apart from inserting your own registration number in the author field. There is a page limit of 6 pages for requirements 1-4 above, and 8 pages if you are also completing requirements 5-7. There is no need for a title page or table of contents. These page limits include everything.

Your report should not contain any R code, but you should submit your .Rmd file alongside your PDF report. You should write your report with an intended audience of a client with the same knowledge as another student on this course. You can therefore assume a working knowledge of regression models, but should explain your results clearly. The grading of your project will give equal weighting to:

• The presentation/communication within your report, and
• The technical content of your analysis.

The body of the report should be structured with the following sections:

1. Introduction
   • Begin with a short summary, including background and the objectives of the investigation.
   • Outline briefly the structure of the remaining report.

2. Methods (split into subsections if appropriate)
   • In this section you should include a short exploratory data analysis of the market research data.
   • Plots and tables should be presented to a high standard, with properly labelled axes, suitable sizing, and captions that include a conclusion.
   • This exploration should conclude with your chosen modelling approach (and justification for such, given your findings).

3. Results (split into subsections if appropriate)
   • In this section you should present your findings.
   • Plots and tables should be presented to a high standard, with properly labelled axes, suitable sizing, and captions that include a conclusion.

4. Conclusions and discussion
   • State your conclusions, and explain how they are justified based on your methods and results.

6 Unfair Means

Your project submission must be entirely your own work: do not discuss your project with anyone apart from staff teaching this module. If you haven’t already done so, you should work through the tutorial on unfair means available on the MSc Statistics organisation page on Blackboard.

7 Submitting your work

Upload both your PDF and .Rmd files using the Assignment Dropbox on Blackboard. Use the file names

MAS61006ProjectReportxxxxxxxxxx.pdf

MAS61000ProjectCodexxxxxxxxx.Rmd

replacing xxxxxxxxx with your student registration number.

