Date: 2023-04-26 08:57

MATH38172 Generalised Linear Models

Coursework 2023

Instructions. Attempt the questions below and submit your work online via Blackboard by the deadline of 3pm on Friday 28th April 2023. Your submission must be a single file. It may contain any sensible mix of word-processed and (scanned) handwritten parts, for example using LaTeX, RMarkdown or Microsoft Word. You should include any R code used. A complete solution is possible in 5 pages of 10-point font; please limit your response to 10 pages at most. The coursework may take up to 10 hours to complete. The submitted work MUST be your own. Plagiarism will not be tolerated and will result in serious consequences if discovered.

Background. The file smokingdata.csv on Blackboard contains data on the relationship between smoking and health for 1314 women in Northern England. The women are grouped according to the combination of levels of two variables:

Age: the women’s age in the initial survey in the 1970s (categories: 18-25, 25-34, 35-44, 45-54, 55-64, 65-74, 75+);

Smoking: the women’s smoking status in the original survey (categories: NonSmoker, Smoker).

For example, one group is the set of women who were non-smokers aged 25-34. For each group, the following two variables have been collected:

Alive: the number of women in the group that were still alive 20 years after the original survey;

Dead: the number of women in the group that were dead 20 years after the original survey.
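
As a notational sketch only (it is not part of the original handout), grouped counts of this kind are commonly treated as binomial data. The symbols below are assumptions introduced here for illustration: n_i = Alive_i + Dead_i is the size of group i, p_i is the probability of death within 20 years for a woman in group i, and \eta_i is a linear predictor whose terms are left to the questions.

```latex
\mathrm{Dead}_i \sim \mathrm{Binomial}(n_i,\, p_i), \qquad
\log\frac{p_i}{1 - p_i} = \eta_i, \qquad
p_i = \frac{e^{\eta_i}}{1 + e^{\eta_i}} .
```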

Questions

1. Read the dataset into R. (1 mark)

2. (a) Fit a logistic regression model to explain the probability of death within 20 years, using smoking status as the ONLY explanatory variable, and present the summary for the fitted model. (1 mark)

(b) Write down the fitted model in equation form and interpret its parameters, including the parameter values. Do you notice anything unusual about the parameter estimates? (3 marks)

3. (a) Fit a logistic regression model to explain the probability of death within 20 years, using BOTH age and smoking status as explanatory variables, and present the summary for the fitted model. (1 mark)

(b) Write down the fitted model in equation form and interpret its parameters, including the parameter values. (3 marks)

(c) Compare your answers in 2(b) and 3(b). What do you notice? What is the reason for any differences? (3 marks)

(d) Assess whether there is significant evidence that the probability of death depends on (i) smoking status, or (ii) age. Give details of which tests are used, equations for the test statistic, critical value, etc. (4 marks)

4. (a) Using the model you fitted in Question 3, estimate the probability of death within 20 years for a woman aged 55-64 who does not smoke. (1 mark)

(b) Find a 95% confidence interval for the probability in 4(a). Explain your working. (3 marks)

