
Date: 2018-09-30 05:53

Homework Three

Karla Ballman

26 September 2018

DUE DATE

This assignment is due at noon on Wednesday, October 3, 2018.

Human papillomavirus (HPV) is the most common sexually transmitted infection in the United States. Some

HPV types can cause genital warts and are considered low risk, with a small chance for causing cancer. Other

types are considered high risk, causing cancer in different areas of the body including the cervix and vagina

in women, penis in men, and anus and oropharynx in both men and women. A report provides the most

recent national estimates of genital HPV prevalence among adults aged 18–59 from the National Health and

Nutrition Examination Survey (NHANES) 2013–2014. During 2013–2014, prevalence of any genital HPV was

42.5% among adults aged 18–59. This information applies to Questions 1-4.

Question 1: Sampling distribution for a sample proportion

1(a) If we use a sample proportion, P, based on a sample of size n = 20, to estimate the

population proportion, π = 0.425, would it be very surprising to get an estimate that is off

by more than 0.10 (that is, the sample proportion is less than 0.325 or greater than 0.525)?

Support your answer.

1(b) If we use a sample proportion, P, based on a sample of size n = 100, to estimate the

population proportion, π = 0.425, would it be very surprising to get an estimate that is off by

more than 0.10? Support your answer.

1(c) If we use a sample proportion, P, based on a sample of size n = 500, to estimate the

population proportion, π = 0.425, would it be very surprising to get an estimate that is off by

more than 0.10? Support your answer.

1(d) If we use a sample proportion, P, based on a sample of size n = 20, to estimate the

population proportion, π = 0.425, would it be very surprising to get an estimate that is off

by more than 0.05 (that is, the sample proportion is less than 0.375 or greater than 0.475)?

Support your answer.

1(e) If we use a sample proportion, P, based on a sample of size n = 100, to estimate the

population proportion, π = 0.425, would it be very surprising to get an estimate that is off by

more than 0.05? Support your answer.

1(f) If we use a sample proportion, P, based on a sample of size n = 500, to estimate the

population proportion, π = 0.425, would it be very surprising to get an estimate that is off by

more than 0.05? Support your answer.

1(g) Using the information obtained in your answers above, comment on the effect that

sample size has on the accuracy of an estimate.
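
One way to reason about parts 1(a)-1(f) is to compute the exact binomial probability that the sample proportion lands more than a given distance from π. The sketch below is a minimal illustration in Python (the assignment itself works in R); the function names are hypothetical, and it assumes the number of infected individuals in the sample follows a Binomial(n, π) distribution.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_off_by_more_than(n, pi, d):
    """Exact P(|X/n - pi| > d) for X ~ Binomial(n, pi).

    Assumption: "off by more than d" is a strict inequality. For the
    sample sizes used here no attainable proportion sits exactly d away
    from pi, so floating-point ties at the boundary do not arise.
    """
    return sum(binom_pmf(k, n, pi)
               for k in range(n + 1)
               if abs(k / n - pi) > d)

# Tabulate the probability for each sample size and tolerance in Question 1.
for d in (0.10, 0.05):
    for n in (20, 100, 500):
        print(f"n={n:3d}  d={d:.2f}  P(off by more than d)={prob_off_by_more_than(n, 0.425, d):.4f}")
```

Whether a given probability counts as "very surprising" is a judgment call that the answer still has to justify; the code only supplies the number.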

Question 2: Maximum likelihood

Suppose we take a random sample of n = 222 adults aged 18-59 and determine that 98 individuals

were infected with genital HPV.

2(a) Draw the likelihood function for an estimate of the population parameter as a function of

π (with sample size n = 222).

2(b) What is the maximum likelihood estimate of the population proportion based on the

observed data?

2(c) Suppose that the sample size is increased to n = 444 random adults aged 18-59 and

we observe that 196 individuals were infected with genital HPV. How does the likelihood

function change from that in 2(a)? Explain and/or show.

2(d) What is the maximum likelihood estimate of the population proportion based on the

observed data in 2(c)? How does it compare to the estimate in 2(b)?
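
For Question 2, the binomial likelihood can be evaluated on a grid of candidate values of π. The following is a rough sketch in Python rather than R, with hypothetical helper names. It works with the log-likelihood, drops the constant binomial coefficient, and uses a "support width" (the range of π whose likelihood is at least half the maximum) as one possible, assumed way to describe how peaked the curve is. Plotting is left out; the grid values could be passed to any plotting tool.

```python
from math import exp, log

# Candidate values of pi, excluding 0 and 1 where the log is undefined.
GRID = [i / 1000 for i in range(1, 1000)]

def log_likelihood(pi, x, n):
    """Binomial log-likelihood of pi given x successes in n trials,
    omitting the constant term log C(n, x)."""
    return x * log(pi) + (n - x) * log(1 - pi)

def grid_mle(x, n):
    """Grid value of pi with the largest likelihood."""
    return max(GRID, key=lambda p: log_likelihood(p, x, n))

def relative_likelihood(pi, x, n):
    """Likelihood at pi divided by the likelihood at x/n (the analytic maximum)."""
    return exp(log_likelihood(pi, x, n) - log_likelihood(x / n, x, n))

def support_width(x, n, cutoff=0.5):
    """Width of the grid region where the relative likelihood is >= cutoff."""
    inside = [p for p in GRID if relative_likelihood(p, x, n) >= cutoff]
    return max(inside) - min(inside)

for x, n in ((98, 222), (196, 444)):
    print(f"x={x}, n={n}: grid MLE={grid_mle(x, n):.3f}, "
          f"x/n={x / n:.4f}, support width={support_width(x, n):.3f}")
```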

Question 3: Confidence interval for HPV

Suppose that you obtain a random sample of adult individuals age 18-59 and determine whether they were

infected with genital HPV (π = 0.425). Use the R function binom.test().

3(a) Use R to get a random sample of n = 20 from adults 18-59. Determine the width of the

exact 99% confidence interval for π for your sample. To get the width, you would subtract

the lower bound of the 99% confidence interval from the upper bound of the 99% confidence

interval.

3(b) Use R to get a random sample of n = 100 from adults 18-59. Determine the width of the

exact 99% confidence interval for π for your sample.

3(c) Use R to get a random sample of n = 500 from adults age 18-59. Determine the width of

the exact 99% confidence interval for π for your sample.

3(d) How do the widths of your intervals in 3(a) - 3(c) compare?
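
The "exact" interval reported by binom.test() is, to the best of my understanding, the Clopper-Pearson interval. The sketch below shows that construction in Python using only the standard library; it finds each limit by bisection on a binomial tail probability. The function names and the seed are arbitrary choices for illustration, and a different seed gives a different sample and a different width.

```python
import random
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(x, n, p):
    """P(X >= x)."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

def lower_tail(x, n, p):
    """P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def clopper_pearson(x, n, conf=0.99, iterations=60):
    """Clopper-Pearson (exact) confidence limits for a binomial proportion."""
    alpha = 1 - conf
    # Lower limit: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    if x == 0:
        lower = 0.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if upper_tail(x, n, mid) < alpha / 2:
                lo = mid  # tail still too small, so the limit lies to the right
            else:
                hi = mid
        lower = (lo + hi) / 2
    # Upper limit: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    if x == n:
        upper = 1.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2
            if lower_tail(x, n, mid) > alpha / 2:
                lo = mid  # tail still too large, so the limit lies to the right
            else:
                hi = mid
        upper = (lo + hi) / 2
    return lower, upper

def draw_infected_count(n, pi, rng):
    """Simulate n independent individuals, each infected with probability pi."""
    return sum(rng.random() < pi for _ in range(n))

rng = random.Random(2018)  # arbitrary seed, chosen only for reproducibility
for n in (20, 100, 500):
    x = draw_infected_count(n, 0.425, rng)
    lower, upper = clopper_pearson(x, n, conf=0.99)
    print(f"n={n:3d}  x={x:3d}  99% CI=({lower:.4f}, {upper:.4f})  width={upper - lower:.4f}")
```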

Question 4: Hypothesis test for HPV

Suppose that you obtain a random sample of n = 123 adult individuals age 18-59 and determine whether

they were infected with genital HPV. Use the R function binom.test().

4(a) Use R to get a random sample of n = 123 from a population of adults age 18-59 with

π = 0.405. Report the sample proportion and perform a test to determine whether it differs

from π = 0.425.

4(b) Use R to get a random sample of n = 123 from a population of adults age 18-59 with

π = 0.445. Report the sample proportion and perform a test to determine whether it differs

from π = 0.425.

4(c) Use R to get a random sample of n = 123 from a population of adults age 18-59 with

π = 0.385. Report the sample proportion and perform a test to determine whether it differs

from π = 0.425.

4(d) Use R to get a random sample of n = 123 from a population of adults age 18-59 with

π = 0.465. Report the sample proportion and perform a test to determine whether it differs

from π = 0.425.

4(e) Use R to get a random sample of n = 123 from a population of adults age 18-59 with

π = 0.365. Report the sample proportion and perform a test to determine whether it differs

from π = 0.425.

4(f) Use R to get a random sample of n = 123 from a population of adults age 18-59 with

π = 0.485. Report the sample proportion and perform a test to determine whether it differs

from π = 0.425.

4(g) Using the information obtained in your answers above, comment on what you observe

about the relationship between the sample proportion, the hypothesized π, and the p-value.
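
The pattern in 4(a)-4(f) can be explored with a small simulation. The sketch below is in Python rather than R and uses hypothetical function names. Its two-sided p-value sums the probabilities of all outcomes that are no more likely than the observed one under the null value, which I believe matches the rule binom.test() applies, though that is an assumption worth checking against the R documentation. The seed is arbitrary.

```python
import random
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_test_two_sided(x, n, p0):
    """Exact two-sided p-value for H0: pi = p0, given x successes in n trials.

    Sums the null probabilities of every outcome whose probability is at
    most that of the observed count (with a small relative tolerance so
    that ties are not lost to rounding).
    """
    probs = [binom_pmf(k, n, p0) for k in range(n + 1)]
    cutoff = probs[x] * (1 + 1e-7)
    return min(1.0, sum(q for q in probs if q <= cutoff))

def draw_infected_count(n, pi, rng):
    """Simulate n independent individuals, each infected with probability pi."""
    return sum(rng.random() < pi for _ in range(n))

rng = random.Random(2018)  # arbitrary seed, chosen only for reproducibility
n, p0 = 123, 0.425
for true_pi in (0.405, 0.445, 0.385, 0.465, 0.365, 0.485):
    x = draw_infected_count(n, true_pi, rng)
    p_value = binom_test_two_sided(x, n, p0)
    print(f"true pi={true_pi:.3f}  sample proportion={x / n:.3f}  p-value={p_value:.4f}")
```

Because each line comes from a single random sample, the p-values will not necessarily shrink in a tidy order as the true π moves away from 0.425; repeating the simulation with other seeds shows how much they vary.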

Question 5: Normal approximation to the binomial

One out of four women who smoke will die from smoking related diseases.

5(a) What is the probability that out of 1,000 women smokers more than 300 will die from

smoking related diseases? Use a normal approximation to the binomial to estimate this quantity

and compare it to the exact probability from a binomial.

5(b) What is the probability that the number of deaths from smoking related disease will be between 238

and 245 in 1000 random women smokers? Again, compare the estimate obtained from a

normal distribution to that obtained from the binomial distribution.

5(c) What is the probability that out of 1,000 women smokers the sample proportion of women

who will die from smoking related diseases will be greater than 0.30? Use a

normal approximation to the binomial to estimate this quantity and compare it to the exact

probability from a binomial.

5(d) What is the probability that the sample proportion of deaths from smoking related disease

will be between 0.238 and 0.245 in 1000 random women smokers? Again, compare the

estimate obtained from a normal distribution to that obtained from the binomial distribution.

5(e) What is the probability that out of 24 women smokers less than 6 will die from smoking

related diseases? Use a normal approximation to the binomial to estimate this quantity and

compare it to the exact probability from a binomial.

5(f) What is the probability that out of 24 women smokers between 5 and 9 will die from smoking

related diseases? Use a normal approximation to the binomial to estimate this quantity

and compare it to the exact probability from a binomial.

5(g) Which normal approximations to the binomial above were good? Which normal approximations

to the binomial distribution above were poor? How do you explain this?
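
A side-by-side comparison of the exact binomial probability and its normal approximation might look like the sketch below (Python, standard library only, hypothetical function names). Two choices here are assumptions, not something the assignment specifies: "between a and b" is read as inclusive of both endpoints, and the normal approximation uses a continuity correction of 0.5. The sample-proportion questions 5(c) and 5(d) can be handled by converting the proportion limits to counts first.

```python
from math import erf, exp, lgamma, log, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed on the log scale so that
    n = 1000 does not overflow or underflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def exact_prob(lo, hi, n, p):
    """Exact P(lo <= X <= hi), endpoints included."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx(lo, hi, n, p):
    """Normal approximation to P(lo <= X <= hi) with a continuity correction."""
    mu = n * p
    sd = sqrt(n * p * (1 - p))
    return normal_cdf((hi + 0.5 - mu) / sd) - normal_cdf((lo - 0.5 - mu) / sd)

# (label, lowest count, highest count, n); p = 0.25 throughout.
cases = [
    ("5(a) more than 300 of 1000", 301, 1000, 1000),
    ("5(b) 238 to 245 of 1000", 238, 245, 1000),
    ("5(e) fewer than 6 of 24", 0, 5, 24),
    ("5(f) 5 to 9 of 24", 5, 9, 24),
]
for label, lo, hi, n in cases:
    print(f"{label}: exact={exact_prob(lo, hi, n, 0.25):.6f}  "
          f"normal={normal_approx(lo, hi, n, 0.25):.6f}")
```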

Question 6: Confidence interval data analysis

Colon cancer is the most common gastrointestinal malignancy and the second-leading cause of cancer death

in the United States. Approximately 80% of colon cancer patients present with resectable, localized disease,

and in these patients, nodal metastases have long been recognized as the most important factor predicting

long-term survival. Nodal involvement is an important determinant in the decision to administer adjuvant

chemotherapy, and with the demonstration over the last decade of highly effective systemic therapies for colon

cancer, it is essential to ensure that all patients who would benefit from such treatment receive counseling

concerning these therapies and have access to them. Numerous studies have shown an improvement in

disease-specific and overall survival when increasing numbers of lymph nodes are examined for colon cancer.

There has been a considerable effort to determine the minimum number of nodes that need to be evaluated

to deem a patient free of nodal metastases with reasonable certainty. Estimates have varied from 6 to 40

lymph nodes; however, numerous studies and consensus guidelines have suggested that examination of 12

regional lymph nodes is a reasonable minimum for adequate nodal evaluation for colon cancer. Despite these

findings, population-based assessments have shown that the majority of patients in the United States do not

have 12 or more nodes examined.

The American College of Surgeons (ACoS), American Society of Clinical Oncology (ASCO), and the National

Comprehensive Cancer Network (NCCN) harmonized a quality measure requiring resection and pathological

examination of 12 or more lymph nodes for colon cancer. Subsequently, the National Quality Forum

(NQF) endorsed the 12-node measure for quality surveillance. A large, national health-care insurer, United

Healthcare, is already basing referral recommendations for colectomy on a one-time requirement that surgeons

provide pathology reports demonstrating examination of 12 or more lymph nodes for colon cancer. These

organizations deem a hospital to be compliant with the 12-node measure if examination of at least 12 nodes

occurred for at least 75% of patients at that hospital. Due to statistical variation, hospitals were considered

“statistically compliant” if the upper limit of the (two-sided) 95% confidence interval (CI) of the estimate of

their performance rate was greater than or equal to 75%.

Dr. Smith has requested that you analyze the data provided to determine whether the hospital

currently meets this cancer care quality metric for node examination in colorectal cancer

patients. Write a short paragraph that summarizes your findings and answers Dr. Smith’s request.

DATA The data are in a file called crc.csv

BMI: an indicator whether a person has a normal BMI (BMI less than or equal to 25) or an overweight

BMI (BMI greater than 25)

Nodes: the number of lymph nodes examined

Year: year in which the patient underwent her/his colorectal cancer surgery

Question 7: Hypothesis test data analysis

Hopevale Hospital wants to understand the quality of care for their breast cancer patients. In particular,

the quality measurement in which they are most interested is the reoperation rate of women who undergo

lumpectomy. The leadership is interested in how their lumpectomy reoperation rates in women who are newly

diagnosed with breast cancer compare to those of their peer institutions. They know that the average reoperation

rate for a set of their peer institutions is 15%. They would like you to look at the data they provide on 700

procedures and tell them whether they meet or exceed this.

Data The data are in a file called BreastCancerOutcome.csv

size: the size of the cancer in cm

reoperation: indicates whether the patient had to undergo a reoperation

7(a) Is there evidence that the reoperation rate at Hopevale Hospital differs from that of their

peer institutions? Use a significance level of 0.05. Perform a hypothesis test. Write your answer in

a short paragraph that includes the p-value and an indication of how the rate differs (if it is statistically

significant).

7(b) The hospital also collected data on the size of the tumor. It is known that reoperation

rates likely differ for different size tumors. Is there evidence that the reoperation rates for

the different size tumors differ from that of Hopevale Hospital's peer institutions? Use a

significance level of 0.05. Perform two hypothesis tests, one for each tumor size category. Summarize

your findings in a paragraph.

7(c) What is something that would be important to know regarding the tumor size at the peer

hospitals?


