Date: 2018-12-01 10:07

1. Consider the baseball dataset describing the population of baseball players in the data file baseball.csv. Set the seed of your randomization to 2313.

(a) Take a stratified random sample of 150 players, using proportional allocation with the different teams as strata (teams are in column 1 of the data file). Describe how you selected the sample.

(b) Find the mean of the variable logsal = ln(salary), using your stratified sample, and give a 95% CI.

(c) Estimate the proportion of players in the data set who are pitchers, using your stratified sample, and give a 95% CI.

(d) Take a simple random sample of 150 players and repeat part (c). How does your estimate compare with that of part (c)?

(e) Examine the sample variances of logsal in each stratum. Do you think optimal allocation would be worthwhile for this problem?

(f) Using the sample variances from (e) to estimate the population stratum variances, determine the optimal allocation for a sample in which the cost is the same in each stratum and the total sample size is 150. How much does the optimal allocation differ from proportional allocation for this scenario?
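The allocation and estimation steps in parts (a), (b), (c), and (f) can be sketched in Python as follows. This is only an illustrative sketch, not a prescribed solution: the function names are hypothetical, the layout of baseball.csv beyond "teams in column 1" is not assumed, and the formulas are the usual textbook ones (proportional allocation n_h = n N_h / N, equal-cost optimal allocation n_h proportional to N_h s_h, and the stratified mean with a finite population correction). Rounding to whole players uses a largest-remainder rule, which is one reasonable choice among several.

```python
import math

def _round_to_total(shares, n):
    """Largest-remainder rounding so the integer allocations sum to n."""
    alloc = {k: math.floor(v) for k, v in shares.items()}
    leftover = n - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda k: shares[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

def proportional_allocation(sizes, n):
    """n_h proportional to the stratum size N_h."""
    N = sum(sizes.values())
    return _round_to_total({k: n * Nh / N for k, Nh in sizes.items()}, n)

def optimal_allocation(sizes, sds, n):
    """Equal-cost optimal allocation: n_h proportional to N_h * s_h."""
    weights = {k: sizes[k] * sds[k] for k in sizes}
    total = sum(weights.values())
    return _round_to_total({k: n * w / total for k, w in weights.items()}, n)

def stratified_mean_ci(sizes, samples, z=1.96):
    """Stratified mean and approximate 95% CI (each stratum needs n_h >= 2).

    sizes:   {stratum: N_h}, samples: {stratum: list of sampled values}.
    Passing 0/1 indicators (e.g. pitcher or not) gives a proportion.
    """
    N = sum(sizes.values())
    mean, var = 0.0, 0.0
    for k, y in samples.items():
        Nh, nh = sizes[k], len(y)
        ybar = sum(y) / nh
        s2 = sum((v - ybar) ** 2 for v in y) / (nh - 1)
        mean += (Nh / N) * ybar
        var += (Nh / N) ** 2 * (1 - nh / Nh) * s2 / nh
    half = z * math.sqrt(var)
    return mean, (mean - half, mean + half)
```

Within each team, the players themselves could be drawn with something like `random.Random(2313).sample(team_rows, n_h)`, where `team_rows` is a hypothetical list of that team's rows; comparing the two allocation dictionaries team by team then addresses the question in part (f).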

2. Use the population data set hh18.csv, with N = 251 pairs of measurements of height, x, and handspan, y, from our class, to compare regression and ratio estimation for estimating the mean handspan μy, using information from a sample of size n = 10. Set the seed of your randomization to 1234.

(a) Compute an SRS estimator, a ratio estimator, and a regression-based estimator of the population mean handspan μy.

(b) Find the error of estimation, |μ̂y − μy|, for each of the three estimators in part (a) and compare them.

(c) Compute and compare the estimated variances of the three estimates.
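The three estimators in parts (a) to (c) could be computed along the following lines. This is a minimal sketch under stated assumptions: the function name and argument names are hypothetical, the population mean height μx is taken as known (it can be computed from the full hh18.csv file), and the variance formulas are the common textbook versions, with divisor n − 1 for the ratio residuals and n − 2 for the regression residuals.

```python
def mean_estimators(x, y, mu_x, N):
    """Return {name: (estimate, estimated variance)} for estimators of mu_y.

    x, y: sampled heights and handspans; mu_x: known population mean of x;
    N: population size.
    """
    n = len(x)
    fpc = 1 - n / N                      # finite population correction
    xbar = sum(x) / n
    ybar = sum(y) / n

    # SRS estimator: the plain sample mean of y.
    s2_y = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
    srs = (ybar, fpc * s2_y / n)

    # Ratio estimator: r * mu_x with r = ybar / xbar.
    r = ybar / xbar
    s2_r = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    ratio = (r * mu_x, fpc * s2_r / n)

    # Regression estimator: ybar + b * (mu_x - xbar), b = least-squares slope.
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    mse = sum((yi - ybar - b * (xi - xbar)) ** 2
              for xi, yi in zip(x, y)) / (n - 2)
    regression = (ybar + b * (mu_x - xbar), fpc * mse / n)

    return {"srs": srs, "ratio": ratio, "regression": regression}
```

The sample indices might be drawn with `random.Random(1234).sample(range(251), 10)`. The error of estimation in part (b) is then the absolute difference between each estimate and the true μy computed from all 251 rows.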

3. A market research firm constructed a sampling plan to estimate the weekly sales of brand A cereal in a certain geographic area. The firm decided to sample cities within the area and then to sample supermarkets within cities. The number of boxes of brand A cereal sold in a specified week is the measurement of interest. Five cities are sampled from the 20 in the area. Using the data given in the accompanying table, answer the following:

City   Number of supermarkets   Supermarkets sampled   ȳi    s²i
1      45                       9                      102   20
2      36                       7                      90    16
3      20                       4                      76    22
4      18                       4                      94    26
5      28                       6                      120   12

(a) Estimate the average sales for the week for all supermarkets in the area. Place a bound on the error of the estimation. Is the estimator you used unbiased?

(b) Do you have enough information to estimate the total number of boxes of cereal sold by all supermarkets in the area during the week? If so, explain how you would estimate this total, and place a bound on the error of estimation.
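One way to carry out the two-stage calculations in parts (a) and (b) is sketched below. It is an illustration under assumptions, not the required method: the function name and the symbols M (supermarkets per sampled city) and m (supermarkets sampled per city) are introduced here for convenience, the total number of supermarkets in the area is treated as unknown (so the mean uses a ratio-type estimator, with the average city size estimated from the sample), and the bound is taken as two estimated standard errors.

```python
import math

def two_stage_estimates(N, M, m, ybar, s2):
    """Two-stage cluster estimates of the per-supermarket mean and the total.

    N: number of cities in the area; M, m, ybar, s2: per-sampled-city lists of
    supermarket counts, supermarkets sampled, sample means, sample variances.
    """
    n = len(M)
    fpc = 1 - n / N
    city_totals = [Mi * yi for Mi, yi in zip(M, ybar)]   # estimated city totals
    M_bar = sum(M) / n                                   # estimated mean city size
    # Within-city (second-stage) variance contribution.
    within = sum(Mi ** 2 * (1 - mi / Mi) * s2i / mi
                 for Mi, mi, s2i in zip(M, m, s2))

    # Ratio-type estimator of the mean per supermarket.
    mu_hat = sum(city_totals) / sum(M)
    s2_r = sum((Mi * (yi - mu_hat)) ** 2 for Mi, yi in zip(M, ybar)) / (n - 1)
    var_mu = fpc * s2_r / (n * M_bar ** 2) + within / (n * N * M_bar ** 2)

    # Estimator of the area total, which only needs N.
    tau_hat = N / n * sum(city_totals)
    mean_total = sum(city_totals) / n
    s2_b = sum((t - mean_total) ** 2 for t in city_totals) / (n - 1)
    var_tau = N ** 2 * fpc * s2_b / n + N / n * within

    return {"mean": mu_hat, "mean_bound": 2 * math.sqrt(var_mu),
            "total": tau_hat, "total_bound": 2 * math.sqrt(var_tau)}

# Values taken from the table in the problem statement.
result = two_stage_estimates(
    N=20,
    M=[45, 36, 20, 18, 28],
    m=[9, 7, 4, 4, 6],
    ybar=[102, 90, 76, 94, 120],
    s2=[20, 16, 22, 26, 12],
)
```

The total estimator here does not use the total number of supermarkets in the area, which is relevant to the question in part (b) about whether the available information is sufficient.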

