1. A random walk board game is played as follows. A rook starts on the bottom row
of a chessboard that is infinite in the left, right and upwards directions. At every
step the rook moves randomly one square up, down, left, or right, where each of
the four directions is equally likely, independent of all other moves. When the rook
falls off the bottom of the board, its x-coordinate is noted (relative to the initial
position which is x = 0).
This game has been played 29172 times for you and the results collected in the file
values.txt.
(a) Apply two different methods to evaluate whether the x values recorded follow
a roughly normal distribution.
(b) Does the distribution appear to have heavy tails or light tails? Discuss any
features of the random walk game which in your view would tend to cause this
property.
(c) Does the distribution of x values appear symmetric from the data? What about
from the description of the game? If there appears to be any discrepancy here,
how do you resolve it?
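For orientation, one play of the game described in this question might be simulated in R along the following lines. This is only a sketch: the function name play_game and the step cap max_steps are assumptions introduced here (the walk can take a very long time to leave the board, so the sketch gives up and returns NA after max_steps moves), and it is not necessarily how values.txt was generated.

```r
# Simulate one play of the rook game; returns the x-coordinate at which
# the rook falls off the bottom, or NA if that has not happened within
# max_steps moves (max_steps is an assumed safety cap, not part of the game).
play_game <- function(max_steps = 1e5) {
  dx <- c(0, 0, -1, 1)   # x change for up, down, left, right
  dy <- c(1, -1, 0, 0)   # y change for up, down, left, right
  x <- 0
  y <- 0                 # the bottom row is taken to be y = 0
  for (step in seq_len(max_steps)) {
    k <- sample.int(4, 1)      # each direction equally likely
    x <- x + dx[k]
    y <- y + dy[k]
    if (y < 0) return(x)       # the rook has fallen off the bottom
  }
  NA_real_
}

set.seed(1)
replicate(5, play_game())
```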
2. A hundred fish were captured at random in a lake and their weights in grams noted.
The data is in the file fish.txt.
(a) Obtain a histogram of the data set. Describe the shape and modality of the
data, as well as any skew.
(b) Apply two standard methods to evaluate heuristically whether the data seem
to be close to normal (i.e., drawn from an approximately normal distribution).
Make sure to include any R input and plots generated in your write-up.
(c) We are interested in the true mean weight μ of all the fish in the lake. Describe
the approximate sampling distribution of X̄, the sample mean, in terms of
properties of the lake’s fish population. Explain any assumptions underlying
this approximation.
(d) Now calculate an approximate 95% confidence interval for μ on the basis of the
data in fish.txt and the sampling distribution you described in (c). Discuss
whether or not your answer to (b) undermines the validity of the confidence
interval you have calculated.
(e) Can you think of any practical issues with how the fish were captured that
might call into question whether X̄ is an unbiased estimator of the true mean μ?
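As a rough illustration of the kind of calculation part (d) has in mind, the sketch below computes a large-sample 95% interval for a mean in R. The weights here are simulated stand-ins, not the contents of fish.txt, and the commented scan() call assumes the file holds plain whitespace-separated numbers, which is not confirmed by this sheet. The normal quantile is one common choice; a t quantile with n - 1 degrees of freedom is a frequently used alternative.

```r
# Stand-in data: simulated weights in grams, NOT the contents of fish.txt.
# With the real file one might instead try: weights <- scan("fish.txt")
# (assuming a plain list of numbers).
set.seed(42)
weights <- rgamma(100, shape = 4, rate = 0.02)

n    <- length(weights)
xbar <- mean(weights)            # sample mean
se   <- sd(weights) / sqrt(n)    # estimated standard error of the sample mean

# Large-sample interval based on the normal approximation
z  <- qnorm(0.975)
ci <- c(lower = xbar - z * se, upper = xbar + z * se)
ci
```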
3. If you are waiting some number of turns for a success that happens with a known
probability p, the expected number of turns until success is exactly 1/p, as we know
from our study of geometric random variables. If you have already been waiting
some time with no success, the expected additional waiting time until success is still
1/p.
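The memoryless property stated above can be checked informally by simulation. The sketch below is illustrative only: the values p = 0.2 and k = 10 are arbitrary choices made here, not part of the question.

```r
set.seed(123)
p <- 0.2    # arbitrary success probability chosen for illustration
k <- 10     # arbitrary number of turns already waited without success

# rgeom() counts the failures before the first success,
# so adding 1 gives the turn on which the success occurs.
turns <- rgeom(1e5, prob = p) + 1

mean(turns)                   # overall mean wait, to compare with 1/p
mean(turns[turns > k] - k)    # mean additional wait after k failures
```

Both averages should come out close to 1/p, up to simulation error.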
If, however, success does not happen with the same probability at each turn (i.e., if
the turns are not i.i.d.), it is possible that the longer you have already been waiting,
the longer you will typically have to wait in addition. This question describes a
simple model that can produce such an effect.
Consider a bag that contains one blue ball and one red ball. Each turn, you take
out a ball chosen uniformly at random from the bag, and return it to the bag along
with another (new) ball of the same colour. So after n turns there will always be
2 + n balls in the bag.
(a) Give an exact expression for the probability that the first time you draw a red
ball from the bag is on turn number n. (You do not have to simplify this.)
(b) In the Week 6 Tutorial you learned how to carry out a loop in R in order to
perform a random experiment that depends on an unknown number of turns.
Use a similar method to write a loop that simulates the number of turns taken
until the first red ball is drawn from the bag. Make sure to show your full R
code for the loop.
(c) Now modify your loop code to simulate the number of turns taken until the first
red ball is drawn from the bag, but this time where the bag starts with 10001
blue balls and one red ball. Run this code five times (to do this, remember you
may put multiple statements all on one line separated by semicolons in order
to simplify matters). Also run the code from part (b) five times.
(d) Suppose that we start with one red ball and one blue ball in the bag. Does it
appear from part (c) that you would typically have to wait less additional time
for the first red ball from this starting position than if you have already waited
10000 turns without finding a red ball? Suggest a statistical testing method
that may be able to verify this proposition. (You do not need to carry out the
test, but make sure to engage with the issues around appropriate assumptions
for whichever test you choose.)
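One possible way to organise the bag experiment from this question in R is sketched below. It is not the Week 6 Tutorial code, which is not reproduced on this sheet. The function name turns_until_red, its arguments, and the cap max_turns are all assumptions introduced here; the cap exists only because the wait can occasionally be extremely long, in which case the sketch returns NA.

```r
# Number of turns until the first red ball is drawn, starting from the
# given numbers of blue and red balls. Returns NA if no red ball appears
# within max_turns turns (max_turns is an assumed safety cap).
turns_until_red <- function(blue = 1, red = 1, max_turns = 1e6) {
  for (turn in seq_len(max_turns)) {
    # A uniformly chosen ball is red with probability red / (blue + red)
    if (runif(1) < red / (blue + red)) return(turn)
    blue <- blue + 1   # a blue ball was drawn: return it plus one new blue ball
  }
  NA_integer_
}
```

Calling turns_until_red() uses the bag from part (b), and turns_until_red(blue = 10001) uses the starting position from part (c).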
4. Five men and six women were sampled uniformly at random from the population.
Their heights in centimetres were as follows:
Let μM and μW be the mean height of all men and women respectively in the
population. Assume that the true variance of the men’s heights and the true variance
of the women’s heights are equal.
(a) Formulate a test (at the 5% significance level) for the null hypothesis
H0 : μM = μW
against the alternative hypothesis
HA : μM ≠ μW.
Explain all assumptions made for this test and why they are reasonable to
assume.
(b) Write down the general formula for the test statistic and its distribution under
the null hypothesis.
(c) Compute the observed value of the test statistic, obtain an exact p-value (using
R) and write your conclusion in a sentence, being careful how you phrase it.
(d) Finally, suppose that we do away with the equal-variance assumption. Perform
Welch’s t-test (with the conservative choice for the degrees of freedom), again at
the 5% significance level, for the hypothesis
H0 : μM = μW
against the alternative hypothesis
HA : μM ≠ μW.
Compare the conclusion with that obtained earlier under the equal-variance
assumption and discuss any difference.
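The mechanics of the two tests in this question can be sketched in R as follows. The heights used here are simulated placeholders, not the sample from this question, and the variable names are introduced only for illustration. The "conservative" degrees of freedom are taken to mean the smaller sample size minus one, which is a common convention but an assumption about what the course intends.

```r
# Placeholder data: simulated heights in cm, NOT the sample from this question.
set.seed(2021)
men   <- rnorm(5, mean = 178, sd = 7)
women <- rnorm(6, mean = 165, sd = 7)

n1 <- length(men)
n2 <- length(women)

# Pooled (equal-variance) two-sample t statistic and two-sided p-value
sp2      <- ((n1 - 1) * var(men) + (n2 - 1) * var(women)) / (n1 + n2 - 2)
t_pooled <- (mean(men) - mean(women)) / sqrt(sp2 * (1 / n1 + 1 / n2))
p_pooled <- 2 * pt(-abs(t_pooled), df = n1 + n2 - 2)

# Welch statistic with the conservative degrees of freedom min(n1, n2) - 1
t_welch <- (mean(men) - mean(women)) / sqrt(var(men) / n1 + var(women) / n2)
df_cons <- min(n1, n2) - 1
p_welch <- 2 * pt(-abs(t_welch), df = df_cons)

c(t_pooled = t_pooled, p_pooled = p_pooled,
  t_welch = t_welch, p_welch = p_welch)
```

The pooled result can be cross-checked against t.test(men, women, var.equal = TRUE). Note that t.test() without var.equal = TRUE performs Welch's test using the Welch-Satterthwaite degrees of freedom rather than the conservative choice, so its p-value will generally differ from p_welch.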