###### Date: 2020-02-14 09:07

STAT 440: Homework 4 Due: 2/11 at 3:00pm

All work must be done using RMarkdown. Turn in the code as well as the output. Clearly denote the

results of each question! If the grader has a hard time finding your answer, I will instruct them to not give

you credit!

1. The Zero-Inflated Poisson distribution is useful for modeling count processes where there are additional

zero values. It is commonly used to model the counts of rare events, where most of the time there will

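be no events. Let $X \sim \mathrm{ZIP}(p, \lambda)$ be a random variable from the zero-inflated Poisson distribution

with occurrence probability $p$ and rate $\lambda$. Then

$$
P(X = i) =
\begin{cases}
(1 - p) + p\,e^{-\lambda}, & i = 0, \\
p\,\dfrac{\lambda^{i} e^{-\lambda}}{i!}, & i = 1, 2, \ldots
\end{cases}
$$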

Random variables from a ZIP distribution can also be written as a function of two other random

variables. If $X \sim \mathrm{ZIP}(p, \lambda)$, then

$$X = Y \cdot Z,$$

where

$$Y \sim \mathrm{Bern}(p), \qquad Z \sim \mathrm{Pois}(\lambda).$$

(a) Simulate 10,000 iid random variables $X_i \sim \mathrm{ZIP}(0.3, 7)$ and plot a histogram of the

resulting random variables.

(b) Calculate the theoretical probabilities:

$$P(X = i), \quad i = 0, 1, \ldots, 9,$$

and compare them to the Monte Carlo estimates of these probabilities from your simulations.

(c) Estimate $\theta = E(X)$ where $X \sim \mathrm{ZIP}(p, \lambda)$. Use 1000 Monte Carlo samples to estimate $\theta$, and

give a 95% confidence interval for your estimate.
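
The representation of a ZIP variable as a Bernoulli draw times a Poisson draw suggests a direct simulation scheme. The following is a minimal sketch of that idea, not a reference solution. It is written in Python with only the standard library, whereas the assignment itself asks for R. The names `rpois_knuth`, `rzip`, and `zip_pmf` are hypothetical, the seed is arbitrary, and Y and Z are assumed to be independent.

```python
import math
import random

def rpois_knuth(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def rzip(p, lam, rng):
    """Draw one ZIP(p, lam) variate as Y * Z, Y ~ Bern(p), Z ~ Pois(lam).

    Z is only drawn when Y = 1; the product has the same distribution
    as drawing both, assuming Y and Z are independent.
    """
    return rpois_knuth(lam, rng) if rng.random() < p else 0

def zip_pmf(i, p, lam):
    """Theoretical P(X = i) for X ~ ZIP(p, lam)."""
    pois = math.exp(-lam) * lam**i / math.factorial(i)
    return (1 - p) + p * pois if i == 0 else p * pois

rng = random.Random(440)  # arbitrary seed, for reproducibility only
p, lam, n = 0.3, 7.0, 10_000
xs = [rzip(p, lam, rng) for _ in range(n)]

# Monte Carlo estimate of P(X = i) next to the theoretical value.
for i in range(10):
    mc = sum(x == i for x in xs) / n
    print(f"i={i}  theory={zip_pmf(i, p, lam):.4f}  mc={mc:.4f}")

# Estimate E(X) from the first 1000 draws, with a normal-approximation
# 95% confidence interval (estimate +/- 1.96 standard errors).
sub = xs[:1000]
mean = sum(sub) / len(sub)
sd = math.sqrt(sum((x - mean) ** 2 for x in sub) / (len(sub) - 1))
half = 1.96 * sd / math.sqrt(len(sub))
print(f"mean={mean:.3f}  95% CI=({mean - half:.3f}, {mean + half:.3f})")
```

Under these assumptions the estimate of E(X) should land near p · λ = 2.1, which gives a quick sanity check on the sampler.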

2. Use Monte Carlo to estimate the integral

$$\theta = \int_{2}^{4} (3x^{2} - 2x - 10)\,dx.$$

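Perform this calculation $m = 1000$ times each for Monte Carlo sample sizes of $n = 1000$, $n = 10{,}000$,

and $n = 100{,}000$. For each $n$, plot a histogram of $\hat{\theta}^{(n)}_1, \ldots, \hat{\theta}^{(n)}_m$, and calculate the mean squared

error of the estimate,

$$\mathrm{MSE}(n) = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{\theta}^{(n)}_i - \theta_0 \right)^2,$$

where $\hat{\theta}^{(n)}_i$ is the $i$th MC estimate of $\theta$ for sample size $n$, and $\theta_0$ is the true value of the integral $\theta$.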
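
One common way to set this up, sketched below in Python (the assignment itself asks for R), is to read the integral as the integral of 3x² − 2x − 10 over [2, 4] and rewrite it as (4 − 2) · E[g(U)] with U uniform on [2, 4]. The names `g`, `mc_integral`, and `mse` are hypothetical, and the sketch uses smaller m and n than the assignment specifies so that it runs quickly. The true value 24 comes from the antiderivative x³ − x² − 10x evaluated between 2 and 4.

```python
import random

def g(x):
    """Integrand 3x^2 - 2x - 10."""
    return 3 * x**2 - 2 * x - 10

def mc_integral(n, rng, a=2.0, b=4.0):
    """Estimate the integral of g over [a, b] as (b - a) * mean(g(U)), U ~ Unif(a, b)."""
    total = sum(g(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

THETA0 = 24.0  # (64 - 16 - 40) - (8 - 4 - 20)

def mse(n, m, rng):
    """Mean squared error of m independent estimates, each based on n draws."""
    estimates = [mc_integral(n, rng) for _ in range(m)]
    return sum((est - THETA0) ** 2 for est in estimates) / m

rng = random.Random(440)  # arbitrary seed
# Smaller m and n than the assignment asks for, to keep the sketch fast.
for n in (100, 1_000):
    print(f"n={n}  MSE={mse(n, m=200, rng=rng):.4f}")
```

Because the estimator is unbiased, its MSE equals its variance, so the printed MSE should shrink roughly in proportion to 1/n.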

3. In the last HW, you estimated the conditional moment of the standard normal distribution $Z \sim N(0, 1)$,

$$\theta_{\alpha} = E[Z \mid Z > \alpha],$$

using Monte Carlo. Now you will do the same using importance sampling.

(a) First, use the efficient sampler you wrote in your last HW (or use the one in the posted solutions)

to estimate $\theta_{4.5}$ using Monte Carlo with 1,000 MC samples. Report the estimate $\hat{\theta}_{4.5}$, the time it

took to compute the estimator (most of this time will be spent drawing the Z|Z > α), and your

standard error.

(b) Now you will estimate $\theta_{4.5}$ using an importance sampler. First, use a $N(\mu, \sigma^2)$ as your proposal

distribution. Show how to write the expectation $\hat{\theta}_{4.5}$ as an expectation with respect to the $N(\mu, \sigma^2)$

distribution.

(c) Write the density of Z|Z > α in terms of the normal pdf and the normal cdf. Implement this

density as an R function. Use your function to plot this density for enough values of z between -1

and 10 to make the plot look smooth.

(d) Implement an importance sampler using a $N(\mu, \sigma^2)$ as the proposal density. Again, use 1,000

samples. Estimate $\theta_{4.5}$ using your importance sampler for a few different values of $\mu$ and $\sigma^2$, and

calculate the standard error. For the best values of $\mu$ and $\sigma^2$ that you find, plot the corresponding

density on top of your plot of the density of $Z \mid Z > \alpha$, and report $\hat{\theta}_{4.5}$, how long it took to compute

the estimator, and the standard error.

(e) Now try a different proposal density. Write another importance sampler that uses an Exp(λ = α)

proposal density. Run your new importance sampler using 1,000 samples, and report $\hat{\theta}_{4.5}$, how

long it took to compute the estimator, and the standard error.

(f) Make a table that summarizes your three estimators of $\theta_{4.5}$. This table should contain the point

estimate, the running time, and the standard error. Which estimator do you think is the best?
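
To illustrate the importance-sampling idea used throughout this problem, here is a minimal Python sketch (the assignment itself asks for R). It estimates E[Z | Z > 4.5] by averaging x · f(x) / q(x) over proposal draws, where f is the density of Z | Z > α. Two assumptions are worth flagging: the exponential proposal is taken to be shifted so that its support starts at α, which the problem statement does not spell out, and the helper names `norm_pdf`, `norm_sf`, `target_pdf`, and `is_shifted_exp` are hypothetical. Timing is left out of the sketch.

```python
import math
import random

ALPHA = 4.5

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def norm_sf(z):
    """Upper tail probability P(Z > z) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def target_pdf(z, alpha=ALPHA):
    """Density of Z | Z > alpha: normal pdf rescaled by the tail probability."""
    return norm_pdf(z) / norm_sf(alpha) if z > alpha else 0.0

def is_shifted_exp(n, rng, alpha=ALPHA):
    """Importance-sampling estimate of E[Z | Z > alpha] and its standard error.

    Assumption: the proposal is alpha + Exp(rate = alpha), so every draw
    falls in the region where the target density is positive.
    """
    terms = []
    for _ in range(n):
        x = alpha + rng.expovariate(alpha)          # proposal draw
        q = alpha * math.exp(-alpha * (x - alpha))  # proposal density at x
        terms.append(x * target_pdf(x, alpha) / q)  # x times importance weight
    est = sum(terms) / n
    var = sum((t - est) ** 2 for t in terms) / (n - 1)
    return est, math.sqrt(var / n)

est, se = is_shifted_exp(1_000, random.Random(440))  # arbitrary seed
print(f"estimate={est:.4f}  se={se:.4f}")
```

A normal proposal can be handled with the same loop by swapping the draw and the proposal density; draws below α then contribute zero because `target_pdf` returns 0 there.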

4. Consider an iid sample $X_1, \ldots, X_n$ of Bernoulli random variables with success parameter $p$.

(a) Write down the likelihood function $L(p)$ and the log-likelihood $\ell(p)$.

(b) Find the maximum likelihood estimator $\hat{p}$.

(c) Using the asymptotic theory of MLEs, what is the asymptotic distribution of $\hat{p}$?

(d) Instead of using the theory of MLEs, use the CLT to find the asymptotic distribution of $\hat{p}$.

(e) What is the nonasymptotic distribution of $\hat{p}$? (That is, find the exact distribution of $\hat{p}$.)

(f) In this part we want to visualize the sampling distribution of $\hat{p}$. Suppose that $p = 0.95$. For

$n = 10$, $100$, and $1000$, generate 10,000 Monte Carlo samples and for each $n$ construct a histogram

of the resulting $\hat{p}$. Overlay the asymptotic distributions and the exact distributions (that you

derived previously) onto the histograms (note that the exact distribution will look like a step

function).
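
The simulation in part (f) can be prototyped along the following lines. This is a rough Python sketch rather than a solution in the required R, and it assumes the estimator under study is the sample proportion, so that the number of successes follows a Binomial(n, p) distribution. The names `simulate_counts` and `exact_pmf` are hypothetical, and plotting is omitted; the sketch only tabulates Monte Carlo frequencies next to exact probabilities for n = 10.

```python
import math
import random

def simulate_counts(n, p, reps, rng):
    """Draw `reps` success counts, each from n Bernoulli(p) trials."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(reps)]

def exact_pmf(k, n, p):
    """Binomial(n, p) pmf at k, i.e. P(sample proportion = k / n)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

rng = random.Random(440)  # arbitrary seed
n, p, reps = 10, 0.95, 10_000
counts = simulate_counts(n, p, reps, rng)

# Compare Monte Carlo frequencies of the sample proportion with exact values.
for k in range(7, n + 1):
    freq = counts.count(k) / reps
    print(f"p_hat={k / n:.1f}  mc={freq:.4f}  exact={exact_pmf(k, n, p):.4f}")
```

With p this close to 1 and n = 10, most of the mass sits on the two largest values of the sample proportion, which is why a symmetric bell-shaped overlay fits poorly at small n.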