联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Java编程Java编程

日期:2019-10-22 11:14

Lab 2: Monte Carlo integration and importance

sampling

Nial Friel

5 October 2019

Aim of this lab

This practical is focused on implementing Monte Carlo integration and the importance sampling algorithm

for two examples.

Monte Carlo integration and importance sampling

Recall that we are interested to estimate the integral

Using (ordinary) Monte Carlo we can estimate θ as

where x1, . . . , xn are iid draws from f(x).

Similarly, we can estimate θ using importance sampling. Recall that the key idea here is to express the

expectation wrt to f as one wrt an importance density g. So that one may then estimate the quantity of

interest, θ, using Monte Carlo by drawing from g. We can explain this idea neatly as:

where w(x) = f(x)/g(x). We can now produce an importance sampling estimate,

where now x1, . . . , xn are iid draws from g(x).

Note that importance sampling can prove useful in situations where we can’t sample from f, but crucially,

also in situations where sampling from g can lead to a reduced variance estimate of θ.

Here we first write a generic function to carry out importance sampling following the notation above.

importance.sampling <- function(f, phi, g, random.g, n) {

# Aim: Estimate Integral of phi(x)*f(x) dx, ie, E_f phi(X)

# f: density which we wish to sample from (or take expectation wrt)

# phi: function which we take expectation of

# g: density which we can sample from

# rg: random draw from density g.

# n: number of samples drawn from g.

1

y <- random.g(n)

mean(phi(y)*f(y)/g(y))

}

Rejection sampling from a gamma distribution

We now implement this code to simulate from a gamma distribution. First, recall from labs 1 that we can

express Y ∼ Ga(k, λ) with k ∈ R as

Y = X1 + · · · + Xk

where Xi

iid ∼ Exp(λ). We can use this fact to simulate from a Ga(k, λ) distributions with integer-valued

k > 0:

inv.gamma.int <- function(n,k,lambda) {

x <- matrix(inv.exp(n=n*k,lambda=lambda),ncol=k)

apply(x,1,sum)

}

using the inv.exp function which we also developed in lab 1 to simulate from an exponential distribution:

inv.exp <- function(n,lambda){

u <- runif(n)

x <- -log(1-u)/lambda

x

}

The method described above for drawing from a Ga(k, λ) distribution only works for positive integers k.

However, suppose we would like to sample from X ∼ Ga(α, λ), for α, λ > 1, with density function f(x) to

estimate Ef (X).

Instead of sampling from f, we could use importance sampling with a Ga(k, λ − 1) distribution with k = bαc

as an importance function g(x). The symbol bαc means the largest integer less than α. This can be evaluated

in R using floor(α).

One of the key requirements for an importance function g(x) is that it dominates f(x). This means that

g(x) > 0 whenever f(x) = 0. In the context of f(x) being the density of a Ga(α, λ), distribution, for α, λ > 1

and g(x) being the density of a Ga(k, λ−1) distribution with k = bαc, it is easy to show that this requirement

holds.

Since α ≤ k and λ > 1, this implies that w(x) > 0 for the entire range of x > 0.

Let’s now implement importance sampling to estimate E(X) where X ∼ Ga(3.4, 3) using using the importance

sampling strategy described above. We can do this as follows:

f <- function(x) dgamma(x, 3.4, rate=3) # alpha=3.4, lambda=3

phi <- function(x) x

g <- function(x) dgamma(x,3,rate=2) # k=floor(alpha)=3, lambda=2

random.g <- function(n) inv.gamma.int(n,3,2)

y <- importance.sampling(f,phi,g,random.g,1e4)y2## [1] 1.133229

This agrees well with the true value of 3.4/3 = 1.333.

Calculating the tail probabilty of a Cauchy distribution

Recall the following example in the lecture notes:

Estimate the probability θ = P(X > 2) where X follows a Cauchy distribution with density

f(x) = 1π(1 + x2), x ∈ R.

Using our usual strategy we express this as an expectation as follows.

Z ∞2f(x) dx =Z

IA(x)f(x) dx = Ef IA(X),

where IA(x) is the indicator function taking the value 1 if x ∈ A = {x : x > 2}. Therefore we can estimate θ

using (ordinary) Monte Carlo asˆθn(f) = 1n

Xni=1IA(xi),

where x1, . . . , xn are drawn iid from f(x). We implement this as:

mc = mean( rcauchy(1e4)>2 ) # (ordinary) Monte Carlo estimator

mc

## [1] 0.1449

We can implement importance sampling (as we did in lectures) using an importance function derived as

follows. Observe that for large x, f(x) is approximately equal to 1/(πx2). If we normalise this function so

that it is a density function with support on [2, ∞) we get

g(x) = 2x2, x > 2.

As we’ve seen in lectures, we can sample from g(x) using the inversion method:

random.g <- function(n){u <- runif(n)x <- 2/ux}

This now leads to an importance sampling estimator

where x1, . . . , xn are an iid sample drawn from g(x), where xi = 2/ui and ui ∼ U(0, 1), for i = 1, . . . , n. We

are now in a position to implement importance sampling as

f <- function(x) dcauchy(x)

phi <- function(x) x>2

g <- function(x) 2*x^(-2)

isamp <- importance.sampling(f,phi,g,random.g,1e4)

isamp

3

## [1] 0.1476665

This compares well to the true value of θ = P(X > 2) is 0.5 − π

−1

tan 2 = 0.1476. Compare this to the

corresponding estimator above based on (ordinary) Monte Carlo.

Now we’ll explore how both the (ordinary) Monte Carlo and importance sampling estimators improve as the

sample size, n increases.

ns = seq(50,10000,len=995)

is <- rep(NA, 995)

mc <- rep(NA,995)

for(i in 1:995){

is[i] <- importance.sampling(f,phi,g,random.g,ns[i]) # importance sampling estimator

mc[i] <- mean( rcauchy(ns[i])>2 ) # (ordinary) Monte Carlo estimator

}

plot(ns, mc, type='l', col='red', xlab = 'sample size n', ylab='Estimate of P(X>2)', main='Estimate of P(X>2) -- Importance sampling (blue), MC (red)')

lines(ns,is, type='l', col='blue')

0 2000 4000 6000 8000 10000

0.10 0.15 0.20

Estimate of P(X>2) −− Importance sampling (blue), MC (red)

sample size n

Estimate of P(X>2)

4

Exercises

1. Consider again estimating θ = P(X > 2), where X follows a Cauchy density.

• Estimate the standard error of ˆθn(f) and ˆθn(g), (the (ordinary) Monte Carlo and importance sampling

estimators, respectively, of θ) for various values of n. Provide appropriate plots and comments on your

output. [15 marks]

• Provide an approximate 95% confidence interval for θ using (ordinary) Monte Carlo for n = 1, 000

and similarly another approximate 95% confidence interval for θ using importance sampling, again for

n = 1, 000. [5 marks]

2. Here we would like to estimate θ = P(X > 3) where X ∼ N(0, 1). Estimate θ using importance

sampling with an importance density g(x) distributed as N(µ, 1), for µ > 0. You may use n = 10, 000.

Experiment using different values of µ and discuss how this effects the precision the estimate of θ.

[30 marks]

Hand-in date: October 23rd at 12pm

5


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp