
Geostatistics

1 Introduction

Geostatistical data are data that could, in principle, be measured anywhere within

a domain of interest. Examples of such data are:

– Gold grades within a gold mine.

– Particulate Matter (PM) in air samples.

Unlike in a point pattern, the observation locations themselves are not of primary

interest - these are usually fixed.

The interest is in aspects of the variable that have not been measured yet:

– maps of estimated values.

– maps of exceedance probabilities (i.e. regions where the probability of a particular

event is greater/less than a specified amount.)

– estimates of averages/aggregates over regions.

2 Random Fields and Stationarity

In geostatistics, data are assumed to be a partial realisation of a random field

{Z(s) : s ∈ D}

– D is a fixed subset of R².

– The spatial index varies continuously throughout D.

– For a fixed s, Z(s) is a random variable.

– For a fixed realisation of this process, our observations are a function of space:

Z(s1), Z(s2), . . . , Z(sn).

Since we only have one observation of the random field, assuming stationarity of some

sort allows us to have some form of replication and hence we can perform estimation.

A random process Z(·) is second-order stationary if

1. E[Z(s)] = μ for all s ∈ D.

2. Cov(Z(si), Z(sj)) = C(si − sj) for all si, sj ∈ D.

C(·) is called the covariance function. Note that, by the definition of covariance,

Var(Z(si)) = Cov(Z(si), Z(si)) = C(0)

In our notation, h denotes the spatial lag h = si − sj; when only its length matters, h is also used for the distance ||h||.

If C(·) is only a function of the distance between si and sj (and not the direction), then

we say that the process is isotropic.

If C(·) is a function of both the distance and the direction between si and sj, we say that the process is anisotropic.

A random process is intrinsically stationary if

1. E[Z(s)] = μ for all s ∈ D.

2. Var(Z(si) − Z(sj)) = 2γ(si − sj) for all si, sj ∈ D.

2γ(·) is known as the variogram, and γ(·) is known as the semivariogram. The

semivariogram shows how the dissimilarity between Z(si) and Z(si + h) varies with h.

The variogram is closely related to the covariance function. Both are functions of the

spatial lag h.

3 Semivariogram

3.1 Properties of the semivariogram

The semivariogram has the following properties:

– γ(−h) = γ(h)

– γ(0) = 0

– γ(h)/h² → 0 as h → ∞. This implies that the semivariogram can increase to ∞, but not in an uncontrolled fashion.

– γ is conditionally negative definite. In other words, for any set of m real numbers {ai} such that Σ_{i=1}^{m} ai = 0, it holds that

Σ_{i=1}^{m} Σ_{j=1}^{m} ai aj γ(si − sj) ≤ 0

– If the process is isotropic, γ(h) depends only on the distance h = ||h||.

A graph of the semivariogram plotted against separation h conveys information about

continuity and spatial variability of the process (see Figure 1).

If there is no spatial correlation between Z(si) and Z(sj ), the semivariogram will be a

horizontal line.

The shape of the semivariogram near the origin indicates the degree of smoothness of

the variable.


Figure 1: Semivariogram

– The shape reflects the covariance when si and sj are close together.

– A parabolic shape here means that it is a very smooth process.

– A linear shape indicates it is less smooth.

A vertical jump at the origin, i.e. a non-zero nugget, corresponds to high spatial irregularity.

– It means that two observations fairly close together can have very different values.

– A non-zero nugget is a discontinuity and could be due to measurement error.

3.2 Semivariogram and Covariance Function

If Z(·) is second-order stationary, then it can be shown that

γ(h) = C(0) − C(h)
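To see why, expand the variance of a difference under second-order stationarity:

2γ(h) = Var(Z(s + h) − Z(s))
      = Var(Z(s + h)) + Var(Z(s)) − 2Cov(Z(s + h), Z(s))
      = 2C(0) − 2C(h),

and divide by two.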

Further, if C(h) → 0 as h → ∞, then it also holds that γ(h) → C(0).

Note that C(0) is the variance of Z(s), and is also the sill of the semivariogram.

Since the two functions are closely related, why not just use the covariance function?

– The class of intrinsically stationary processes is larger than the class of second-order stationary processes (although only barely).

– Estimation of the semivariogram is more reliable than estimation of the covariance

function as the former does not require estimation of the mean.

3.3 Estimation of Semivariogram

Note that an alternative expression for the semivariogram can be obtained from the following derivation:

2γ(h) = Var(Z(s + h) − Z(s))
      = E[(Z(s + h) − Z(s)) − (μ − μ)]²
      = E[(Z(s + h) − Z(s))²]

Thus one way to estimate this function is to average all squared differences (Z(si) − Z(sj))² for pairs of observations taken the same lag apart, in the same direction.

γ̂(h) = (1 / (2|N(h)|)) Σ_{(si,sj) ∈ N(h)} (Z(si) − Z(sj))²

– N(h) is the set of distinct pairs separated by lag h:

N(h) = {(si, sj) : si − sj = h}

This is known as the classical semivariogram estimator or sample variogram. It can

provide point estimates of γ at observed values of h.
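As a quick illustration, the estimator can be computed by hand. The helper below is a minimal sketch (the function name is made up), binning pairs by distance rather than by exact lag:

# Classical (sample) semivariogram computed directly from its definition.
classical_semivariogram <- function(coords, z, breaks) {
  d    <- as.matrix(dist(coords))     # pairwise distances ||si - sj||
  dz2  <- outer(z, z, "-")^2          # squared differences (Z(si) - Z(sj))^2
  keep <- upper.tri(d)                # use each distinct pair once
  bin  <- cut(d[keep], breaks)        # assign each pair to a lag bin
  data.frame(lag   = levels(bin),
             np    = as.integer(tapply(dz2[keep], bin, length)),
             gamma = as.numeric(tapply(dz2[keep], bin, mean)) / 2)
}

# e.g., with the meuse data used later in these notes:
# classical_semivariogram(coordinates(meuse), log(meuse$zinc), seq(0, 1500, 100))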

The most useful library for geostatistics is gstat.

library(sp)

library(gstat)

library(rgdal)

library(RColorBrewer)

library(classInt)

3.3.1 Empirical Semivariogram Options

The R function variogram uses certain defaults when computing the semivariogram

from data.

For instance, in time series, autocorrelations are seldom computed for lags farther than

half the series length. Similarly, there is little point in computing semivariances for long

distances - it might be that N(h) contains only 1 or 2 points.

Another point to note is this: suppose we wished to compute γ̂(100). It is certainly

possible that we may not have many (or any) pairs of points that are exactly 100 units

apart.

To solve these issues, we can specify two arguments:

1. cutoff:

– the maximum distance up to which point pairs are considered.

– by default, R uses a value that is 1/3 times the largest diagonal of the bounding

box.

2. width:

– the tolerance to allow for the computation of γ̂(h).

– for example, if we specify the width as 10, it means that when we wish to estimate γ̂(100), we consider all point pairs that are between 95 and 105 units apart.

– by default, R considers a width of cutoff/15.

It is possible to estimate the variogram in different directions in order to investigate

anisotropy (see Figure 2).

Figure 2: Estimating Sample Semivariogram Options

As a rule of thumb, try to ensure that the semivariogram is computed at 10 to 20 lags,

and that there are at least 30 pairs of points at each of those lags.

3.3.2 Meuse Data set

Recall the meuse river bank data set, which contained measurements of the metal

concentration along the Meuse river in the Netherlands.


Figure 3 contains a plot of the log zinc measurements. It is clear that zinc concentration

is larger close to the river banks.

data(meuse)

coordinates(meuse) <- ~x+y

meuse$lzn <- log(meuse$zinc)

spplot(meuse, "lzn", main="Log-Zinc Measurements",

col.regions = heat.colors(4))

Figure 3: Meuse River Log-Zn

One way to check on the strength of spatial correlation at different lags is to make scatter plots of pairs Z(si) and Z(sj) grouped according to their separation distance h = ||si − sj||.

– These are called h-scatterplots.

– The main input is the set of distance classes to be used.

– The output of this code can be seen in Figure 4.


hscat(log(zinc) ~ 1, meuse, (0:9)*100)

Figure 4: h-scatterplots (lag classes (0,100] to (800,900]; the correlation falls from r = 0.722 at the shortest lags to roughly zero beyond 600)

Finally, we can plot the sample semivariogram estimator for this dataset (Figure 5),

using default argument values.

vgm_log_zinc <- variogram(log(zinc) ~ 1, meuse)

plot(vgm_log_zinc)

Here is an example of how we can use R to investigate anisotropy (see Figure 6 for

output). It computes the sample variogram in 4 directions: N, NE, E and SE.

Figure 5: Sample semivariogram for Meuse

plot(variogram(log(zinc) ~ 1, meuse, alpha=c(0, 45, 90, 135)))

Figure 6: Investigating Anisotropy (panels for directions 0, 45, 90 and 135 degrees)

3.3.3 Smoky Mountain pH

This dataset contains the pH measurements at several locations within the Great Smoky

Mountains. First, we shall make a plot of the data (Figure 7).

smoky_data <- read.table("../data/smoky.dat", header=TRUE, skip=1)

coordinates(smoky_data) <- ~easting + northing

smok <- readOGR("../data/smoky", "smokypoly")

## OGR data source with driver: ESRI Shapefile

## Source: "../data/smoky", layer: "smokypoly"

## with 1 features

## It has 1 fields


pal <- brewer.pal(4, "RdYlGn")

q4 <- classIntervals(smoky_data$ph, n=4, style="fisher")

spplot(smoky_data, "ph", cuts=q4$brks , col.regions=pal, colorkey=TRUE,

sp.layout=list("sp.polygons", smok), scales=list(draw=TRUE))

Figure 7: Smoky Mountain Data

Next, we can plot the sample variogram estimator (see Figure 8). The variogram object

will contain information on the exact values for h used, along with the number of points

used to compute the estimate there.

vgm_ph <- variogram(ph ~ 1, smoky_data, cutoff=120, width=10)

plot(vgm_ph)

head(vgm_ph, n=2)

## np dist gamma dir.hor dir.ver id

## 1 54 6.453213 0.06594352 0 0 var1

## 2 213 15.010015 0.12051268 0 0 var1

Figure 8: Smoky Mountain Semivariogram

3.4 Parametric Semivariogram Models (Isotropic)

The classical semivariogram estimator is a good first step in studying the data. Unfortunately, if we use it directly for prediction, it can return negative estimates of variance, because it need not be conditionally negative definite. Hence we turn to parametric models that do not have this failing, and try to fit one of them to our data.

We say that a semivariogram model is valid in dimension d if it satisfies the conditional

negative definite property. Here are some commonly used models that are fit to data.

1. Spherical:

γ(h|θ) =
  0                                      , h = 0
  c0 + cs[(3/2)(h/as) − (1/2)(h/as)³]    , 0 < h ≤ as
  c0 + cs                                , h > as

valid in R¹, R² and R³.

nearly linear at the origin.

c0 is the nugget, cs is the partial sill and as is the range.

2. Exponential:

γ(h|θ) =
  0                        , h = 0
  c0 + ce[1 − exp(−h/ae)]  , h > 0

valid in R^d for d ≥ 1.

c0 is the nugget and ce is the partial sill.

This semivariogram approaches the sill asymptotically.

The effective range is 3ae; this is the distance at which the autocorrelation drops to 0.05.

3. Gaussian:

γ(h|θ) =
  0                            , h = 0
  c0 + cg[1 − exp(−(h/ag)²)]   , h > 0

valid in R^d for d ≥ 1.

This function is parabolic near the origin.

Use this only if you have a lot of closely spaced data, which will allow you to estimate the function near h = 0 accurately.

c0 is the nugget and cg is the partial sill.

The effective range is √3 ag.

4. Power:

γ(h|θ) =
  0            , h = 0
  c0 + b h^p   , h > 0

valid in R^d for d ≥ 1.

These models have neither a sill nor a range.

They imply that the spatial autocorrelation does not level off, even for large lags.

5. Stable:

γ(h|θ) =
  0                              , h = 0
  c0 + ct[1 − exp(−(h/at)^α)]    , h > 0

valid in R^d for d ≥ 1.

They behave similarly to the Power family near the origin, but these functions do reach a sill.

c0 is the nugget, ct is the partial sill, and the effective range depends on at and α.

6. Matern:

γ(h|θ) =
  0                                                    , h = 0
  c0 + ck[1 − (2^(1−α) / Γ(α)) (h/ak)^α K_α(h/ak)]     , h > 0

where K_α is the modified Bessel function of the second kind; valid in R^d for d ≥ 1.

c0 is the nugget and ck is the partial sill.

The Gaussian semivariogram is obtained by letting α → ∞.

α determines the behaviour near the origin.

Note that the sum of two valid semivariograms is also a valid semivariogram.

A semivariogram that is valid in R^d2 is also valid in R^d1 when d2 > d1.

Figure 9 contains plots of several different theoretical semivariograms.
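Curves of this kind can also be drawn with gstat's variogramLine function; the models and parameter values in this minimal sketch are arbitrary choices for illustration.

models <- list(Spherical   = vgm(psill=1, model="Sph", range=3),
               Exponential = vgm(psill=1, model="Exp", range=1),
               Gaussian    = vgm(psill=1, model="Gau", range=1))

# variogramLine evaluates gamma(h) on a grid of distances up to maxdist
curves <- lapply(models, variogramLine, maxdist=5)

plot(curves$Spherical, type="l", ylim=c(0, 1.2),
     xlab="distance", ylab="semivariance")
lines(curves$Exponential, lty=2)
lines(curves$Gaussian, lty=3)
legend("bottomright", legend=names(models), lty=1:3)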

3.5 Fitting a Semivariogram

A semivariogram can be used for spatial prediction or simulation.

However, using values from the sample semivariogram for these purposes could yield

negative prediction errors.

Hence we typically fit one of the parametric models in the previous section, and then

use that.

For now, we continue to assume that our process is second-order stationary (constant

mean).

Semivariogram models in gstat are created using the vgm function.

– In gstat, semivariograms are defined through their partial sill and effective range.

– Some translation is necessary to match the variograms we encountered earlier to

those stored in gstat.


Figure 9: Theoretical Semivariograms

We can get an overview of all the available semivariogram types with the following command (see Figure 10):

show.vgms(par.strip.text=list(cex=0.6))

Figure 10: Show Semivariograms (one panel for each model: Nug, Exp, Sph, Gau, Exc, Mat, Ste, Cir, Lin, Bes, Pen, Per, Wav, Hol, Log, Pow, Spl)

In order to get a full list of variogram names used in gstat, we can use the following

command:

head(vgm())

## short long

## 1 Nug Nug (nugget)

## 2 Exp Exp (exponential)

## 3 Sph Sph (spherical)

## 4 Gau Gau (gaussian)

## 5 Exc Exclass (Exponential class/stable)

## 6 Mat Mat (Matern)

Suppose we wish to create a Spherical variogram with parameters c0 = 1, cs = 2 and as = 10. Then we would run the following (see Figure 11 for output):

sph1 <- vgm(psill=2, model='Sph', range=10, nugget=1)

class(sph1)

## [1] "variogramModel" "data.frame"

sph1

## model psill range

## 1 Nug 1 0

## 2 Sph 2 10

show.vgms(sill=2, range=10, models='Sph', nugget=1, max=13)

Figure 11: Show a particular semivariogram

In order to create a Gaussian semivariogram with similar parameters c0 = 1, cg = 2 and range 10, we would run the following:

gauss1 <- vgm(psill=2, model='Gau', range=10, nugget=1)

gauss1

## model psill range

## 1 Nug 1 0

## 2 Gau 2 10

Just remember that in each vgm model, there is

– a type, e.g. Nug, Sph, etc.

– a vertical component (psill)

– a horizontal component (range or effective range)

– Sometimes, when a model has no sill or range, these slots are used for other parameters instead.

Semivariogram models can be added together. Consider adding the spherical and

gaussian semivariograms above, and then plotting them. See Figure 12 for the output.

gauss2 <- vgm(psill=2, model='Gau', range=10, nugget=0, add.to=sph1)

out <- variogramLine(gauss2, maxdist=15)

plot(out, type='l', main="Sum of Two Variograms")

In order to fit a semivariogram to the data:

1. Choose a suitable model.

2. Choose suitable initial values for the parameters.

3. Fit the model using one of the fitting criteria.

3.5.1 Fitting Using Non-Linear Regression

The aim is to find the theoretical model closest to the data according to the following metric:

Σ_{j=1}^{m} wj (γ(hj|θ) − γ̂(hj))²

where h1, . . . , hm are the lags at which the sample semivariogram was computed.

The vector of parameters θ that minimises this value would correspond to the chosen

semivariogram. For instance, if we wished to fit a spherical model, we would need

to find the combination of c0, cs and as that minimises this weighted sum of squared

errors.

Figure 12: Result of Adding Two Semivariograms

To be precise, the sample semivariogram values at different lags h are not independent. The above formulation is in fact appropriate only when the errors are independent. The correct formulation, using generalised least squares, is computationally expensive.

It has been shown that using wj = Nj/hj² provides a good compromise between computational effort and accuracy. With these weights,

– lags with many observation pairs are given more weight.

– lags at larger distances are given less weight.

The value of the cost function at the minimising θ is referred to as the Weighted Regression Sum of Squares (WRSS).
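In gstat, the weights are chosen through the fit.method argument of fit.variogram; the codes below are taken from gstat's documentation (worth double-checking there), with the default corresponding to the Nj/hj² weights above.

fit_wls <- fit.variogram(vgm_log_zinc, vgm(1, "Sph", 800, 1), fit.method = 7)  # Nj/hj^2 (default)
fit_ols <- fit.variogram(vgm_log_zinc, vgm(1, "Sph", 800, 1), fit.method = 6)  # unweighted (OLS)
attr(fit_wls, "SSErr")  # the weighted residual sum of squares at the optimum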

3.5.2 Fitting Using Likelihood Function

If we know that the data follows a multivariate Gaussian distribution, then we can use

the MLE to estimate the parameters.

gstat does not have this approach; only the geoR package has it.

Here is how a variogram model can be fit using the gstat package. The SSErr attribute

can be used to choose between competing theoretical variogram families.

fit_v <- fit.variogram(vgm_log_zinc, vgm(1, "Sph", 800, 1))

fit_v

## model psill range

## 1 Nug 0.05065923 0.0000

## 2 Sph 0.59060463 896.9976

attr(fit_v, "SSErr")

## [1] 9.011194e-06

plot(vgm_log_zinc, model=fit_v)

4 Spatial Prediction

Spatial prediction refers to the prediction of unknown quantities Z(s0), based on sample

data Z(si) and assumptions regarding the form of the trend of Z, its variance, and its

spatial correlation.

Figure 13: Fitting a Spherical Semivariogram

4.1 Inverse Distance Weighted (IDW) Interpolation.

IDW interpolation does not use a semivariogram. It computes a weighted average

Ẑ(s0) = Σ_{i=1}^{n} wi Z(si) / Σ_{i=1}^{n} wi

The weights are given by wi = ||si − s0||^(−p).

– points further away from s0 are given less importance (weight).

– p determines the degree to which nearer points are preferred.

As p increases, the contribution to the interpolated value from far-away points decreases.

Typical values of p are between 1 and 3.

Here are some disadvantages of IDW:

– What value of p should we use?

– It ignores spatial configuration and only uses distances.

– It cannot yield prediction errors.

– The normalised weights wi / Σ wj are between 0 and 1. This implies that interpolated values are always within the range of observed values, so IDW can never predict a peak or trough beyond the data.
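To make the formula concrete, here is a minimal hand-rolled sketch of the IDW predictor (the function name is made up; in practice we use gstat::idw, as in the next subsection):

idw_predict <- function(s0, coords, z, p = 2) {
  # Euclidean distances from every observation to the prediction location s0
  d <- sqrt((coords[, 1] - s0[1])^2 + (coords[, 2] - s0[2])^2)
  if (any(d == 0)) return(z[which(d == 0)[1]])   # exact hit: return that observation
  w <- d^(-p)                                    # weights wi = ||si - s0||^(-p)
  sum(w * z) / sum(w)                            # normalised weighted average
}

# e.g. idw_predict(c(179000, 330000), coordinates(meuse), log(meuse$zinc), p = 2.5)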

4.1.1 IDW using Meuse data set

Consider the Meuse data set. Refer to Figures 14 and 15 for the output.

data(meuse.grid)

coordinates(meuse.grid) <- ~x+y

meuse.grid <- as(meuse.grid, "SpatialPixelsDataFrame")

idw.out <- gstat::idw(log(zinc) ~ 1, meuse, meuse.grid, idp=2.5)

## [inverse distance weighted interpolation]

plot(idw.out, col=heat.colors(100))

head(as.data.frame(idw.out), n=2)

## x y var1.pred var1.var

## 1 181180 333740 6.390860 NA

## 2 181140 333700 6.561591 NA

Recall that the original point measurements appeared as follows:

Figure 14: IDW using Meuse Data set

spplot(meuse, "lzn", main="Log-Zinc Measurements", col.regions = heat.colors(4),

key.space="right")

Figure 15: Meuse River Log-Zn

4.2 Kriging

Kriging is a geostatistical technique for optimal spatial prediction.

There are many different types of kriging, depending on the underlying assumptions.

4.2.1 Ordinary Kriging (OK)

Let us consider a situation where:

– Z(·) is intrinsically stationary.

– The semivariogram γ(·) is known.

– E[Z(s)] = μ, where μ is unknown.

– We have observed data Z(s1), . . . , Z(sn) and wish to predict the value of Z(s0).


This situation is known as Ordinary Kriging (OK).

Assume that the OK predictor is a weighted average of the data:

Ẑ(s0) = Σ_{i=1}^{n} λi Z(si)

We have to solve for the λi's that give us some optimality:

– We want to ensure that our estimator is unbiased, i.e. that E[Ẑ(s0)] = μ.

– We want to minimise the Mean Squared Prediction Error (MSPE): E[Ẑ(s0) − Z(s0)]²

Using the method of Lagrange multipliers, the objective function is

E[(Σ_{i=1}^{n} λi Z(si) − Z(s0))²] − 2m(Σ_{i=1}^{n} λi − 1)

We need to find the λ1, λ2, . . . , λn and m that minimise the above expression.

In terms of the semivariogram, this can be rewritten as

− Σ_{i=1}^{n} Σ_{j=1}^{n} λi λj γ(si − sj) + 2 Σ_{i=1}^{n} λi γ(s0 − si) − 2m(Σ_{i=1}^{n} λi − 1)

We then take derivatives, set them to 0, and find that we need to solve the following set of equations:

Σ_{j=1}^{n} λj γ(si − sj) + m = γ(s0 − si),   i = 1, . . . , n

Σ_{i=1}^{n} λi = 1

If we define the following vectors (with Γ0 the matrix of semivariogram values γ(si − sj), bordered by a final row and column of 1's and a 0 in the bottom-right corner),

λ0 = (λ1, λ2, . . . , λn, m)ᵀ

γ0 = (γ(s0 − s1), γ(s0 − s2), . . . , γ(s0 − sn), 1)ᵀ

then it is simpler to write the solution in matrix form:

λ0 = Γ0⁻¹ γ0
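As a quick illustration, the system can be solved directly for a toy configuration; the semivariogram, locations and prediction point below are arbitrary choices.

# Ordinary kriging weights for 4 observations and one target location, using
# an exponential semivariogram with nugget 0.1, partial sill 1 and range 2.
sv <- function(h) ifelse(h == 0, 0, 0.1 + 1 * (1 - exp(-h / 2)))

obs <- cbind(x = c(0, 1, 0, 1), y = c(0, 0, 1, 1))   # observation locations
s0  <- c(0.4, 0.6)                                   # prediction location

D      <- as.matrix(dist(obs))                       # pairwise distances
Gamma0 <- rbind(cbind(sv(D), 1), c(rep(1, 4), 0))    # bordered system matrix
gamma0 <- c(sv(sqrt(colSums((t(obs) - s0)^2))), 1)   # right-hand side

lambda0 <- solve(Gamma0, gamma0)                     # (lambda_1, ..., lambda_4, m)
weights <- lambda0[1:4]
kriging_variance <- sum(lambda0 * gamma0)            # lambda0' gamma0 (see below)
weights
kriging_variance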

The kriging variance (or minimised MSPE) is

σ²(s0) = 2 Σ_{i=1}^{n} λi γ(s0 − si) − Σ_{i=1}^{n} Σ_{j=1}^{n} λi λj γ(si − sj)
       = Σ_{i=1}^{n} λi γ(s0 − si) + m
       = λ0ᵀ γ0

This is the prediction error associated with the prediction at location s0. Let’s take

a look at an example with the meuse data set (see Figure 16). Compare this plot to

Figure 14, which was created with the IDW method.

lz.ok <- krige(log(zinc)~1, meuse, meuse.grid, fit_v)

## [using ordinary kriging]

plot(lz.ok, col=heat.colors(100))

With kriging, we can also get an indication of the prediction error at each location.

Notice in Figure 17 that there are some regions that are very dark, indicating that the prediction error at the observed locations is in fact 0.

This is something we should be aware of. Kriging in the presence of a non-zero nugget

effect introduces a discontinuity in the prediction surface. Also notice how errors are larger in regions that have fewer points.

plot(lz.ok['var1.var'])

plot(meuse, add=TRUE, col='grey50')

The kriging estimate is referred to as the BLUP (Best Linear Unbiased Predictor), or the EBLUP when the semivariogram is first estimated from the data and then plugged in.

4.2.2 Filtering

Kriging results in spatially smooth interpolators.

If there is a non-zero nugget, the predictions are discontinuous in the measurement

points.

Figure 16: Meuse Ordinary Kriging

Figure 17: Ordinary Kriging Variance

At measurement locations, kriging predicts the measurement value with a zero prediction error variance. Even a small shift away from such a location can give a markedly different prediction.

In order to rectify this, if we believe the process is indeed smooth, we can conceive the process Z(s) as a sum of an underlying smooth process U(s) and a measurement error ε(s). We then focus on kriging U(s).

Here is how to instruct R that the nugget should be treated as pure measurement error.

err.fit <- fit.variogram(vgm_log_zinc, vgm(1, "Sph", 800, Err=1))

krige(log(zinc) ~ 1, meuse, meuse[1,], err.fit)

## [using ordinary kriging]

## coordinates var1.pred var1.var

## 1 (181072, 333611) 6.884405 0.03648707

krige(log(zinc) ~ 1, meuse, meuse[1,], fit_v)

## [using ordinary kriging]

## coordinates var1.pred var1.var

## 1 (181072, 333611) 6.929517 -1.110223e-16

4.2.3 An Aside: Independent Errors, Weighted Least Squares and Generalised Least Squares

It will be useful to get an overview of linear regression techniques before proceeding.

Suppose we have observed a vector Y of n responses.

In addition, suppose we have observed a set of p covariates for each response, which we

store in an n × (p + 1) matrix called X.

If our observations are independent, we assume the relation between X and Y takes

the form:

Y = Xβ + ε

where β is a vector of length p + 1, and ε is a multivariate Normal random vector with mean 0 and variance σ²I.

It can be derived that the likelihood function is given by

l(β, σ²) = (2πσ²)^(−n/2) exp[−(Y − Xβ)ᵀ(Y − Xβ) / (2σ²)]

From here, we can derive the MLE of β to be

β̂ = (XᵀX)⁻¹ XᵀY

This is exactly the same estimator as the Ordinary Least Squares (OLS) estimator.

In a spatial setting, the measurements at the n locations are not independent. We cannot assume that the variance of ε is σ²I. Instead, we assume that

Var(ε) = Σ = σ²V

where Σ is a symmetric n × n matrix containing the covariances between all pairs of locations.

The general idea is that the Xβ term in the model explains the large-scale variation,

while the correlations defined in Σ explain the small-scale variation. Suppose first that

we know what Σ is.

Then, based on the new likelihood function, the MLE of β is

β̂GLS = (XᵀV⁻¹X)⁻¹ XᵀV⁻¹Y

This is known as the Generalised Least Squares estimate.

In general however, the matrix V is unknown to us. It contains n-choose-2 distinct

terms, which is a lot of parameters to be estimated.

However, if we represent the covariances with the semivariogram instead, then we only

need to estimate a smaller number of parameters, e.g. the sill, range, etc.

Thus, we now have two sets of parameters to estimate, β and θ, since Σ is in fact a function of θ. There are two common methods for estimating the unknown parameters: likelihood-based ones, and iteratively re-weighted generalised least squares (IRWGLS).

For likelihood-based estimation, the likelihood has to be maximised numerically. This approach, however, has not been implemented in gstat, so we rely on IRWGLS. It consists of the following steps:

1. Obtain a starting estimate of β, say β̂.

2. Compute the residuals r = Y − Xβ̂.

3. Estimate and model the semivariogram of the residuals using the techniques described earlier. This yields γ(h|θ̂), which in turn yields Σ(θ̂).

4. Obtain a new estimate of β using

β̂ = (XᵀΣ(θ̂)⁻¹X)⁻¹ XᵀΣ(θ̂)⁻¹Y

5. Repeat steps 2 - 4 until the estimates of β and θ do not change much, e.g. the change is less than 10⁻⁵.
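A rough sketch of this loop is given below, for illustration only; in practice we use gstat's fit.variogram.gls (as in the next subsection), and the use of variogramLine's dist_vector argument is an assumption worth checking against gstat's documentation. It assumes the meuse data have been loaded with coordinates set, as above.

X    <- model.matrix(~ sqrt(dist), data = as.data.frame(meuse))
y    <- log(meuse$zinc)
beta <- solve(t(X) %*% X, t(X) %*% y)               # 1. OLS starting value

for (k in 1:10) {
  meuse$res <- as.numeric(y - X %*% beta)           # 2. residuals
  v.emp <- variogram(res ~ 1, meuse)                # 3. residual semivariogram
  v.fit <- fit.variogram(v.emp, vgm(1, "Exp", 300, 1))
  sill  <- sum(v.fit$psill)
  D     <- spDists(meuse)                           # pairwise distances
  g     <- variogramLine(v.fit, maxdist = max(D), dist_vector = as.vector(D))$gamma
  Sigma <- sill - matrix(g, nrow(D))                # C(h) = sill - gamma(h)
  diag(Sigma) <- sill                               # variance on the diagonal
  beta.new <- solve(t(X) %*% solve(Sigma, X),
                    t(X) %*% solve(Sigma, y))       # 4. GLS update
  if (max(abs(beta.new - beta)) < 1e-5) break       # 5. stop when estimates stabilise
  beta <- beta.new
}
beta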


4.2.4 Universal Kriging

Universal Kriging refers to linear prediction with nonstationary mean structure.

We assume that

Z(s) = Σ_{j=0}^{p} Xj(s) βj + e(s) = Xβ + e(s)

– The Xj (s) are known as spatial regressors

– The βj are unknown regression coefficients; the model usually contains an intercept, i.e. X0(s) = 1.

– The Xj (s) could include the spatial coordinates as well.

– The residuals e(s) are to be modelled using a semivariogram.

Now suppose we wish to estimate Z(s0), and we know the semivariogram γ(·) for e(s). The universal kriging predictor is

Ẑ(s0) = x(s0)β̂ + vᵀΣ⁻¹(Z(s) − Xβ̂)

– x(s0) contains the predictor values for s0.

– Σ is the covariance matrix of the Z(si) (obtained from γ(·)).

– v is the vector of covariances between Z(s0) and the Z(si) (also from γ(·)).

– Note that

β̂ = (XᵀΣ⁻¹X)⁻¹ XᵀΣ⁻¹ Z(s)

We can view the prediction as

– x(s0)β̂: the estimated mean for location s0.

– vᵀΣ⁻¹(Z(s) − Xβ̂): a weighted mean of the residuals.

The prediction variance is given by

σ²(s0) = C(0) − vᵀΣ⁻¹v + (x(s0) − XᵀΣ⁻¹v)ᵀ(XᵀΣ⁻¹X)⁻¹(x(s0) − XᵀΣ⁻¹v)

We can use the IRWGLS method from Section 4.2.3 to estimate the semivariogram.

Here, we use a subset of the meuse observations to estimate the variogram in order to save time. See Figure 18 for the output.

vt.fit2 <- fit.variogram.gls(log(zinc) ~ sqrt(dist), data=meuse[1:40,],
                             model=vgm(1, "Exp", 300, 1), trace=FALSE)

lz.uk.gls <- krige(log(zinc)~sqrt(dist), meuse, meuse.grid, vt.fit2)

## [using universal kriging]


plot(lz.uk.gls, col=heat.colors(100))

Figure 18: Universal Kriging

4.2.5 Block Kriging

No measurements can ever be taken on something of size 0. However, by default we treat observations as coming from a point location. This is known as point kriging. In contrast, block kriging predicts averages over larger areas or volumes.

This is a change of support problem. It occurs when predictions are made for a larger physical support based on small-support observations.

Ordinary block kriging can be carried out for blocks of size 40 x 40 in the following

way:

lz.blk <- krige(log(zinc) ~ 1, meuse, meuse.grid, fit_v, block=c(40,40))

## [using ordinary kriging]


There are other ways in which the block can be specified:

1. For irregular but equal blocks, by specifying points that discretise the irregular

form.

2. For blocks of varying size, by passing an object of class SpatialPolygons to the

newdata argument.

Please refer to the textbook section 8.5.6 for examples on this.
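As a rough illustration of option 1, the block argument can also be given a data frame of points that discretise the block relative to (0, 0) (an assumption based on gstat's documentation); here, a circular block of radius 20 around each prediction location:

xy   <- expand.grid(x = seq(-20, 20, by = 5), y = seq(-20, 20, by = 5))
disc <- xy[sqrt(xy$x^2 + xy$y^2) <= 20, ]   # keep only points inside the circle

lz.circ <- krige(log(zinc) ~ 1, meuse, meuse.grid, fit_v, block = disc)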

4.2.6 Other Types of Kriging.

There are many variants of kriging. Some of them are shown in the diagram in Figure

19.

Figure 19: Kriging Flowchart

