Geostatistics
1 Introduction
Geostatistical data are data that could, in principle, be measured anywhere within
a domain of interest. Examples of such data are:
– Gold grades within a gold mine.
– Particulate Matter (PM) in air samples.
Unlike in a point pattern, the observation locations themselves are not of primary
interest - these are usually fixed.
The interest is in aspects of the variable that have not been measured yet:
– maps of estimated values.
– maps of exceedance probabilities (i.e. regions where the probability of a particular
event is greater/less than a specified amount.)
– estimates of averages/aggregates over regions.
2 Random Fields and Stationarity
In geostatistics, data are assumed to be a partial realisation of a random field
{Z(s) : s ∈ D}
– D is a fixed subset of R
2
– The spatial index varies continuously throughout D.
– For a fixed s, Z(s) is a random variable.
– For a fixed realisation of this process, our observations are a function of space:
Z(s1), Z(s2), . . . , Z(sn).
Since we only have one observation of the random field, assuming stationarity of some
sort allows us to have some form of replication and hence we can perform estimation.
A random process Z(·) is second-order stationary if
1. E[Z(s)] = μ for all s ∈ D.
2. Cov(Z(si), Z(sj )) = C(si ? sj ) for all si
, sj ∈ D.
C(·) is called the covariance function. Note that by definition of what a covariance is,
V ar(Z(si)) = Cov(Z(si), Z(sj )) = C(0)
1
In our notation, we use h to denote the spatial lag h = si ? sj and h to denote the
distance ||h||.
If C(·) is only a function of the distance between si and sj (and not the direction), then
we say that the process is isotropic.
If C(·) is a function of the distance and direction between si and sj
, we say that the
process is anisotropic.
A random process is intrinsically stationary if
1. E[Z(s)] = μ for all s ∈ D.
2. V ar(Z(si) ? Z(sj )) = 2γ(si ? sj ) for all si
, sj ∈ D.
2γ(·) is known as the variogram, and γ(·) is known as the semivariogram. The
semivariogram shows how the dissimilarity between Z(si) and Z(si + h) varies with h.
The variogram is closely related to the covariance function. Both are functions of the
spatial lag h.
3 Semivariogram
3.1 Properties of the semivariogram
The semivariogram has the following properties:
– γ(?h) = γ(h)
– γ(0) = 0
– γ(h)/h2 → 0 as h → ∞. This implies that that the semivariogram can increase to
∞, but not in an uncontrolled fashion.
– γ is condionally non-negative definite. In other words, for any set of m real
numbers {ai} such that Pm
i=1 ai = 1, it holds that
Xm
i=1
Xm
j=1
aiajγ(si ? sj ) ≤ 0
– If the process is isotropic, γ(h) = γ(h).
A graph of the semivariogram plotted against separation h conveys information about
continuity and spatial variability of the process (see Figure 1).
If there is no spatial correlation between Z(si) and Z(sj ), the semivariogram will be a
horizontal line.
The shape of the semivariogram near the origin indicates the degree of smoothness of
the variable.
2
Figure 1: Semivariogram
– The shape reflects the covariance when si and sj are close together.
– A parabolic shape here means that it is a very smooth process.
– A linear shape indicates it is less smooth.
A vertical jump, or a non-zero nugget corresponds to high spatial irregularity.
– It means that two observations fairly close together can have very different values.
– A non-zero nugget is a discontinuity and could be due to measurement error.
3.2 Semivariogram and Covariance Function
If Z(·) is second-order stationary, then it can be shown that
γ(h) = C(0) C(h)
Further, if C(h) → 0 as h → ∞, then it also holds that γ(h) → C(0).
Note that C(0) is the variance of Z(s), and is also the sill of the semivariogram.
Since the two functions are closely related, why not just use the covariance function?
– The class of intrinsically stationary processes is larger than the class of second-order
stationary process (although only barely).
– Estimation of the semivariogram is more reliable than estimation of the covariance
function as the former does not require estimation of the mean.
3.3 Estimation of Semivariogram
Note that an alternative expression for the semivariogram can be obtained from the
following derviation:
3
2γ(h) = V ar(Z(s + h) ? Z(s))
= E[(Z(s + h) ? Z(s)) ? (μ ? μ)]2
= E[(Z(s + h) ? Z(s))2
]
Thus one way to estimate this function is to average all squared differences (Z(si) ?
Z(sj ))2
for pairs of observations taken the same lag apart, in the same direction.
γ?(h) = 1
2|N(h)|
X
N(h)
(Z(si) ? Z(sj ))2
– N(h) is the set of distinct pairs separated by lag h:
N(h) = {(si
, sj ) : si ? sj = h}
This is known as the classical semivariogram estimator or sample variogram. It can
provide point estimates of γ at observed values of h.
The most useful library for geostatistics is gstat.
library(sp)
library(gstat)
library(rgdal)
library(RColorBrewer)
library(classInt)
3.3.1 Empirical Semivariogram Options
The R function variogram uses certain defaults when computing the semivariogram
from data.
For instance, in time series, autocorrelations are seldom computed for lags farther than
half the series length. Similary, there is little point in computing semivariances for long
distances - it might be that N(h) contains only 1 or 2 points.
Another point to note is this: Suppose we wished to compute γ?(100). It is certainly
possible that we may not have many (or any) pairs of points that are exactly 100 units
apart.
To solve these issues, we can specify two arguments:
1. cutoff:
– the maximum distance up to which point pairs are considered.
– by default, R uses a value that is 1/3 times the largest diagonal of the bounding
box.
4
2. width:
– the tolerance to allow for the computation of γ?(h).
– for example, if we specify the width as 10, it means that when we wish to
estimate γ?(100), we consider all point pairs that are between 95 and 105 units
apart.
– by default, R considers a width of cutoff/15.
It is possible to estimate the variogram in different directions in order to investigate
anisotropy (see Figure 2).
Figure 2: Estimating Sample Semivariogram Options
As a rule of thumb, try to ensure that the semivariogram is computed at 10 to 20 lags,
and that there are at least 30 pairs of points at each of those lags.
3.3.2 Meuse Data set
Recall the meuse river bank data set, which contained measurements of the metal
concentration along the Meuse river in the Netherlands.
5
Figure 3 contains a plot of the log zinc measurements. It is clear that zinc concentration
is larger close to the river banks.
data(meuse)
coordinates(meuse) <- ~x+y
meuse$lzn <- log(meuse$zinc)
spplot(meuse, "lzn", main="Log-Zinc Measurements",
col.regions = heat.colors(4))
LogZinc Measurements
[4.727,5.285]
(5.285,5.843]
(5.843,6.401]
(6.401,6.959]
(6.959,7.517]
Figure 3: Meuse River Log-Zn
One way to check on the strength of spatial correlation at different lags is to make
scatter plots of pairs Z(si) and Z(sj ) grouped according to their separation distance
h = ||si sj
||.
– These are called h-scatterplots.
– The main input are the distance classes to be used.
– The output of this code can be seen in Figure 4.
6
hscat(log(zinc) ~ 1, meuse, (0:9)*100)
lagged scatterplots
log(zinc)
log(zinc)
5.0
5.5
6.0
6.5
7.0
7.5 r = 0.722
(0,100]
5.0 5.5 6.0 6.5 7.0 7.5
r = 0.595
(100,200]
r = 0.392
(200,300]
r = 0.257
(300,400]
r = 0.18
(400,500]
5.0
5.5
6.0
6.5
7.0
r = 0.0962 7.5
(500,600]
5.0
5.5
6.0
6.5
7.0
7.5
5.0 5.5 6.0 6.5 7.0 7.5
r = 0.0177
(600,700]
r = ?0.0604
(700,800]
5.0 5.5 6.0 6.5 7.0 7.5
r = 0.0797
(800,900]
Figure 4: h-scatterplots
Finally, we can plot the sample semivariogram estimator for this dataset (Figure 5),
using default argument values.
vgm_log_zinc <- variogram(log(zinc) ~ 1, meuse)
plot(vgm_log_zinc)
Here is an example of how we can use R to investigate anisotropy (see Figure 6 for
output). It computes the sample variogram in 4 directions: N, NE, E and SE directions.
7
distance
semivariance
0.2
0.4
0.6
500 1000 1500
Figure 5: Sample semivariogram for Meuse
8
plot(variogram(log(zinc) ~ 1, meuse, alpha=c(0, 45, 90, 135)))
distance
semivariance
0.2
0.4
0.6
0.8
1.0
500 1000 1500
0 45
90
500 1000 1500
0.2
0.4
0.6
0.8
1.0
135
Figure 6: Investigating Anisotropy
3.3.3 Smoky Mountain pH
This dataset contains the pH measurements at several locations within the Great Smoky
Mountains. First, we shall make a plot of the data (Figure 7).
smoky_data <- read.table("../data/smoky.dat", header=TRUE, skip=1)
coordinates(smoky_data) <- ~easting + northing
smok <- readOGR("../data/smoky", "smokypoly")
## OGR data source with driver: ESRI Shapefile
## Source: "../data/smoky", layer: "smokypoly"
## with 1 features
## It has 1 fields
9
pal <- brewer.pal(4, "RdYlGn")
q4 <- classIntervals(smoky_data$ph, n=4, style="fisher")
spplot(smoky_data, "ph", cuts=q4$brks , col.regions=pal, colorkey=TRUE,
sp.layout=list("sp.polygons", smok), scales=list(draw=TRUE))
50 0 50 40 20 0 20 40
6.5
7.0
7.5
8.0
Figure 7: Smoky Mountain Data
Next, we can plot the sample variogram estimator (see Figure 8). The variogram object
will contain information on the exact values for h used, along with the number of points
used to compute the estimate there.
vgm_ph <- variogram(ph ~ 1, smoky_data, cutoff=120, width=10)
plot(vgm_ph)
head(vgm_ph, n=2)
## np dist gamma dir.hor dir.ver id
## 1 54 6.453213 0.06594352 0 0 var1
## 2 213 15.010015 0.12051268 0 0 var1
10
distance
semivariance
0.05
0.10
0.15
0.20
0.25
20 40 60 80 100
Figure 8: Smoky Mountain Semivariogram
11
3.4 Parametric Semivariogram Models (Isotropic)
The classical semivariogram estimator is a good first step to studying the data, but
unfortunately, when we use it for predictions, it is possible that it returns negative
estimates of variance. Hence we turn to parametric models that do not have this failing,
and try to fit one of them to our data.
We say that a semivariogram model is valid in dimension d if it satisfies the conditional
negative definite property. Here are some commonly used models that are fit to data.
1. Spherical
γ(h|θ) =
0 h = 0
c0 + cs[(3/2)(h/as) ? (1/2)(h/as)
3
] 0 < h ≤ as
c0 + cs h > as
valid in R, R
2 and R3
nearly linear at the origin
c0 is the nugget, cs is the partial sill and as is the range.
2. Exponential:
γ(h|θ) = (
0 , h = 0
c0 + ce[1 ? exp(?h/ae)] , h > 0
valid in R
d
for d ≥ 1.
c0 is the nugget and ce is the partial sill.
This semivariogram approaches the sill asymptotically.
The effective range is 3ae; this is the distance at which the autocorrelation = 0.05
3. Gaussian:
γ(h|θ) = (0 , h = 0
c0 + cg[1 exp((h/ag)2] , h > 0
valid in R
d
for d ≥ 1.
This function is parabolic near the origin.
Use this only if you have a lot of closely spaced data, which will allow you to
estimate the function near h = 0 accurately.
c0 is the nugget and cg is the partial sill.
The effective range is √
3ag.
4. Power:
γ(h|θ) = (
0 , h = 0
c0 + bhp
, h > 0
valid in R
d
for d ≥ 1.
These models do not have a sill nor a range.
These models imply that the spatial autocorrelation does not level off even for
large lags.
12
5. Stable:
γ(h|θ) = (
0 , h = 0
c0 + ct
[1 ? exp(?h/at)
α
] , h > 0
valid in R
d
for d ≥ 1.
They behave similar to the Power family near the origin, but these functions do
reach a sill.
These models do not have a sill nor a range.
c0 is the nugget, ct
is the partial sill and the effective range depends on at and α.
6. Matern:
γ(h|θ) = (
0 , h = 0
c0 + ck
for d ≥ 1.
c0 is the nugget, ck is the partial sill
The Gaussian semivariogram is obtained by letting α → ∞.
α determines the behaviour near the origin.
Note that the sum of two valid semivariograms is also a valid semivariogram.
A semivariogram that is valid in R
d2
is also valid in R
d1 when d2 > d1.
Figure 9 contains plots of several different theoretical semivariograms.
3.5 Fitting a Semivariogram
A semivariogram can be used for spatial prediction or simulation.
However, using values from the sample semivariogram for these purposes could yield
negative prediction errors.
Hence we typically fit one of the parametric models in the previous section, and then
use that.
For now, we continue to assume that our process is second-order stationary (constant
mean).
Semiariogram models in gstat are created using the vgm function.
– In gstat, semivariograms are defined through their partial sill and effective range.
– Some translation is necessary to match the variograms we encountered earlier to
those stored in gstat.
13
Figure 9: Theoretical Semivariograms
14
? We can get an overview of all the available semivariogram types with the following
commands (see Figure 8):
show.vgms(par.strip.text=list(cex=0.6))
distance
semivariance
vgm(1,"Nug",0)
0.0 1.0 2.0 3.0
vgm(1,"Exp",1) vgm(1,"Sph",1)
0.0 1.0 2.0 3.0
vgm(1,"Gau",1) vgm(1,"Exc",1)
0.0 1.0 2.0 3.0
vgm(1,"Mat",1)
vgm(1,"Ste",1) vgm(1,"Cir",1) vgm(1,"Lin",0) vgm(1,"Bes",1) vgm(1,"Pen",1
vgm(1,"Per",1)
0.0 1.0 2.0 3.0
vgm(1,"Wav",1) vgm(1,"Hol",1)
0.0 1.0 2.0 3.0
vgm(1,"Log",1) vgm(1,"Pow",1)
0.0 1.0 2.0 3.0
vgm(1,"Spl",1)
Figure 10: Show Semivariograms
In order to get a full list of variogram names used in gstat, we can use the following
command:
head(vgm())
## short long
## 1 Nug Nug (nugget)
## 2 Exp Exp (exponential)
## 3 Sph Sph (spherical)
## 4 Gau Gau (gaussian)
## 5 Exc Exclass (Exponential class/stable)
## 6 Mat Mat (Matern)
15
? Suppose we wish to create a Spherical variogram with parameters c0 = 1, cs = 2 and
as = 10. Then we would run the following (see Figure 9 for output):
sph1 <- vgm(psill=2, model='Sph', range=10, nugget=1)
class(sph1)
## [1] "variogramModel" "data.frame"
sph1
## model psill range
## 1 Nug 1 0
## 2 Sph 2 10
show.vgms(sill=2, range=10, models='Sph', nugget=1, max=13)
distance
semivariance
0.0
0.5
1.0
1.5
2.0
2.5
3.0
0 5 10
vgm(2,"Sph",10,nugget=1)
Figure 11: Show a particular semivariogram
16
? In order to create a Gaussian semivariogram with similar parameters c0 = 1, ce = 2
and effective range 3ae = 10. Then we would run the following:
gauss1 <- vgm(psill=2, model='Gau', range=10, nugget=1)
gauss1
## model psill range
## 1 Nug 1 0
## 2 Gau 2 10
Just remember that in each vgm model, there is
– a type, e.g. Nug, Sph, etc.
– a vertical component (psill)
– a horizontal component (range or effective range)
– Sometimes when a model has no sill or range, this parameter will represent other
arguments.
Semivariogram models can be added together. Consider adding the spherical and
gaussian semivariograms above, and then plotting them. See Figure 12 for the output.
gauss2 <- vgm(psill=2, model='Gau', range=10, nugget=0, add.to=sph1)
out <- variogramLine(gauss2, maxdist=15)
plot(out, type='l', main="Sum of Two Variograms")
In order to fit a semivariogram to the data:
1. Choose a suitable model.
2. Choose suitable initial values for the parameters
3. Fit the model using one of the fitting criteria.
3.5.1 Fitting Using Non-Linear Regression
? The aim is to find the closest theoretical model to the data according to the following
metric:
Xm
j=1
wj (γ(h|θ) ? γ?(h))2
The vector of parameters θ that minimises this value would correspond to the chosen
semivariogram. For instance, if we wished to fit a spherical model, we would need
to find the combination of c0, cs and as that minimises this weighted sum of squared
errors.
17
0 5 10 15
1 2
3
4
Sum of Two Variograms
dist
gamma
Figure 12: Result of Adding Two Semivariograms
18
? To be precise, the sample semivariogram values are not independent between lags (h).
The above formulation in fact is appropriate only when the errors are indpendent.
However, the correct formulation, using generalised least squares, is computationally
expensive.
However, it has been shown that using wj as Nj/h2
j provides a good compromise between
computational effort and accuracy. With these weights,
– lags with many observation pairs are given more weight.
– lags that have a large width are given less weight.
The value of the cost function at the minimising θ is referred to as the Weighted
Regresssion Sum of Squares (WRSS).
3.5.2 Fitting Using Likelihood Function
If we know that the data follows a multivariate Gaussian distribution, then we can use
the MLE to estimate the parameters.
gstat does not have this approach; only the geoR package has it.
Here is how a variogram model can be fit using the gstat package. The SSErr attribute
can be used to choose between competing theoretical variogram families.
fit_v <- fit.variogram(vgm_log_zinc, vgm(1, "Sph", 800, 1))
fit_v
## model psill range
## 1 Nug 0.05065923 0.0000
## 2 Sph 0.59060463 896.9976
attr(fit_v, "SSErr")
## [1] 9.011194e-06
plot(vgm_log_zinc, model=fit_v)
4 Spatial Prediction
Spatial prediction refers to the prediction of unknown quantities Z(s0), based on sample
data Z(si) and assumptions regarding the form of the trend of Z, its variance, and its
spatial correlation.
19
distance
semivariance
0.2
0.4
0.6
500 1000 1500
Figure 13: Fitting a Spherical Semivariogram
20
4.1 Inverse Distance Weighted (IDW) Interpolation.
IDW interpolation does not use a semivariogram. It computes a weighted average
Z(s0) =
Pn
i=1 wiZ(si)
Pn
i=1 wi
The weights are given by wi = ||si ? s0||?p
.
– points further away from s0 are given less importance (weight).
– p determines the degree to which nearer points are preferred.
As p increases, the contribution to the interpolated value from far-away points decreases.
Typical values of p are between 1 and 3.
Here are some disadvantages of IDW:
– What value of p should we use?
– It ignores spatial configuration and only uses distances.
– It cannot yield prediction errors.
– Weights are guaranteed to be between 0 and 1. This implies that interpolated
values are always within the range of observed values.
4.1.1 IDW using Meuse data set
Consider the Meuse data set. Refer to Figures 11 and 12 for the output.
data(meuse.grid)
coordinates(meuse.grid) <- ~x+y
meuse.grid <- as(meuse.grid, "SpatialPixelsDataFrame")
idw.out <- gstat::idw(log(zinc) ~ 1, meuse, meuse.grid, idp=2.5)
## [inverse distance weighted interpolation]
plot(idw.out, col=heat.colors(100))
head(as.data.frame(idw.out), n=2)
## x y var1.pred var1.var
## 1 181180 333740 6.390860 NA
## 2 181140 333700 6.561591 NA
Recall that the original point measurements appeared so:
21
5.0 5.5 6.0 6.5 7.0 7.5
Figure 14: IDW using Meuse Data set
22
spplot(meuse, "lzn", main="Log-Zinc Measurements", col.regions = heat.colors(4),
key.space="right")
Log?Zinc Measurements
[4.727,5.285]
(5.285,5.843]
(5.843,6.401]
(6.401,6.959]
(6.959,7.517]
Figure 15: Meuse River Log-Zn
4.2 Kriging
Kriging is a geostatistical technique for optimal spatial prediction.
There are many different types of kriging, depending on the underlying assumptions.
4.2.1 Ordinary Kriging (OK)
Let us consider a situation where:
– Z(·) is intrinsically stationary.
– The semivariogram γ(·) is known.
– E[Z(s)] = μ, where μ is unknown.
– We have observed data Z(s1), . . . , Z(sn) and wish to predict the value of Z(s0).
23
This situation is known as Ordinary Kriging (OK).
Assume that the OK predictor is a weighted average of the data:
Z?(s0) = Xn
i=1
λiZ(si)
? We have to solve for the λi
’s that give us some optimality:
– We want to ensure that our estimator is unbiased, i.e. that E[Z?(s0)] = μ.
– We want to minimise the Mean Squared Prediction Error (MSPE):
E[Z?(s0) ? Z(s0)]2
Using the method of Lagrange multipliers, the objective function is
E[(Xni=1
λiZ(si)Z(s0))2] 2m Xni=1
We need to find the λ1, λ2, . . . , λn, m that minimises the above expression.
In terms of the semivariogram, this can be rewritten as
Xn
i=1
Xn
j=1
λiλjγ(si sj ) + 2Xn
i=1
λiγ(s0 si) 2m
Xn
i=1
λi 1
!
We then take derivatives, set to 0 and find that we need to solve the following set of
equations:
Xn
j=1
λjγ(si ? sj ) + m = γ(s0 ? si), i = 1, . . . , N
Xn
i=1
λ1 = 1
If we define the following vectors, and Γ is an appropriate matrix,
λ0 = (λ1, λ2, . . . , λn, m)
T
γ0 = (γ(s0 ? s1), γ(s0 ? s2), . . . , γ(s0 ? sn), 1)T
then it is simpler to write the solution in matrix-form:
λ0 = Γ?1
0 γ0
24
? The kriging variance (or minimised MSPE) is
σ
2
(s0) = 2Xn
i=1
λiγ(s0 ? si)
Xn
i=1
Xn
j=1
λiλjγ(si ? sj )
=
Xn
i=1
λiγ(s0 ? si) + m
= λ
T
0 γ0
This is the prediction error associated with the prediction at location s0. Let’s take
a look at an example with the meuse data set (see Figure 16). Compare this plot to
Figure 14, which was created with the IDW method.
lz.ok <- krige(log(zinc)~1, meuse, meuse.grid, fit_v)
## [using ordinary kriging]
plot(lz.ok, col=heat.colors(100))
With kriging, we can also get an indication of the prediction error at each location.
Notice in Figure 17 that there are some regions that are very dark, indicating that
prediction errors at observed locations is in fact 0.
This is something we should be aware of. Kriging in the presence of a non-zero nugget
effect introduces a discontinuity in the prediction surface. Also notice how errors are
larger at regions that have fewer points.
plot(lz.ok['var1.var'])
plot(meuse, add=TRUE, col='grey50')
The kriging estimate is referred to as the BLUP or EBLUP (when the semivariogram
is estimated before plugging in).
4.2.2 Filtering
Kriging results in spatially smooth interpolators.
If there is a non-zero nugget, the predictions are discontinuous in the measurement
points.
25
5.0 5.5 6.0 6.5 7.0
Figure 16: Meuse Ordinary Kriging
26
0.1 0.2 0.3 0.4 0.5
Figure 17: Ordinary Kriging Variance
27
At measurement locations, kriging predicts the measurement value with a zero prediction
error variance. Any shift from this location will see a strongly different prediction.
In order to rectify this, if we believe the process is indeed smooth, we can conceive the
process Z(s) as a sum of an underlying smooth process U(s) and a measurement error
(s). We then focus on kriging U(s).
Here is how to instruct R that the nugget should be treated as pure measurement error.
err.fit <- fit.variogram(vgm_log_zinc, vgm(1, "Sph", 800, Err=1))
krige(log(zinc) ~ 1, meuse, meuse[1,], err.fit)
## [using ordinary kriging]
## coordinates var1.pred var1.var
## 1 (181072, 333611) 6.884405 0.03648707
krige(log(zinc) ~ 1, meuse, meuse[1,], fit_v)
## [using ordinary kriging]
## coordinates var1.pred var1.var
## 1 (181072, 333611) 6.929517 -1.110223e-16
4.2.3 An Aside: Indepdent errors, Weighted Least Squares and Generalised
Least Squares
It will be useful to get an overview of linear regression techniques before proceeding.
Suppose we have observed a vector of n responses, Y, a vector of length n.
In addition, suppose we have observed a set of p covariates for each response, which we
store in an n × (p + 1) matrix called X.
If our observations are independent, we assume the relation between X and Y takes
the form:
Y = Xβ + ε
where β is a vector of length p + 1, and ε is a multivariate Normal random vector with
mean 0 and variance σ
2
I.
It can be derived that the liklihood function is given by
l(β, σ2) = (2πσ2)n/2exp "
(Y Xβ)
T(Y Xβ)
28
From here, we can derive the MLE of β to be
β = (XTX)
1XTY
This is exactly the same estimator as the Ordinary Least Squares (OLS) estimator.
In a spatial setting, the measurements at the n locations are not independent. We
cannot assume that the variance of ε is σ
2
I. Instead, we assume that
V ar(ε) = Σ = σ
2V
where Σ is a symmetric n × n matrix containing the covariances between all pairs of
locations.
The general idea is that the Xβ term in the model explains the large-scale variation,
while the correlations defined in Σ explain the small-scale variation. Suppose first that
we know what Σ is.
Then, based on the new likelihood function, the MLE of β is
βGLS = (XTV?1X)
1XTV1Y
This is known as the Generalised Least Squares estimate.
In general however, the matrix V is unknown to us. It contains n-choose-2 distinct
terms, which is a lot of parameters to be estimated.
However, if we represent the covariances with the semivariogram instead, then we only
need to estimate a smaller number of parameters, e.g. the sill, range, etc.
Thus, we now have two sets of vectors to estimate: β and θ since Σ is in fact a
function of θ. There are two common methods for estimating the unknown parameters:
likelihood-based ones, and iteratively re-weighted generalised least squares (IRWGLS).
For the likelihood based estimation, we have to numerically maximise it. This approach,
however, has not been implemented in gstat. We have to rely on the IRWGLS. It
consists of the following steps:
1. Obtain a starting estimate of β, say β?.
2. Compute the residuals r = Y ? Xβ?.
3. Estimate and model the semivariogram of the residuals using techniques described
earlier. This yields γ(h|θ), which in turn yields Σ(?θ).
4. Obtain a new estimate of β using
β = (XTΣ(θ)1X)1XTΣ(θ)1Y
5. Repeat steps 2 - 4 until the estimates of β and θ do not change much, e.g. change
is less than 10?5
.
29
4.2.4 Universal Kriging
Universal Kriging refers to linear prediction with nonstationary mean structure.
We assume that
Z(s) = X
p
j=0
Xj (s)β)j + e(s) = Xβ + e(s)
– The Xj (s) are known as spatial regressors
– βj are unknown regression coefficients, usually containing an intercept, i.e. X0(s) =
1.
– The Xj (s) could include the spatial coordinates as well.
– The residuals e(s) are to be modelled using a semivariogram.
Now suppose we wish to estimate Z(s0), and we know the semivariogram γ(·) for e(s).
Z(s0) = x(s0)β + vTΣ1
(Z(s Xβ)
– x(s0) are the predictor values for s0.
– Σ is the covariance matrix for Z(si) (obtained from γ(·).
– v is the covariance vector for Z(s0) and Z(si) (also from γ(·).
– Note that
β = (XTΣ1X)1X
TΣ1Z(s)
We can view the prediction as
– x(s0)β: estimated mean for location s0.
– vTΣ1
(Z(s) Xβ): weighted mean of residuals.
Prediction variance is given by
We can use the IRWGLS method in Section 4.2.2 to estimate the semivariogram.
Here, we use a subset of the meuse pixels to estimate the variogram in order to save
time. See Figure 18 for the output.
vt.fit2 <- fit.variogram.gls(log(zinc) ~ sqrt(dist), data=meuse[1:40,],
model= vgm(1, "Exp", 300, 1) ,trace=FALSE)
lz.uk.gls <- krige(log(zinc)~sqrt(dist), meuse, meuse.grid, vt.fit2)
## [using universal kriging]
30
plot(lz.uk.gls, col=heat.colors(100))
4.5 5.0 5.5 6.0 6.5 7.0
Figure 18: Universal Kriging
4.2.5 Block Kriging
No measurements can ever be taken on something of size 0. However, by default we
treat observations has coming from a point location. This is known as point kriging. In
contrast, block kriging predicts averages of larger areas or volumes.
This is a change of support problem. This is occurs when predictions are made for a
larger physical support based on small support observations.
Ordinary block kriging can be carried out for blocks of size 40 x 40 in the following
way:
lz.blk <- krige(log(zinc) ~ 1, meuse, meuse.grid, fit_v, block=c(40,40))
## [using ordinary kriging]
31
There are other ways in which the block can be specified:
1. For irregular but equal blocks, by specifying points that discretise the irregular
form.
2. For blocks of varying size, by passing an object of class SpatialPolygons to the
newdata argument.
Please refer to the textbook section 8.5.6 for examples on this.
4.2.6 Other Types of Kriging.
There are many variants of kriging. Some of them are shown in the diagram in Figure
19.
Figure 19: Kriging Flowchart
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。