[17:22 14/8/2008 5187-Fotheringham-Ch10.tex] Paper: a4 Job No: 5187 Fotheringham: Spatial Analysis (Handbook) Page: 165 165–186

10 Spatial Sampling

Eric Delmelle

10.1. INTRODUCTION

When trying to make inferences about a phenomenon, we are forced to collect a limited number of samples instead of trying to acquire information at every possible location (see, e.g., Cochran, 1963; Dalton et al., 1975; Hedayat and Sinha, 1991; and Thompson, 2002 for various summaries). A full inventory would yield a clear picture of the variability of the variable of interest, although this process is very time-consuming and expensive. Haining (2003) underlines that the cost of acquiring information on each individual may rule out a complete census. Sparse sampling, on the other hand, is cheap but may miss important features. However, there are instances where the level of precision may be the major motivation of the sampling process, especially when sampling remains relatively inexpensive. As a rule of thumb, it is generally desirable to have a higher concentration of samples where exhaustive and accurate information is needed, keeping in mind that the set of samples should always be as representative as possible of the entire population (Berry and Baker, 1968).

When surveying a phenomenon characterized by spatial variation, it is necessary to find optimal sample locations in the study area D. This problem is referred to as spatial or two-dimensional sampling and has been applied in many disciplines such as mining, soil pollution, environmental monitoring, telecommunications, ecology, geology, and geography, to cite a few. Specific studies on spatial sampling can be found in Ripley (1981), Haining (2003), Cressie (1991), Stehman and Overton (1996) and Muller (1998). Spatial and non-spatial sampling strategies share common characteristics:

1 the size m of the set of samples;

2 the selection of a sample design, limited by the available budget;

3 an estimator (e.g., the mean) for the population characteristic; and

4 an estimation of the sampling variance to compute confidence intervals.



Following Haining, spatial sampling challenges can be divided into three different categories. The first pertains to problems concerned with estimating some non-spatial characteristics of a spatial population; for example, the average income of households in a state. The second category deals with problems where the spatial variation of a variable needs to be known, in the form of a map, or as a summary measure that highlights scales of variation. The third category includes problems where the objective is to obtain observations that are independent of each other, allowing classical statistical procedures to assist in classifying data.

10.1.1. Spatial structure

A common objective in both spatial and non-spatial approaches is to design a sampling configuration that minimizes the variance associated with the estimation. In this regard, the location of the samples is critical and depends heavily on the structure of the variable. In non-spatial problems, it may be crucial to stratify the sampling scheme according to important underlying covariates. This holds for spatial phenomena as well. Unfortunately, this variation is often unknown, and the objective is then to design an optimal sampling arrangement that yields a maximum amount of information. If we undersample in some areas, the spatial variability will not be captured. Oversampling, on the other hand, can result in redundant data. Consequently, not only the quantity of samples is important, but also their locations. This chapter is concerned primarily with the second category of sampling challenges, i.e., capturing the spatial structure of the primary variable.

10.1.2. Structure of the chapter

In this chapter, spatial sampling configurations are first reviewed along with their benefits and drawbacks. Second, the influence of geostatistics on sampling schemes is discussed. Sampling schemes can be designed to capture the spatial variation of the variable of interest. Two common objectives therein are the estimation of the covariogram and the minimization of the kriging variance. Third, methods of adaptive sampling and second-phase sampling are presented. Such problems are nonlinear in nature, and appropriate optimization techniques are necessary to solve them. Finally, salient sampling problems, such as sampling in the presence of multivariate information and the use of heuristics, are discussed.

10.2. SPATIAL SAMPLING CONFIGURATIONS

This section reviews significant sampling schemes for the purpose of two-dimensional sampling. In the following subsections I will assume that a limited number of samples m is collected within a study area denoted D. The variable of interest Z is sampled on m supports, generating observations $\{z(s_i) \mid i = 1, 2, \ldots, m\}$. For ease of illustration, a square study area is used.

10.2.1. Major spatial sampling designs

Random sampling

A simple random sampling scheme consists of choosing randomly a set of m sample points in D, where each location in D has an equal probability of being sampled (Ripley, 1981). The selection of a unit does not influence the selection of any other one (King, 1969). Figure 10.1(a) illustrates the random configuration. This type of design is also called uniform random sampling since each point is chosen independently and uniformly within D. In practice, two random numbers $K_i$ and $K'_i$ are drawn from the interval [0, 1]. Then the point $s_i$, defined by the pair $\{x_i, y_i\}$, is selected such that:

$$x_i = K_i L, \quad y_i = K'_i L, \qquad (10.1)$$

where L denotes the side length of the (square) study area D (Aubry, 2000). The process is repeated m times. According to Griffith and Amrhein (1997), the distribution of the points may not be representative of the underlying geographic surface, because for most samples drawn, some areas will be oversampled while others will be undersampled. The advantages of this design, however, reside in its operational simplicity and its capacity to generate a wide variety of distances among pairs of points in D.

Figure 10.1 From left to right, top to bottom: (a) random, (b) centric systematic, (c) systematic random, and (d) systematic unaligned sampling schemes. Axes: x and y, each from 0 to 1. Sample size m = 100.
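Equation (10.1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simple_random_sample` and the `seed` argument are assumptions made here for reproducibility, not part of the chapter.

```python
import random

def simple_random_sample(m, L=1.0, seed=None):
    """Draw m points independently and uniformly in a square of side L (Eq. 10.1)."""
    rng = random.Random(seed)  # hypothetical seed argument, for reproducibility only
    points = []
    for _ in range(m):
        k, k_prime = rng.random(), rng.random()  # K_i and K'_i drawn from [0, 1)
        points.append((k * L, k_prime * L))      # x_i = K_i L, y_i = K'_i L
    return points

# m = 100 points in the unit square, as in Figure 10.1(a)
pts = simple_random_sample(100, L=1.0, seed=42)
```

Because every point is drawn independently, nothing prevents two points from falling very close together, which is the over- and undersampling noted by Griffith and Amrhein (1997).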

Systematic sampling

The population of interest is divided into m intervals of equal size. The first element is randomly or purposively chosen within the first interval, starting at the origin. Depending on the location of the first sample, the remaining m − 1 elements are aligned regularly by the size of the interval $\Delta$. If the first sample is chosen at random, the resulting scheme is called systematic random sampling. When the first sample point is not chosen at random, the resulting configuration is called regular systematic sampling. Centric systematic sampling occurs when the first point is chosen in the center of the first interval. The resulting scheme is a checkerboard configuration. The most common regular geometric configurations are the equilateral triangular grid, the rectangular (square) grid, and the hexagonal grid (Cressie, 1991). In practice, consider the case where D is divided into a set of small, square cells of size $\Delta = L/\sqrt{m}$. A first point $s_1 = \{x_1, y_1\}$ is selected within the first cell in the bottom left of D. The coordinates of $s_1$ are subsequently used to determine the following points $s_i = \{x_i, y_i\}$ (Aubry, 2000):

$$x_i = x_1 + (i - 1)\Delta, \quad y_i = y_1 + (j - 1)\Delta, \quad \forall\, i, j = 1, \ldots, \sqrt{m}. \qquad (10.2)$$

To locate sample points along the x- and y-directions, the desired number of samples m must be chosen so that $\sqrt{m}$ is an integer. The benefits of a systematic approach reside in a good spreading of observations across D, guaranteeing a representative sampling coverage. Additionally, the spreading of the observations prevents sample clustering and redundancy. This design, however, presents two drawbacks:

1 the distribution of distances between points of D is not sampled adequately because many pairs of points are separated by the same distance; and

2 there is a danger that the spatial process shows evidence of recurring periodicities that will remain uncaptured, because the systematic design coincides in frequency with a regular pattern in the landscape (Griffith and Amrhein, 1997; Overton and Stehman, 1993).

The second drawback can be lessened considerably by use of a systematic random method that combines systematic and random procedures (Dalton et al., 1975). One sample point is randomly selected within each cell. However, sample density needs to be high enough to have some clustering of observations, or the spatial relationship between observations cannot be built. As Figure 10.1(c) shows, some patches of D remain undersampled, while other regions show evidence of clustered observations. A systematic unaligned scheme prevents this problem from occurring by imposing a stronger restriction on the random allocation of observations (King, 1969).
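The grid construction of Equation (10.2) and the one-point-per-cell variant described above can be sketched as follows. The function names `aligned_grid` and `random_point_per_cell` and the `mode` argument are hypothetical labels chosen for this sketch; the systematic unaligned scheme is not reproduced here because the text does not spell out its rule.

```python
import math
import random

def aligned_grid(m, L=1.0, mode="centric", seed=None):
    """Aligned systematic sample on a sqrt(m) x sqrt(m) grid (Eq. 10.2)."""
    root = math.isqrt(m)
    if root * root != m:
        raise ValueError("m must be a perfect square so that sqrt(m) is an integer")
    delta = L / root                      # cell size, Delta = L / sqrt(m)
    rng = random.Random(seed)
    if mode == "centric":                 # first point at the center of the first cell
        x1 = y1 = delta / 2.0
    elif mode == "random_start":          # first point random within the first cell
        x1, y1 = rng.random() * delta, rng.random() * delta
    else:
        raise ValueError("unknown mode")
    # zero-based i, j here play the role of (i - 1), (j - 1) in Eq. 10.2
    return [(x1 + i * delta, y1 + j * delta)
            for j in range(root) for i in range(root)]

def random_point_per_cell(m, L=1.0, seed=None):
    """One point drawn at random inside each of the m square cells."""
    root = math.isqrt(m)
    if root * root != m:
        raise ValueError("m must be a perfect square")
    delta = L / root
    rng = random.Random(seed)
    return [((i + rng.random()) * delta, (j + rng.random()) * delta)
            for j in range(root) for i in range(root)]

centric = aligned_grid(100, L=1.0, mode="centric")
per_cell = random_point_per_cell(100, L=10.0, seed=1)
```

In the aligned versions every point shares the same offset within its cell, which is why many pairs of points end up separated by identical distances; drawing a fresh offset per cell breaks that regularity.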

Stratified sampling

According to Haining (2003), there are cases when local-area estimates are to be examined, causing stratification to be built into the sampling strategy. In stratified sampling, the survey area (or D) is partitioned into non-overlapping strata¹. For each stratum, a set of samples is collected, where the sum of the samples over all strata must equal m. The knowledge of the underlying process is a determining factor in defining the shape and size of each stratum. Some subregions of D may exhibit stronger spatial variation, ultimately affecting the configuration of each stratum (Cressie, 1991). Smaller strata are preferred in non-homogeneous subregions. When points within each stratum are chosen randomly, the resulting design is named stratified random sampling. In Figure 10.2(a), six strata are sampled in proportion to their size. For instance, stratum A represents 30% of D; therefore, if m = 100, 30 sample points will be allocated within A. Figure 10.2(b) illustrates the allocation of one sample per stratum (in this case at the centroid), which undersamples larger strata.
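The proportional allocation in the stratum A example can be checked with a short sketch. Only the 30% share of stratum A comes from the text; the shares of strata B to F below are hypothetical, and the largest-remainder rounding rule is an assumption added so that the allocations always sum to m.

```python
def proportional_allocation(area_shares, m):
    """Allocate m samples to strata in proportion to their area shares.

    Uses largest-remainder rounding (an assumption of this sketch)
    so that the allocations sum exactly to m.
    """
    total = sum(area_shares.values())
    quotas = {k: m * a / total for k, a in area_shares.items()}
    alloc = {k: int(q) for k, q in quotas.items()}       # integer part of each quota
    leftover = m - sum(alloc.values())
    # hand the remaining samples to the strata with the largest fractional parts
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

# Shares in percent of D; only A = 30 is taken from the text, B..F are made up.
shares = {"A": 30, "B": 25, "C": 15, "D": 12, "E": 10, "F": 8}
alloc = proportional_allocation(shares, 100)
```

With m = 100 the allocation to A is 30 points, matching the example in the text.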

Figure 10.2 Stratified sampling designs with six strata (A to F) of different sizes: (a) m = 100 and (b) m = 6. Axes: x and y, each from 0 to 1.

10.2.2. Efficiency of spatial sampling designs

The sampling efficiency is defined as the inverse of the sampling variance. According to Aubry (2000), the most efficient design leads to the most accurate estimation. Consider the estimation of the global mean $z_D$:

$$z_D = \frac{1}{[D]} \int_D z(s)\, ds. \qquad (10.3)$$

It is desirable, from a statistical standpoint, to select a configuration that minimizes the prediction error of $z_D$ for a given estimator, for instance the arithmetic mean:

$$\bar{z} = \frac{1}{m} \sum_{i=1}^{m} z(s_i). \qquad (10.4)$$


Efficiency is calculated for all possible realizations of the variable Z by $\mathrm{Var}_\xi[Z^*_D - Z_D]$ using $\sigma_k^2$, which is the geostatistical prediction error, defined later. In terms of the sampling variance, stratified random sampling is always at least as accurate as random sampling; its relative efficiency is a monotone increasing function of sample size.

Spatial autocorrelation. Ideally, the density of sample points should increase in locations exhibiting greater spatial variability. Values of closely spaced samples will show strong similarities and it may be redundant to oversample in those areas. The spatial autocorrelation function summarizes the similarity of the values of the variable of interest at different sample locations, as a function of their separating distance (Gatrell, 1979; Griffith, 1987). Moran's I (Moran, 1948, 1950) is a measure of the degree of spatial autocorrelation among data points:

$$I = \frac{m}{W} \, \frac{\sum_{i,j} w(s_{ij})\,(z(s_i) - \bar{z})(z(s_j) - \bar{z})}{\sum_{i} (z(s_i) - \bar{z})^2} \qquad (10.5)$$

where W is the sum of the weights $w(s_{ij})$ of the weight matrix, m is the number of observations, $\bar{z}$ denotes the mean of the sampled values, and $z(s_i)$ is the measured attribute value at location $s_i$. The weight $w(s_{ij})$ is a measure of spatial proximity between points $s_i$ and $s_j$; for example:

$$w(s_{ij}) = \exp\left(-\beta\, d(s_{ij})^2\right) \qquad (10.6)$$

where $d(s_{ij})^2$ is the squared distance between locations $s_i$ and $s_j$. Moran's I is not strictly constrained to the interval [−1, +1]. Spatial autocorrelation generally decreases as the distance among sample points increases. Positive autocorrelation occurs when values taken at nearby samples are more alike than values collected further away. When the autocorrelation is a linearly decreasing function of distance, stratified random sampling has a smaller variance than a systematic design (Quenouille, 1949). If the decrease in autocorrelation is not linear, yet concave upwards, systematic sampling is more accurate than stratified random sampling, and a centered systematic design, where each point falls exactly in the middle of each interval, is more efficient than a random systematic sampling configuration (Madow, 1953; Zubrzycki, 1958; Dalenius et al., 1960; Bellhouse, 1977; Iachan, 1985).
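Equations (10.5) and (10.6) translate fairly directly into code. The sketch below makes several assumptions that the text does not state: self-pairs (i = j) are excluded from the sums, W is taken as the sum of all weights, and the value of `beta` is arbitrary. The function name `morans_i` is likewise just a label.

```python
import math

def morans_i(points, values, beta=2.0):
    """Moran's I (Eq. 10.5) with weights w = exp(-beta * d^2) (Eq. 10.6)."""
    m = len(points)
    mean = sum(values) / m
    dev = [v - mean for v in values]
    num = 0.0
    w_sum = 0.0
    for i in range(m):
        for j in range(m):
            if i == j:
                continue  # assumption: a point is not its own neighbor
            d2 = ((points[i][0] - points[j][0]) ** 2
                  + (points[i][1] - points[j][1]) ** 2)
            w = math.exp(-beta * d2)
            w_sum += w
            num += w * dev[i] * dev[j]
    den = sum(d * d for d in dev)
    return (m / w_sum) * num / den

# A 4 x 4 lattice with two contrasting surfaces
grid = [(float(x), float(y)) for x in range(4) for y in range(4)]
gradient = [p[0] for p in grid]                           # smooth trend in x
checker = [float((int(p[0]) + int(p[1])) % 2) for p in grid]  # alternating values
i_gradient = morans_i(grid, gradient)
i_checker = morans_i(grid, checker)
```

The smooth gradient yields a positive I (neighbors are alike), while the checkerboard yields a negative I (neighbors are systematically different).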

10.2.3. Other sampling designs

Nested or hierarchical sampling

Nested or hierarchical sampling designs require the study area D to be partitioned randomly into sample units (or blocks), creating the first level in the hierarchy, which is then further subdivided into sample units nested within level 1, and so forth (Haining, 2003). These units can be systematically or irregularly arranged. As the process progresses, the distances between observations decrease (Corsten and Stein, 1994). One advantage of a nested sampling design is that it allows for multiple scale analysis and supports quadrat analysis. Spatially nested sampling designs may work well for geographic phenomena that are naturally clustered and for exploring multiple scale effects. Hierarchical sampling is also possible at the discrete level. In such cases, one might first randomly select one or more counties in a state, then sample a number of quadrats (say, townships) within these counties, and finally randomly select some farmsteads within the latter (King, 1969).

In a multivariate case, dependent and independent variables are hierarchically organized and are thus not collected at the same sampling frequency (Haining, 2003). The primary variable may exhibit rapid change in spatial structure while the secondary variables are much more homogeneous. A hierarchical sampling design captures such variation by collecting one variable at points nested within larger sampling units so it can be collected more intensively than another variable.

Clustered sampling

This type of sampling consists of the random selection of groups of sites where sites are spatially close within groups (Cressie, 1991). Clusters of observations are drawn independently with equal probability. In a first stage, once the population has been grouped into clusters, the clusters themselves are sampled (Haining, 2003). Then either all of the observations in the selected clusters are included, or only a random selection from them. Cluster sampling is especially useful in a discrete case, when a complete list of the members of a population cannot be obtained, yet a complete list of groups (i.e., clusters) of the variable is available. The method is also useful in saving sampling cost.
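The two stages described above can be illustrated with a minimal sketch. The function name `two_stage_cluster_sample`, the cluster labels, and the site identifiers are all hypothetical; the only behavior taken from the text is that clusters are drawn with equal probability and that either all members or a random subset of each selected cluster is kept.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Stage 1: draw clusters with equal probability.
    Stage 2: keep all members, or a random subset, of each drawn cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # equal-probability draw
    sample = {}
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None or n_per_cluster >= len(members):
            sample[name] = list(members)                 # include every observation
        else:
            sample[name] = rng.sample(members, n_per_cluster)
    return sample

# Hypothetical clusters of site identifiers
clusters = {
    "north": ["n1", "n2", "n3", "n4"],
    "south": ["s1", "s2", "s3"],
    "east": ["e1", "e2", "e3", "e4", "e5"],
    "west": ["w1", "w2"],
}
picked = two_stage_cluster_sample(clusters, n_clusters=2, n_per_cluster=2, seed=3)
```

Only the list of clusters is needed up front; the members of a cluster have to be enumerated only once that cluster has been selected, which is where the cost saving comes from.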

10.3. SAMPLING RANDOM FIELDS USING GEOSTATISTICS

Most classical statistical sampling methods make no use of the spatial information provided by nearby samples. Geostatistics describes the spatial continuity that is an essential feature of many natural phenomena. It can be seen as a collection of statistical methods describing the spatial autocorrelation among sample data. In geostatistics, multidimensional random fields are formalized and modeled as stochastic processes (see, e.g., Matérn, 1960; Whittle, 1963). In other words, the variable of interest is modeled as a random process that can take a series of outcome values, according to some probability distribution (Goovaerts, 1997). Kriging is an interpolation technique that estimates the value of the primary variable at unsampled locations (usually on a set G of grid points $\{s_g \mid g = 1, 2, \ldots, G\}$), while minimizing the prediction error. Using data values of Z, an empirical semivariogram $\gamma(h)$ summarizing the variance of values separated by a particular distance lag h is defined:

$$\gamma(h) = \frac{1}{2\,d(h)} \sum_{|s_i - s_j| = h} \left(z(s_i) - z(s_j)\right)^2 \qquad (10.7)$$

where d(h) is the number of pairs of points for a given lag value, and $z(s_i)$ is the measured attribute value at location $s_i$. The semivariogram is characterized by a nugget effect a, and a sill $\sigma^2$ where $\gamma(h)$ levels out. The nugget effect is the spatial dependence at micro scales, caused by measurement errors at distances smaller than the possible sampling distances (Cressie, 1991). Once the lag distance exceeds a certain value r, called the range, there is no spatial dependence between the sample sites. The variogram function $\gamma(h)$ becomes constant at a value called the sill $\sigma^2$. A model $\gamma(h)$ is fitted to the experimental variogram (e.g., an exponential model). With the presence of a nugget effect a:

$$\gamma(h) = a + (\sigma^2 - a)\left(1 - e^{-3h/r}\right). \qquad (10.8)$$

The corresponding covariogram C(h) that summarizes the covariance between any two points is:

$$C(h) = C(0) - \gamma(h) = \sigma^2 - \gamma(h). \qquad (10.9)$$
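As an illustration of Equations (10.7) to (10.9), the sketch below computes an empirical semivariogram and evaluates the exponential model. Two choices are assumptions of this sketch rather than statements from the chapter: distances are grouped into lag classes of width `lag_width` (exact equality |s_i − s_j| = h rarely occurs with scattered data), and γ(0) is set to 0 so that C(0) = σ² as Equation (10.9) requires. All function and parameter names are hypothetical.

```python
import math

def empirical_semivariogram(points, values, lag_width, n_lags):
    """Eq. 10.7, with pairs binned into lag classes of equal width (assumption)."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags                      # d(h): number of pairs per class
    for i in range(len(points)):
        for j in range(i + 1, len(points)):    # each unordered pair counted once
            d = math.dist(points[i], points[j])
            k = int(d // lag_width)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    gamma = [s / (2 * c) if c else None for s, c in zip(sums, counts)]
    return gamma, counts

def exponential_semivariogram(h, nugget, sill, rng):
    """Eq. 10.8 for h > 0; gamma(0) = 0 by convention (assumption)."""
    if h == 0.0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))

def covariogram(h, nugget, sill, rng):
    """Eq. 10.9: C(h) = sigma^2 - gamma(h)."""
    return sill - exponential_semivariogram(h, nugget, sill, rng)

gamma, counts = empirical_semivariogram([(0.0, 0.0), (1.0, 0.0)], [0.0, 2.0],
                                        lag_width=1.0, n_lags=2)
g_at_range = exponential_semivariogram(0.5, nugget=0.1, sill=1.0, rng=0.5)
```

With the factor 3 in the exponent, the model reaches about 95% of the distance between nugget and sill at h = r, which is why r is interpreted as the range.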

The interpolated, kriged value at a location s in D is a weighted mean of surrounding values; each value is weighted according to the covariogram model:

$$\hat{z}(s) = \sum_{i=1}^{I} w_i(s)\, z(s_i) \qquad (10.10)$$

where I is the set of neighboring points that are used to estimate the interpolated value at location s, and $w_i(s)$ is the weight associated with each surrounding point. The optimization of spatial sampling in a geostatistical context first requires the estimation of a model to express the spatial dependence between pairs of points at different distances. This is summarized in the covariogram function. Secondly, such a model is then used for optimal interpolation of the variable under study (Van Groenigen, 1997).

10.3.1. Optimal geometric designs for covariogram estimation

To compute the most representative covariogram and to capture the main features of spatial variability, a good spreading of sample points across the study area is necessary (Van Groenigen et al., 1999). In that context, systematic sampling (Figure 10.1(b)) performs well. However, such a sampling design does not guarantee a wide range of separating distances (which is necessary to estimate the covariogram), because:

1 distances are not evenly distributed; and

2 there are few pairs of points at very small distances to estimate the nugget effect.

A systematic random or systematic unaligned design will generate a greater variety of distance pairs. Another solution consists of designing a sampling arrangement where a subset of the m observations is evenly spread across the study area D and the remaining points are somewhat more clustered (Figure 10.3), to capture the covariance at very small distances.

Figure 10.3 A systematic sampling scheme of m = 36 points in D (a) is improved by the introduction of n = 12 additional samples (•) clustered among the initial samples (b).

Sample size and sample configuration issues

Optimizing the sampling configuration to estimate the parameters of the covariogram is not an easy task. Webster and Oliver (1993) suggested that a total of at least m = 150 samples over the study area is necessary. Moreover, the reliability of the covariogram is partly dependent on the number of pairs of points available within each distance class. In this context, the Warrick/Myers (WM) criterion tries to reproduce an a priori defined ideal distribution of pairs of points for estimating the covariogram. The procedure allows one to account for the variation in distance. Following Van Groenigen (1997), the WM-criterion is defined as:

$$J_{W/M}(S) = a \sum_{i=1}^{K} w_i \left(\xi_i^* - \xi_i\right)^2 + b \sum_{i=1}^{K} \sigma(m_i) \qquad (10.11)$$

$$\sum_{i=1}^{K} \xi_i^* = \frac{m(m - 1)}{2} \qquad (10.12)$$

where i denotes a given lag class of the covariogram, K represents the total number of classes, and the parameters a, b, and $w_i$ are user-defined weights. The term $\xi_i^*$ is a prespecified number of point-pairs for the ith class, $\xi_i$ is the actual number of distances within that class, and $\sigma(m_i)$ is the standard deviation from the median of the distance lag class (Warrick and Myers, 1987). Equation (10.12) expresses the total number of possible distance pairs, given the number of samples. For instance, when m = 4, six pairs of points are generated.
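A rough sketch of Equation (10.11) is given below. The interpretation of σ(m_i) is an assumption: it is computed here as the root-mean-square deviation of the distances in class i from the midpoint of that class, which is one plausible reading of "standard deviation from the median of the distance lag class". The function name `wm_criterion` and the lag class boundaries are hypothetical.

```python
import math
from itertools import combinations

def wm_criterion(points, lag_edges, target_pairs, weights, a=1.0, b=1.0):
    """Warrick/Myers-type criterion J (Eq. 10.11) for a candidate design."""
    K = len(lag_edges) - 1
    classes = [[] for _ in range(K)]
    for p, q in combinations(points, 2):          # all m(m-1)/2 pairs (Eq. 10.12)
        d = math.dist(p, q)
        for i in range(K):
            if lag_edges[i] <= d < lag_edges[i + 1]:
                classes[i].append(d)
                break
    # first term: weighted squared gap between target and actual pair counts
    fit = sum(w * (t - len(c)) ** 2
              for w, t, c in zip(weights, target_pairs, classes))
    # second term: spread of distances around each class midpoint (assumed reading)
    spread = 0.0
    for i, c in enumerate(classes):
        if c:
            mid = 0.5 * (lag_edges[i] + lag_edges[i + 1])
            spread += math.sqrt(sum((d - mid) ** 2 for d in c) / len(c))
    return a * fit + b * spread

# m = 4 points at the corners of a unit square: 4 * 3 / 2 = 6 pairs
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
n_pairs = len(list(combinations(corners, 2)))
J = wm_criterion(corners, lag_edges=[0.0, 1.2, 2.0],
                 target_pairs=[3, 3], weights=[1.0, 1.0])
```

A lower J means the design comes closer to the prespecified distribution of point-pairs; an optimizer would move or swap sample points to reduce it.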

Presence of anisotropy

Anisotropy (as opposed to isotropy) is a property of a natural process, where the autocorrelation among points changes with distance and direction between two locations. In other words, spatial variability is direction-dependent. Spatial variables may exhibit linear continuity, such as in estimating riparian habitat along rivers, aeolian deposits, and soil permeability along prevailing wind directions. We talk about an isotropic process, however, when there is no effect of direction in the spatial autocorrelation of the primary variable. It is generally desirable to increase the sampling frequency in the direction of minimum continuity, since the spatial gradient of variation is maximum in that direction.

Impact of the nugget effect

Bogaert and Russo (1999) made an attempt to understand how the covariogram parameters are influenced by the choice of particular sampling locations. Their objective was to limit the variability of the covariogram estimator. When the covariogram has no nugget effect, the benefits of the optimization procedure are somewhat diminished. In the presence of a nugget effect, a random sampling configuration will score poorly, because of the limited information offered by random sampling for small distances.

Using nested designs

A nested design allows good estimation of the nugget effect at the origin. However, nested sampling configurations produce inaccurate estimation of the covariogram in comparison to random and systematic sampling. This occurs due to the rather limited part of an area covered by the sampling scheme, yielding a high observation density in subregions of the area, and a low observation density for other parts of the area. This in turn generates only a few distances for which covariogram values are available. Nested sampling designs are especially unsuitable when the observations collected according to such a design are used subsequently to estimate values at unvisited locations (Corsten and Stein, 1994).

10.3.2. Optimal designs to minimize the kriging variance

Kriging provides not only a least-squares estimate of the attribute but also an attached error variance (Isaaks and Srivastava, 1989), quantifying the prediction uncertainty at a particular location in space. This uncertainty is minimal, or zero when there is no nugget effect, at existing sampling points and increases with the distance to the nearest samples. A major objective consists of designing a sampling configuration to minimize this uncertainty over the study area. This can be achieved when the covariogram, representing the spatial structure of the variable, is known a priori or has been estimated. In this regard, optimal sampling strategies have been suggested to reduce the prediction error associated with the interpolation process (Pettitt and McBratney, 1993; Van Groenigen et al., 1999). Equation (10.13) formulates the kriging variance at a location s, where $C_M^{-1}$ is the inverse of the covariance matrix $C_M$ based on the covariogram function (Bailey and Gatrell, 1995). M denotes the set of initial samples and has cardinality m. The term c is a column vector and $c^T$ the corresponding row vector, as given in Equation (10.15):

$$\sigma_k^2(s) = \sigma^2 - c^T(s) \cdot C_M^{-1} \cdot c(s) \qquad (10.13)$$

$$C_M = \begin{bmatrix} \sigma^2 & C_{1,2} & \cdots & C_{1,m} \\ C_{2,1} & \sigma^2 & \cdots & C_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ C_{m,1} & C_{m,2} & \cdots & \sigma^2 \end{bmatrix} \qquad (10.14)$$

$$c = \begin{bmatrix} \sigma^2 \\ C_{2,1} \\ \vdots \\ C_{m,1} \end{bmatrix}, \quad c^T = \begin{bmatrix} \sigma^2 & C_{1,2} & \cdots & C_{1,m} \end{bmatrix}. \qquad (10.15)$$

The total kriging variance TKV is obtained by integrating Equation (10.13) over D:

$$TKV = \int_D \sigma_k^2(s)\, ds. \qquad (10.16)$$

Computationally, it is easier to discretize D and sum the kriging variance over all grid points $s_g$. The average kriging variance AKV over the study area is defined as:

$$AKV = \sum_{g \in G} \sigma_k^2(s_g). \qquad (10.17)$$

The only requirement to calculate the kriging variance is to have an initial covariogram and the locations of the m initial sample points. It then depends solely on the spatial dependence and configuration of the observations (Cressie, 1991).
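The following sketch evaluates the kriging variance of Equation (10.13) on a grid and averages it, using only the sample locations and a covariogram. Several points are assumptions of this sketch: c(s) is taken as the vector of covariances between the prediction location s and each sample (the usual simple-kriging form), the covariogram is the exponential model of Equations (10.8) and (10.9) with illustrative parameters, and the grid sum is divided by the number of grid points to obtain an average. The helper names (`exp_cov`, `solve`, `kriging_variance`, `average_kriging_variance`) are hypothetical.

```python
import math

def exp_cov(h, sill=1.0, nugget=0.0, rng=0.3):
    """C(h) = sigma^2 - gamma(h) for the exponential model; C(0) = sigma^2."""
    if h == 0.0:
        return sill
    return (sill - nugget) * math.exp(-3.0 * h / rng)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def kriging_variance(s, samples, cov=exp_cov):
    """sigma_k^2(s) = sigma^2 - c(s)^T C_M^{-1} c(s) (Eq. 10.13)."""
    C = [[cov(math.dist(p, q)) for q in samples] for p in samples]
    c = [cov(math.dist(s, p)) for p in samples]   # assumed form of c(s)
    weights = solve(C, c)                          # C_M^{-1} c(s)
    return cov(0.0) - sum(ci * wi for ci, wi in zip(c, weights))

def average_kriging_variance(samples, cov=exp_cov, n_grid=10, L=1.0):
    """Mean of the kriging variance over the centers of an n_grid x n_grid lattice."""
    step = L / n_grid
    grid = [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(n_grid) for j in range(n_grid)]
    return sum(kriging_variance(g, samples, cov) for g in grid) / len(grid)

spread = [(x, y) for x in (1 / 6, 0.5, 5 / 6) for y in (1 / 6, 0.5, 5 / 6)]
clustered = [(x, y) for x in (0.46, 0.5, 0.54) for y in (0.46, 0.5, 0.54)]
akv_spread = average_kriging_variance(spread)
akv_clustered = average_kriging_variance(clustered)
```

No attribute values enter the calculation, so candidate designs can be compared before any sample is collected. In this toy comparison the nine evenly spread points give a lower average variance than nine points clustered at the center.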

Illustration

Since continuous sampling is not feasible, it is necessary to discretize the area into a set of potential points. Seeking the best sampling procedure becomes a combinatorial problem. Figure 10.4 illustrates the kriging variance associated with a random sampling and a systematic random sampling from an exponential model. Darker areas denote a higher interpolation uncertainty, which increases away from existing points. The estimation error is low at visited points.

Distance-based criteria

It is possible to design sampling configurations considering explicitly the spatial correlation of the variable (Arbia, 1994). What would you do if you were in a dark room with candles? You would probably light the first candle at a random location or in the middle of the room. Then you would find it convenient to light the second candle somewhere further away from the first. How far away will depend on the luminosity of the first candle. The stronger the light, the further the second candle can be located from the first. You would then light the third candle far away from the first two. Such an approach, known as the Depending areal Units Sequential Technique (DUST), is an infill sampling algorithm and is very suitable for locating points so as to minimize the kriging variance over D. Another method, known as the Minimization of the Mean of the Shortest Distances (MMSD), requires all sampling points to be spread evenly over the study area, ensuring that unvisited locations are never far from a sampling point. Both MMSD and DUST methods assume:

1 prior knowledge of the spatial structure of the variable; and

2 a stationary variable (an assumption violated in practice).
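The MMSD objective and the candle analogy can be sketched as follows. The MMSD function evaluates the mean distance from each evaluation point to its nearest sample; the infill function is only a farthest-point heuristic loosely inspired by the candle analogy and should not be read as the DUST algorithm itself, which also accounts for the strength of the spatial correlation. Function names and the evaluation lattice are assumptions of this sketch.

```python
import math

def lattice(n_grid, L=1.0):
    """Centers of an n_grid x n_grid lattice used to discretize D."""
    step = L / n_grid
    return [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(n_grid) for j in range(n_grid)]

def mmsd(samples, n_grid=25, L=1.0):
    """Mean, over the lattice, of the distance to the nearest sample point."""
    pts = lattice(n_grid, L)
    return sum(min(math.dist(g, s) for s in samples) for g in pts) / len(pts)

def farthest_point_infill(samples, n_new, n_grid=10, L=1.0):
    """Greedy infill: repeatedly add the lattice point farthest from all samples."""
    design = list(samples)
    candidates = lattice(n_grid, L)
    for _ in range(n_new):
        best = max(candidates,
                   key=lambda g: min(math.dist(g, s) for s in design))
        design.append(best)
    return design

even = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
corner = [(0.05, 0.05), (0.05, 0.15), (0.15, 0.05), (0.15, 0.15)]
mmsd_even = mmsd(even)
mmsd_corner = mmsd(corner)
grown = farthest_point_infill([(0.5, 0.5)], n_new=3)
```

Starting from a single point in the middle of the square, the infill rule places the next points toward the corners, much as the second and third candles are placed far from the first.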

Both criteria are purely deterministic, resulting in spreading pairs of points evenly across the study area, similar to the systematic configuration. Van Groenigen (1997) notes that these criteria treat the area D as a continuous, infinite plane. In reality, it is not physically possible to sample everywhere; the presence of spatial barriers such as roads, buildings or mountains restricts the sampling process and limits the number and location of potential points.

Figure 10.4 Kriging variance for (a) a random pattern and (b) a systematic random pattern. The systematic random pattern reduces the value of Equation (10.17) by 20% relative to the random pattern. Sample patterns are similar to the ones in Figure 10.1. Axes: x and y, each from 0 to 1.

Impact of the nugget effect

What is the influence of the nugget effect and sampling densities on the final sampling configuration? As the nugget/sill ratio increases, a different sampling configuration is reached, placing more observations near the boundaries of the study area, because of the high variance at short distances. In that case, more samples are needed to obtain the same level of the objective function (Equation (10.17)) over D (Burgess et al., 1981). When the nugget effect is maximum (≈ sill), the covariogram is pure noise, and the resulting optimal sampling scheme is purely random, because no spatial correlation is present. At maximum sampling density, the estimation variance can never be less than the nugget effect. When the variance among pairs of points at very small distances (≈ nugget effect) is very high, a hexagonal design will perform best.

Presence of anisotropy

Which type of sampling design performs better in reducing the maximum kriging variance, when anisotropy is present? When the process is isotropic, a systematic equilateral triangle design will keep the variance to a minimum, because it reduces the farthest distance from initial sample points to points that are not visited. A square grid performs well, especially in the case of isotropy (McBratney and Webster, 1981; McBratney et al., 1981). When anisotropy is present on the other hand, a square grid pattern is preferred to a hexagonal arrangement, although the improvement is marginal (Olea, 1984).

Choice of a covariogram fitting model

Does the choice of a covariogram fitting model affect the value of equation (10.17)? According to Van Groenigen (2000), an exponential model generates a point-symmetric sampling configuration that is identical to that of a linear model. However, the use of a Gaussian model tends to locate sample points very close to the boundary of D. This is explained by the large kriging weights assigned to small distance values (parabolic behavior at the origin).

10.3.3. Sampling reduction

Sampling density reduction of an existing spatial network is a problem related to sampling designs and is relevant in many regions of the world where funding for environmental monitoring is decreasing. The process entails lowering the number of required samples to reach an effective level of accuracy. Technically, it consists of selecting existing samples from the original data set that will, in combination with a spatial interpolation algorithm, produce the best possible estimate of the variable of interest, in comparison with the results obtained if all sample points were used (Olea, 1984). Usually, it is assumed that the residuals come from a stationary process, that the covariogram is linearly decreasing with no nugget effect, and that the process is isotropic. In a study aimed at predicting soil water contents, Ferreyra et al. (2002) developed a similar sampling density reduction method, from 57 observations to 10 observations. With an optimal arrangement of 10 samples, over 70% of the predicted water contents had an error within ±10%, showing that a similar level of confidence is reached with a limited number of samples.

10.4. SECOND-PHASE AND ADAPTIVE SAMPLING

When there is a need or desire to go out in the field to gather more information (i.e., additional samples) about the variable of interest, we talk about adaptive and second-phase sampling, depending on the study objective. In the following subsections, both techniques are discussed.

10.4.1. Adaptive sampling

Adaptive sampling finds its roots in the concept of progressive sampling (Makarovic, 1973). It provides an objective and automatic method for sampling, for example, terrain of varying complexity when sampling altitude variation. As illustrated in Figure 10.5, progressive sampling involves a series of successive runs, beginning with a coarse sampling grid and then proceeding to grids of higher densities. The grid density is doubled on each successive sampling run and the points to be sampled are determined by a computer analysis of the data obtained on the preceding run. The analysis proceeds as follows: a square patch of nine points on the coarsest grid is selected and the height differences between each adjacent pair of points along the rows and columns are computed. The second differences are then calculated. The latter carry information on the terrain curvature. If the estimated curvature exceeds a certain threshold, it becomes necessary on the next run to increase the sampling density and sample points at the next level of grid density.
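The curvature test on a nine-point patch can be sketched as follows. This is a minimal illustration rather than Makarovic's original procedure: the function names, the example heights and the threshold value are assumptions made for the example.

```python
import numpy as np

def second_differences(patch):
    """Second differences along the rows and columns of a 3 x 3 height patch."""
    patch = np.asarray(patch, dtype=float)
    d2_rows = np.diff(patch, n=2, axis=1)  # one value per row, shape (3, 1)
    d2_cols = np.diff(patch, n=2, axis=0)  # one value per column, shape (1, 3)
    return d2_rows, d2_cols

def needs_densification(patch, threshold):
    """True when the largest absolute second difference exceeds the threshold."""
    d2_rows, d2_cols = second_differences(patch)
    curvature = max(np.abs(d2_rows).max(), np.abs(d2_cols).max())
    return bool(curvature > threshold)

planar_slope = [[10, 12, 14], [11, 13, 15], [12, 14, 16]]  # constant slope, no curvature
ridge = [[10, 20, 10], [10, 22, 10], [10, 21, 10]]         # sharp ridge in the middle column

print(needs_densification(planar_slope, threshold=2.0))  # False
print(needs_densification(ridge, threshold=2.0))         # True
```

A patch that tests `True` would be resampled at the next, doubled grid density on the following run; a planar slope, however steep, is left at the coarse density because its second differences vanish.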

Figure 10.5 Initial systematic sampling of altitude is performed over the study area in the top figure. When a strong variation in elevation is encountered, the sampling density is increased until a desirable threshold is met.

A similar study was carried out by Ayeni (1982) to determine the optimum number and spacing of terrain elevation data points to produce a Digital Elevation Model (DEM). The study stresses the importance of evaluating the adequate number of data points, as well as the appropriate sampling distribution of such points, which together constitute a good match to characterize a given terrain. Determining a sufficient number of points is not straightforward, since it depends on terrain roughness in relation to the size of the area occupied by the terrain. The ideas suggested in progressive sampling were later carried over to the field of adaptive sampling (see Thompson and Seber, 1996). A major difference with conventional designs lies in the selection of additional samples in adaptive designs, because the location of a new sample will depend upon the value of the points observed in the field. In other words, the procedure for selecting additional samples depends on the outcome of the variable of interest, as observed during the survey of an initial sampling phase. The addition of a new sample contributes to improve the confidence of the sampling distribution.

Adaptive sampling is very efficient in the context of soil contamination (Cox, 1999). How should a risk manager decide where to re-sample in order to maximize the information on contamination? In this particular context it is generally recommended to sample in locations above a particular threshold and draw a fixed number of additional samples around them until subsequent measurement values are below a pre-specified contamination threshold. Figure 10.6 illustrates the procedure of adaptive cluster sampling, where sample points represent measurement locations of hypothetical contamination rates. On the left, contamination rates have been measured at seven locations.
A geographic location is said to be at risk (and needs remediation) when its value is above 0.7, or 70% of the contamination threshold. Call a property fathomed if samples have been taken from its immediate neighbors. A common choice is to define new neighbors of a contaminated zone to the North, South, East, and West: fathom each property on the list by sampling and remove it from the risk list when it has been fathomed. In other words, the procedure re-samples four neighboring locations of a contaminated site. Once a site shows a contamination rate under the threshold value, it is fathomed. Otherwise, the procedure continues until a trigger condition is satisfied (e.g., a maximum number of additional samples is reached). This approach has some limitations however, because there is little rationale in taking additional samples in areas where we know that the probability of exceeding a particular threshold is maximal.
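The neighbor re-sampling rule described above can be written as a short loop over a risk list. The sketch below is one possible reading, not the procedure of Cox (1999): the regular grid of sites, the hypothetical contamination rates in `field`, the function name and the `max_extra` budget are all assumptions introduced for illustration.

```python
from collections import deque

def adaptive_cluster_sample(field, initial, threshold=0.7, max_extra=50):
    """Sample the N, S, E, W neighbours of every at-risk site until the risk
    list is empty or the budget of additional samples is spent.

    field   : 2D list of contamination rates (stands in for field measurements)
    initial : list of (row, col) sites visited in the first phase
    """
    n_rows, n_cols = len(field), len(field[0])
    sampled = {site: field[site[0]][site[1]] for site in initial}
    risk = deque(s for s in initial if sampled[s] > threshold)  # sites above threshold
    extra = 0
    while risk and extra < max_extra:
        r, c = risk.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, 1), (0, -1)):  # N, S, E, W
            nb = (r + dr, c + dc)
            inside = 0 <= nb[0] < n_rows and 0 <= nb[1] < n_cols
            if not inside or nb in sampled:
                continue
            if extra >= max_extra:
                break
            sampled[nb] = field[nb[0]][nb[1]]
            extra += 1
            if sampled[nb] > threshold:
                risk.append(nb)  # newly found at-risk site, expanded later
    return sampled, extra

field = [
    [0.4, 0.4, 0.5, 0.4],
    [0.4, 0.8, 0.75, 0.4],
    [0.5, 0.72, 0.6, 0.4],
    [0.4, 0.4, 0.4, 0.4],
]
initial = [(0, 0), (1, 1), (3, 3)]
sampled, extra = adaptive_cluster_sample(field, initial)
print(extra, len(sampled))  # 9 12
```

In this toy field the procedure stops by itself once every at-risk site is surrounded by sampled neighbors; the `max_extra` argument plays the role of the trigger condition mentioned in the text.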

10.4.2. Second-phase sampling

In second-phase spatial sampling, a set M of m initial measurements has been collected, and a covariogram C(h) has been calculated. In a second phase, the scientist goes out to the field to augment the set of observations, guided by the covariogram. The objective function aims to collect new samples to reduce the kriging variance or uncertainty by as much as possible. Equation (10.18) formulates the change in kriging variance $\Delta\sigma_k^2$ over all grid points $s_g$, when a set N of size n containing new sample points is added to our initial sample set M. The change $\Delta\sigma_k^2$ is the difference between the kriging variance calculated with initial sample points and the kriging variance of the augmented set M ∪ N containing [m + n] samples:

$$\Delta\sigma_k^2 = \left[\mathrm{TKV}_{old} - \mathrm{TKV}_{new}\right] = \frac{1}{G}\left[\sum_{g \in G} \sigma_{k,old}^2(s_g) - \sum_{g \in G} \sigma_{k,new}^2(s_g)\right] \tag{10.18}$$

$$\sigma_{k,old}^2(s_g) = \sigma^2 - \underbrace{c(s_g)}_{[1,m]} \cdot \underbrace{C^{-1}}_{[m]} \cdot \underbrace{c^T(s_g)}_{[m,1]} \tag{10.19}$$

$$\sigma_{k,new}^2(s_g) = \sigma^2 - \underbrace{c(s_g)}_{[1,m+n]} \cdot \underbrace{C^{-1}}_{[m+n]} \cdot \underbrace{c^T(s_g)}_{[m+n,1]}. \tag{10.20}$$
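Equations (10.18) to (10.20) translate almost directly into matrix code. The following is a minimal sketch under assumptions that are not part of the chapter: an exponential covariogram without nugget, a unit sill, randomly placed samples, and hypothetical function names.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_par=0.3):
    """Exponential covariogram C(h) = sill * exp(-h / range_par), no nugget (assumed model)."""
    return sill * np.exp(-h / range_par)

def kriging_variance(grid, samples, sill=1.0, range_par=0.3):
    """sigma^2 - c C^{-1} c^T at every grid point, as in equations (10.19) and (10.20)."""
    d_ss = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    d_gs = np.linalg.norm(grid[:, None, :] - samples[None, :, :], axis=2)
    C = exp_cov(d_ss, sill, range_par)   # sample-to-sample covariances
    c = exp_cov(d_gs, sill, range_par)   # grid-to-sample covariances, one row per grid point
    weights = np.linalg.solve(C, c.T)    # C^{-1} c^T, one column per grid point
    return sill - np.sum(c * weights.T, axis=1)

def mean_variance_change(grid, old, new):
    """Equation (10.18): average drop in kriging variance when `new` is added to `old`."""
    both = np.vstack([old, new])
    return float(np.mean(kriging_variance(grid, old) - kriging_variance(grid, both)))

rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.linspace(0, 1, 11), np.linspace(0, 1, 11))
grid = np.column_stack([gx.ravel(), gy.ravel()])
M = rng.random((20, 2))   # initial sample set, m = 20
N = rng.random((5, 2))    # second-phase samples, n = 5
print(mean_variance_change(grid, M, N) > 0)  # True
```

The change is never negative because additional observations cannot increase the kriging variance; how large it is depends entirely on where the new points fall relative to the existing ones.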

Figure 10.6 The cluster adaptive sampling procedure, illustrated in the context of toxic waste remediation. A site is fathomed (+) when its toxicity rate does not exceed the contamination value.

The objective function (equation (10.21)) is to find the optimal set S* containing m + n points that will maximize this change in kriging variance (Christakos and Olea, 1992; Van Groenigen et al., 1999), where S is a specific sampling scheme:

$$\max_{\{s_{m+1}, \ldots, s_{m+n}\}} J(S) = \frac{1}{G} \sum_{g \in G} \Delta\sigma_k^2(s_g; S). \tag{10.21}$$

For simplicity, the continuous region D is usually approximated by a finite set P of p points (Cressie, 1991). The set of new points is selected from the set of potential points P. Hence, there is a total of $\binom{p}{n}$ possible sampling combinations and it is too time-consuming to find the optimal set by enumerating them all. Figure 10.7 illustrates the case where 50 sample points have been collected in a first stage, leading to an exponential covariogram, with the sequential addition of n = 10 new points and an improvement in the objective function of nearly 20%.

Figure 10.7 An initial sampling network of m = 50 points (in white) has been augmented with the addition of n = 10 new samples (in blue). The figure to the right displays the improvement. [Panel titles: Kriging variance; Change in kriging variance. Axes: x and y; Additional points and Improvement.]

Weighting the kriging variance

The use of a weighting function w(·) for the kriging variance was originally suggested by Cressie (1991) and has been applied by Van Groenigen et al. (2000), Rogerson et al. (2004), and Delmelle (2005). The importance of a location to be sampled is represented by a weight w(s). The objective is to find the optimal sampling scheme S* containing m + n points that will maximize this change in weighted kriging variance. From equation (10.21):

$$\max_{\{s_{m+1}, \ldots, s_{m+n}\}} J(S) = \frac{1}{G} \sum_{g \in G} w(s_g)\, \Delta\sigma_k^2(s_g; S). \tag{10.22}$$

In an effort to detect contaminated zones in the Rotterdam harbor, Van Groenigen et al. (2000) introduced the Weighted Means of Shortest Distance (WMSD) criterion, offering a flexible way of using prior knowledge on the variable under study. However, the weights do not reflect the spatial structure of the variable, but rather the scientist’s perception of the risks of contamination. In the first sampling phase, sampling weights are assigned to sub-areas based on their risks for contamination. In the second phase however, a greater weight is assigned to locations expected to exhibit a higher priority for remediation. Four weighting factors are considered with weights w = 1, 1.5, 2, and 3, leading to more intensive sampling where the weight is higher. In a more recent study, Rogerson et al. (2004) developed a second-phase sampling technique, allowing re-sampling in areas where there is some uncertainty associated with a variable of interest, and hence not in areas where the probability of an event occurring is near 0 or 1. A greedy algorithm was proposed to locate the points that would maximize the change in weighted kriging variance.
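A greedy search for the weighted objective of equation (10.22) can be sketched in a few lines. This is a generic one-point-at-a-time heuristic, not the implementation of Rogerson et al. (2004): the exponential covariogram, the constants `SILL` and `RANGE`, the weight layout and all function names are assumptions. Since the variance of the initial set is fixed, maximizing the weighted change is the same as minimizing the weighted variance of the augmented set, which is what the code does.

```python
import numpy as np

SILL, RANGE = 1.0, 0.3   # assumed covariogram parameters

def cov(a, b):
    """Exponential covariance between two sets of points (assumed model, no nugget)."""
    h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return SILL * np.exp(-h / RANGE)

def weighted_mean_kv(grid, samples, w):
    """Weighted average of the simple-kriging variance over the grid."""
    c = cov(grid, samples)
    sol = np.linalg.solve(cov(samples, samples), c.T)
    kv = SILL - np.sum(c * sol.T, axis=1)
    return float(np.mean(w * kv))

def greedy_addition(grid, w, initial, candidates, n):
    """Add n candidates one at a time, each time keeping the best single point."""
    chosen, current = [], initial.copy()
    available = list(range(len(candidates)))
    for _ in range(n):
        scores = [weighted_mean_kv(grid, np.vstack([current, candidates[j]]), w)
                  for j in available]
        best = available[int(np.argmin(scores))]
        chosen.append(best)
        available.remove(best)
        current = np.vstack([current, candidates[best]])
    return chosen, current

rng = np.random.default_rng(1)
gx, gy = np.meshgrid(np.linspace(0, 1, 11), np.linspace(0, 1, 11))
grid = np.column_stack([gx.ravel(), gy.ravel()])
w = np.where(grid[:, 0] > 0.5, 2.0, 1.0)   # hypothetical weights: right half counts double
initial = rng.random((10, 2))
candidates = rng.random((30, 2))
chosen, augmented = greedy_addition(grid, w, initial, candidates, n=3)
print(chosen)
```

Each step is optimal only given the points already selected, so the final set may differ from the best set of n points chosen jointly.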

Shortcoming of the use of the kriging variance

Many authors have advocated the use of the kriging variance as a measure of uncertainty. It is unfortunately misused as a measure of reliability of the kriging estimate, as noted by several authors (Deutsch and Journel, 1997; Armstrong, 1994). It is solely a function of the sample pattern, sample density, the number of samples and their covariance structure. The kriging variance assumes that the errors are independent of each other. This means that the process is stationary, an assumption usually violated in practice. Stationarity entails that the variation of the primary variable between two points remains similar at different locations in space, as long as their separating distance remains unchanged. Figure 10.8 illustrates non-stationarity in two dimensions (Armstrong, 1994). The objective in this particular example is to interpolate the value of the inner grid point, highlighted with a question mark. The interpolation depends on the values of the four surrounding points. Two scenarios are presented. The one in (b) shows three very similar values and an extreme one. The one in (a), however, shows four values in a very narrow range. Assuming the spatial structure is similar in both cases and since the configuration of the data points is the same, the kriging variances are identical. However, we have more confidence in the left-hand side scenario (a) since there is less variation among the neighbors. This illustrates that the prediction error is not suitable for setting up confidence intervals and should not be used as an optimization criterion for additional sampling strategies.
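The point that the kriging variance ignores the data values is easy to verify. In the sketch below the two sets of values are those that appear in Figure 10.8, but their assignment to the four corners, the unit-square layout, the exponential covariogram and the function name are assumptions made for illustration.

```python
import numpy as np

def simple_kriging(coords, values, target, mean, sill=1.0, range_par=2.0):
    """Return (estimate, variance) at `target` under an assumed exponential covariogram."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    C = sill * np.exp(-d / range_par)
    c = sill * np.exp(-np.linalg.norm(coords - target, axis=1) / range_par)
    lam = np.linalg.solve(C, c)               # kriging weights
    estimate = mean + lam @ (values - mean)
    variance = sill - lam @ c                 # the data values never enter this line
    return float(estimate), float(variance)

corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
center = np.array([0.5, 0.5])
values_a = np.array([11.0, 12.0, 8.0, 9.0])   # narrow range
values_b = np.array([2.0, 37.0, 1.0, 0.0])    # three similar values and one extreme

est_a, var_a = simple_kriging(corners, values_a, center, values_a.mean())
est_b, var_b = simple_kriging(corners, values_b, center, values_b.mean())
print(np.isclose(var_a, var_b))  # True
```

Both scenarios return the same variance because it is computed from the geometry and the covariogram alone, even though the spread of the neighboring values is very different.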

10.5. CURRENT RESEARCH DIRECTIONS

10.5.1. Incorporating multivariate information

Figure 10.8 Example of two-dimensional non-stationarity. Dark points are used as data values to interpolate the center point (light gray). After Armstrong (1994). [Panel (a): data values 11, 12, 8 and 9. Panel (b): data values 2, 37, 1 and 0. The center point is marked "?".]

Sample data can be very difficult to collect, and very expensive, especially in monitoring air or soil pollution for instance (Haining, 2003). Secondary data can be a valuable asset if they are available all over a study area and combined with the primary variable (Hengl et al., 2003). Secondary spatial data sources include maps, national, socioeconomic, and demographic census data, but also data generated by public sources (local and regional). This is very valuable and there has been a dramatic growth in the availability of secondary data associated with DEMs and satellites (for environmental data). Such secondary data are easily integrated within a GIS framework (Haining, 2003). In multi-phase sampling, for instance, research has been confined to the use of covariates in determining the locations of initial measurements, whereby sample concentration is increased where covariates exhibit substantial spatial variation (Makarovic, 1973). Ideally, secondary variables should be used to reduce the sampling effort in areas where their local contribution in predicting the primary variable is maximum (Delmelle, 2005). If a set of covariates accurately predicts the data value where no initial sample has been collected yet, there is little incentive to perform sampling at that location. On the other hand, when covariates perform poorly in estimating the primary variable, additional samples may be necessary. The general issue pertains to quantifying the spatial contribution given by covariates.

10.5.2. Weighting the kriging variance appropriately

Some current research has looked at ways to weight the kriging variance. Intuitively, one would like to sample at unvisited locations, far away from existing ones. This is accomplished using the kriging variance as a sampling criterion. However, the spatial variability of the primary variable is not accounted for. It is recommended to weight the kriging variance where the gradient of the primary variable is maximum, i.e., where contour lines come close to one another, because there is a rapid change in the variable (Delmelle, 2005). It is also desirable to reduce sampling effort by using information provided by auxiliary variables, when available.
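One simple way to obtain such weights is to take the gradient magnitude of an interpolated surface. The sketch below is only one possible construction and is not the weighting scheme of Delmelle (2005): the function name, the rescaling of the weights to the interval [1, 2] and the synthetic surface are assumptions.

```python
import numpy as np

def gradient_weights(surface, spacing=1.0):
    """Weights that grow with the local gradient magnitude, rescaled to [1, 2] (assumed)."""
    dz_dy, dz_dx = np.gradient(surface, spacing)   # derivatives along rows, then columns
    slope = np.hypot(dz_dx, dz_dy)
    if slope.max() == 0:
        return np.ones_like(slope)                 # flat surface: uniform weights
    return 1.0 + slope / slope.max()

# Synthetic surface with a sharp transition around x = 0.5 (hypothetical data).
x = np.linspace(0, 1, 11)
gx, gy = np.meshgrid(x, x)
surface = np.tanh(10 * (gx - 0.5))

w = gradient_weights(surface, spacing=x[1] - x[0])
print(w[0].round(2))   # weights peak in the middle column, where the change is fastest
```

Weights of this kind can be supplied as w(s) in equation (10.22), so that grid points in zones of rapid change contribute more to the objective function.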

10.5.3. The use of heuristics in sampling optimization

In second-phase sampling, the set N of additional samples will be chosen from a set P of candidate sampling locations. This set is relatively large in practice, and hence the number of possible solutions forbids an exhaustive search for the best answer (Michalewicz and Fogel, 2000). A total enumeration of all the solutions (≈ naive approach) is not possible, because of the combinatorial explosion. Suppose, for instance, that the candidate set P contains p = 900 points and that we are willing to take n = 100 additional points within that set. This generates $\binom{900}{100} \approx 9.384 \times 10^{134}$ solutions, and hence the use of a naive approach is not recommended (Goldberg, 1989; Grötschel and Lovász, 1995). The search for an approximate solution of complex problems is conducted using a suitable heuristic method H. The use of a heuristic is necessary to assist in the identification of an optimal sample set S* (or near optimal S+) ⊂ P. The heuristic controls a process that intends to solve this optimization problem. The set S* is optimal to the objective function J defined in equation (10.22). The efficiency of a heuristic depends on its capacity to give as often as possible a solution S+ close to S* (Grötschel and Lovász, 1995). In second-phase sampling, there are two different ways of supplementing an initial set. Either n points are selected at one time and added to the initial set all together, or one point at a time is added n times to the initial set. The former is defined as simultaneous addition and the latter is known as sequential addition and is suboptimal. Note that a hybrid approach that would combine both techniques is possible as well. In spatial sampling, limited research has been devoted to comparing the benefits and drawbacks of these heuristics. The greedy (or myopic) algorithm has been used by Aspie and Barnes (1990), Christakos and Olea (1992) and Rogerson et al. (2004). Simulated annealing has been applied to spatial sampling problems in Ferri and Piccioni (1992), Van Groenigen and Stein (1998), and Pardo-Igúzquiza (1998).
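The size of the search space quoted above is easy to reproduce, and comparing it with the work done by sequential addition shows why heuristics are attractive. The count for the sequential case assumes a plain greedy search that evaluates the objective once for every remaining candidate at each step; that assumption, and the variable names, are introduced for the example.

```python
import math

p, n = 900, 100   # candidate locations and additional samples, as in the text

# Simultaneous addition by full enumeration: every subset of n points out of p.
n_simultaneous = math.comb(p, n)

# Sequential (greedy) addition: p candidates at the first step, p - 1 at the second, ...
n_sequential = sum(p - i for i in range(n))

print(f"{n_simultaneous:.3e}")  # order of 10**134, as quoted in the text
print(n_sequential)             # 85050
```

The sequential search needs fewer than 10^5 evaluations of the objective function, against roughly 10^135 for full enumeration, at the price of a solution that is not guaranteed to be optimal.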

10.5.4. Spatio-temporal sampling issues

Spatial sampling optimization as discussed in this chapter is based on the assumption of stationarity of the variable itself over time (≈ no temporal variation). Variables such as rainfall, temperature, and snowfall vary over time and it is not possible to take a second set of samples to improve the prediction of these variables without affecting the stability of the model. Work in this context has been carried out by Lajaunie et al. (1999).

NOTE

1 Note that a systematic sampling scheme is a special case of a stratified design in that the strata are all squares of equal size.

REFERENCES

Arbia, G. (1994). Selection techniques in sampling spatial units. Quaderni di Statistica e Matematica Applicata alle Scienze Economico-Sociali, 16: 81–91.

Armstrong, M. (1994). Is research in mining geostats as dead as a dodo? In: Dimitrakopoulos R. (ed.). Geostatistics for the Next Century, pp. 303–312. Dordrecht: Kluwer Academic Publishers.

Aubry, P. (2000). Le Traitement des Variables Régionalisées en Ecologie: Apports de la Géomatique et de la Géostatistique. Thèse de doctorat. Université Claude Bernard – Lyon 1.

Aspie, D. and Barnes, R.J. (1990). Infill-sampling design and the cost of classification errors. Mathematical Geology, 22: 915–932.

Ayeni, O. (1982). Optimum sampling for digital terrain models: A trend towards automation. Photogrammetric Engineering and Remote Sensing, 48: 1687–1694.

Bailey, T.C. and Gatrell, A.C. (1995). Interactive Spatial Data Analysis. Longman. 413p.

Bellhouse, D.R. (1977). Optimal designs for sampling in two dimensions. Biometrika, 64: 605–611.

Berry, B.J.L. and Baker, A.M. (1968). Geographic sampling. In: Berry B.J.L. and Marble D.F. (eds), Spatial Analysis: a Reader in Statistical Geography; pp. 91–100. Prentice-Hall: Englewood Cliffs, N.J.

Bogaert, P. and Russo, D. (1999). Optimal sampling design for the estimation of the variogram based on a least squares approach. Water Resources Research, 35(4): 1275–1289.

Burgess, T.M., Webster, R. and McBratney, A.B. (1981). Optimal interpolation and isarithmic mapping of soil properties: IV. Sampling strategy. Journal of Soil Science, 32: 643–659.

Christakos, G. and Olea, R.A. (1992). Sampling design for spatially distributed hydrogeologic and environmental processes. Advances in Water Resources, 15: 219–237.

Cochran, W.G. (1963). Sampling Techniques. Second Edition. New York: Wiley. 413p.

Corsten, L.C.A. and Stein, A. (1994). Nested sampling for estimating spatial semivariograms compared to other designs. Applied Stochastic Models and Data Analysis, 10: 103–122.

Cox, L.A. (1999). Adaptive spatial sampling of contaminated soil. Risk Analysis, 19: 1059–1069.

Cressie, N. (1991). Statistics for Spatial Data. New York: Wiley. 900p.

Dalenius, T., Hajek, J. and Zubrzycki, S. (1960). On plane sampling and related geometrical problems. Proceedings of the Fourth Berkeley Symposium, 1: 125–150.

Dalton, R., Garlick, J., Minshull, R. and Robinson, A. (1975). Sampling Techniques in Geography. London: George Philip and Son Limited. 95p.

Delmelle, E.M. (2005). Optimization of Second-Phase Spatial Sampling Using Auxiliary Information. Ph.D. Dissertation, Department of Geography, SUNY at Buffalo.

Deutsch, C.V. and Journel, A.G. (1997). GSLIB: Geostatistical Software Library and User’s Guide. 2nd edition, 369p. Oxford University Press.

Ferreyra, R.A., Apezteguía, H.P., Sereno, R. and Jones, J.W. (2002). Reduction of soil water sampling density using scaled semivariograms and simulated annealing. Geoderma, 110: 265–289.

Ferri, M. and Piccioni, M. (1992). Optimal selection of statistical units. Computational Statistics and Data Analysis, 13: 47–61.

Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press. 483p.

Gatrell, A.C. (1979). Autocorrelation in spaces. Environment and Planning A, 11: 507–516.

Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley.

Griffith, D. (1987). Spatial Autocorrelation: A Primer. Washington, DC: AAG.

Griffith, D. and Amrhein, C. (1997). Multivariate Statistical Analysis for Geographers. New Jersey: Prentice Hall. 345p.

Grötschel, M. and Lovász, L. (1995). Combinatorial optimization. In: Graham R.L., Grötschel and Lovász (eds), Handbook of Combinatorics, Vol. 2; pp. 1541–1579. Amsterdam, The Netherlands: Elsevier.

Haining, R.P. (2003). Spatial Data Analysis: Theory and Practice. Cambridge University Press. 452p.

Hedayat, A.S. and Sinha, B.K. (1991). Design and Inference in Finite Population Sampling. New York: Wiley. 377p.

Hengl, T., Rossiter, D.G. and Stein, A. (2003). Soil sampling strategies for spatial prediction by correlation with auxiliary maps. Australian Journal of Soil Research, 41: 1403–1422.

Iachan, R. (1985). Plane sampling. Statistics and Probability Letters, 3: 151–159.

Isaaks, E.H. and Srivastava, R.M. (1989). An Introduction to Applied Geostatistics. New York: Oxford University Press. 561p.

King, L.J. (1969). Statistical Analysis in Geography. 288p.

Lajaunie, C., Wackernagel, H., Thiéry, L. and Grzebyk, M. (1999). Sampling multiphase noise exposure time series. In: Soares A., Gomez-Hernandez J. and R. Froidevaux (eds), GeoENV II – Geostatistics for Environmental Applications; pp. 101–112. Dordrecht: Kluwer Academic Publishers.

Madow, W.G. (1953). On the theory of systematic sampling. III. Comparison of centered and random start systematic sampling. Annals of Mathematical Statistics, 24: 101–106.

Makarovic, B. (1973). Progressive sampling for digital terrain models. ITC Journal, 15: 397–416.

Matérn, B. (1960). Spatial variation. Berlin: Springer-Verlag. 151p.

McBratney, A.B. and Webster, R. (1981). The design of optimal sampling schemes for local estimation and mapping of regionalized variables: II. Program and examples. Computers and Geosciences, 7: 331–334.

McBratney, A.B., Webster, R. and Burgess, T.M. (1981). The design of optimal sampling schemes for local estimation and mapping of regionalized variables: I. Theory and method. Computers and Geosciences, 7: 335–365.

McBratney, A.B. and Webster, R. (1986). Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates. Journal of Soil Science, 37: 617–639.

Michalewicz, Z. and Fogel, D. (2000). How to Solve It: Modern Heuristics. Berlin: Springer. 467p.

Moran, P.A.P. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society, Series B, 10: 245–251.

Moran, P.A.P. (1950). Notes on continuous phenomena. Biometrika, 37: 17–23.

Muller, W. (1998). Collecting Spatial Data: Optimal Design of Experiments for Random Fields. Heidelberg: Physica-Verlag.

Olea, R.A. (1984). Sampling design optimization for spatial functions. Mathematical Geology, 16: 369–392.

Oliver, M.A. and Webster, R. (1986). Combining nested and linear sampling for determining the scale and form of spatial variation of regionalized variables. Geographical Analysis, 18: 227–242.

Overton, W.S. and Stehman, S.V. (1993). Properties of designs for sampling continuous spatial resources from a triangular grid. Communications in Statistics – Theory and Methods, 21: 2641–2660.

Pardo-Igúzquiza, E. (1998). Optimal selection of number and location of rainfall gauges for areal rainfall estimation using geostatistics and simulated annealing. Journal of Hydrology, 210: 206–220.

Pettitt, A.N. and McBratney, A.B. (1993). Sampling designs for estimating spatial variance components. Applied Statistics, 42: 185–209.

Quenouille, M.H. (1949). Problems in plane sampling. Annals of Mathematical Statistics, 20: 355–375.

Rogerson, P.A., Delmelle, E.M., Batta, R., Akella, M.R., Blatt, A. and Wilson, G. (2004). Optimal sampling design for variables with varying spatial importance. Geographical Analysis, 36: 177–194.

Ripley, B.D. (1981). Spatial statistics. New York: Wiley. 252p.

Stehman, S.V. and Overton, S.W. (1996). Spatial sampling. In: Arlinghaus, S. (ed.) Practical Handbook of Spatial Statistics; pp. 31–64. Boca Raton, FL: CRC Press.

Thompson, S.K. and Seber, G.A.F. (1996). Adaptive Sampling. New York: Wiley. 288p.

Van Groenigen, J.W. (1997). Spatial simulated annealing for optimizing sampling – different optimization criteria compared. In: Soares, A., Gómez-Hernández, J. and Froidevaux, R. (eds). GeoENV I – Geostatistics for Environmental Applications. Dordrecht: Kluwer Academic Publishers.

Van Groenigen, J.W. (2000). The influence of variogram parameters on optimal sampling schemes for mapping by kriging. Geoderma, 97: 223–236.

Van Groenigen, J.W. and Stein, A. (1998). Constrained optimization of spatial sampling using continuous simulated annealing. Journal of Environmental Quality, 27: 1078–1086.

Van Groenigen, J.W., Siderius, W. and Stein, A. (1999). Constrained optimisation of soil sampling for minimization of the kriging variance. Geoderma, 87: 239–259.

Van Groenigen, J.W., Pieters, G. and Stein, A. (2000). Optimizing spatial sampling for multivariate contamination in urban areas. Environmetrics, 11: 227–244.

Warrick, A.W. and Myers, D.E. (1987). Optimization of sampling locations for variogram calculations. Water Resources Research, 23: 496–500.

Webster, R. and Oliver, M.A. (1993). How large a sample is needed to estimate the regional variogram adequately. In: Soares, A. (ed.), Geostatistics Tróia ’92; pp. 155–166. Dordrecht: Kluwer Academic Publishers.

Whittle, P. (1963). Stochastic processes in several dimensions. Bulletin of the International Statistical Institute, 40: 974–994.

Zubrzycki, S. (1958). Remarks on random, stratified and systematic sampling in a plane. Colloquium Mathematicum, 6: 251–264.