Geostatistics for Context-Aware Image
Classification
Felipe Codevilla1, Silvia S.C. Botelho1, Nelson Duarte1, Samuel Purkis2,
A.S.M. Shihavuddin3, Rafael Garcia3, and Nuno Gracias3
1Center of Computational Sciences (C3), Federal University of Rio Grande (FURG),
Rio Grande, Brazil
2National Coral Reef Institute, Nova Southeastern University,
Dania Beach, FL 33004, USA
3Computer Vision and Robotics Institute, Centre d'Investigació en Robòtica
Submarina, Universitat de Girona, 17003 Girona, Spain
felipe.codevilla@furg.br
Abstract. Context information is fundamental for image understand-
ing. Many algorithms add context information by including semantic
relations among objects such as neighboring tendencies, relative sizes
and positions. To achieve context inclusion, popular context-aware clas-
sification methods rely on probabilistic graphical models such as Markov
Random Fields (MRF) or Conditional Random Fields (CRF). However,
recent studies showed that MRF/CRF approaches do not perform better
than a simple smoothing on the labeling results.
The need for more context awareness has motivated the use of different methods where the semantic relations between objects are further enforced. With this, we found that in particular application scenarios where some specific assumptions can be made, the use of context relationships is considerably more effective.
We propose a new method, called GeoSim, to compute the labels
of mosaic images with context label agreement. Our method trains a
transition probability model to enforce properties such as class size and
proportions. The method draws inspiration from Geostatistics, usually
used to model spatial uncertainties. We tested the proposed method in two different ocean seabed classification contexts, obtaining state-of-the-art results.
Keywords: Context adding · Underwater vision · Geostatistics · Conditional random fields
1 Introduction
The idea of context information is fundamental for image classification and
object recognition [4,9]. The contextual information is usually associated with
some object relations that are fundamental for object identification. These relations, called Semantic Relations [9], are associated with object distance probability (objects tend to appear close to each other), size (objects usually have a predefined size) and position (objects usually have a predefined position).
© Springer International Publishing Switzerland 2015
L. Nalpantidis et al. (Eds.): ICVS 2015, LNCS 9163, pp. 228–239, 2015.
DOI: 10.1007/978-3-319-20904-3_22
A large number of recently proposed methods aim to increase the classification accuracy by using context [5,6,16] and enforcing Semantic Relations. In this area, the use of probabilistic graphical models such as Conditional Random Fields (CRF) or Markov Random Fields (MRF) is a common approach to include the spatial context.
However, Lucchi et al. [13] illustrated that simpler global features can produce results comparable to those achieved with complex CRF or MRF models. The need to obtain better results than with a simple global feature motivates new methods that enforce the Semantic Relations more directly.
In this paper, we argue that for some important application scenarios, it is easier to enforce Semantic Relations by exploiting the particularities of those scenarios. To demonstrate this, we present examples from underwater seabed mapping. These scenarios have useful properties: the images are typically acquired by a down-looking camera at a constant distance from the scene, and thus have similar spatial statistics over the extent of the input images.
Considering these circumstances, we propose to use and adapt tools from the
Geostatistics field, to model spatial uncertainty. Geostatistics modeling tools are
commonly used in applications such as reservoir simulation in the hydrocarbon
industry [3], and for geologic mapping [14]. We combine the Geostatistics mech-
anisms with the modeling used by the CRF methods in order to successively
maximize context agreement on label assignment.
Our model considers spatial properties such as size, proportions and jux-
taposition tendencies by training a Markov chain transition probability model
between classes. These relations are iteratively added by randomly sampling
patches in different image neighborhoods until convergence is obtained. Finally, we test our method on two different ocean seabed classification images and obtain state-of-the-art results.
2 Problem Formulation
We assume an image to be represented as a vector $X = (x_1, \ldots, x_n)$, where each $x_i$ is a patch from the image. For a given image $X$, the objective is to obtain a vector of labelled patches $Y = (y_1, \ldots, y_n)$, where each $y_i$ can assume a value $k \in \{1, \ldots, K\}$.
One can consider the problem of assigning a label as a probability distribution $P(Y \mid X)$, where each label $y_i$ has a probability of belonging to each possible class from the set of classes. With this, we are interested in finding the maximum a posteriori (MAP) solution, that is, the optimal solution given a model, i.e. the set of labels that has the maximum probability. This is expressed as:
$$Y^{*} = \arg\max_{Y} P(Y \mid X). \tag{1}$$
The main difficulty for the MAP is the high complexity of modelling, i.e. of learning the probability distribution, which also leads to a high computational cost to compute $Y^{*}$ [17].
A common solution to approximate $P(Y \mid X)$ is the CRF model, which uses a graph representation of the image $X$. Let $G(\upsilon, \varepsilon)$ be the representation of the image, with $\upsilon$ the set of vertices and $\varepsilon$ the set of edges. According to the Hammersley-Clifford theorem [10], the probability $P(Y \mid X)$ can be written as a normalized exponential of an energy function $E(X)$. The energy function of the graph $G$ is a function of a unary factor ($\phi^{u}_{i}$) and a local factor ($\phi^{L}_{ij}$). Hence, we define the energy of a certain set of labels $Y = (y_1, \ldots, y_n)$ given a model graph $G$ and a set of parameters $\theta_u$ and $\theta_l$ as:
$$E(X \mid G, \theta_u, \theta_l) = w_u \sum_{y_i \in X} \phi^{u}_{i}(y_i, \theta_u) + w_l \sum_{(y_i, y_j) \in \varepsilon} \phi^{L}_{ij}(y_i, y_j, \theta_l), \tag{2}$$
where the $\theta$ parameters are associated with the spatial likelihood training. The optimal labelling assignment (MAP) under Eq. 2 corresponds to minimizing the energy function, $Y^{*} = \arg\min E(X \mid G, \theta_u, \theta_l)$. However, direct minimization is infeasible due to the extremely large domain of $E(X)$ [5].
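To make the size of the search space concrete, the following Python sketch evaluates an energy with the two-term shape of Eq. 2 and minimizes it by exhaustive enumeration. It is only an illustration under stated assumptions: the local factor is taken to be a Potts penalty, the unary factor is the negative log of a made-up probability table, and the names `crf_energy` and `brute_force_map` are hypothetical. Enumeration visits $K^n$ labellings, so it is feasible only for toy inputs.

```python
import itertools
import numpy as np

def crf_energy(labels, unary_cost, edges, w_u=1.0, w_l=1.0):
    """Energy with the two-term shape of Eq. 2.

    labels:     sequence of n class indices
    unary_cost: (n, K) array; here -log of a unary probability (assumption)
    edges:      list of (i, j) index pairs of neighbouring patches
    """
    unary = sum(unary_cost[i, y] for i, y in enumerate(labels))
    # Potts local factor (assumption): cost 1 when neighbours disagree.
    local = sum(1.0 for i, j in edges if labels[i] != labels[j])
    return w_u * unary + w_l * local

def brute_force_map(unary_cost, edges, w_u=1.0, w_l=1.0):
    """Exhaustive arg min over all K**n labellings (toy sizes only)."""
    n, K = unary_cost.shape
    return min(itertools.product(range(K), repeat=n),
               key=lambda ys: crf_energy(ys, unary_cost, edges, w_u, w_l))

# Toy chain of three patches; the middle one weakly prefers class 1.
probs = np.array([[0.9, 0.1], [0.45, 0.55], [0.9, 0.1]])
cost = -np.log(probs)
chain = [(0, 1), (1, 2)]
unary_only = brute_force_map(cost, chain, w_l=0.0)
with_context = brute_force_map(cost, chain, w_l=1.0)
```

With `w_l = 0` the middle patch keeps its weak unary preference, while with the Potts term enabled the labelling that agrees with both neighbours has the lower energy.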
Algorithmic approximations have been proposed to tackle this problem, such as belief propagation (BP) [18], which computes an approximation of the MAP for a given model graph. The label probability $p_i(y_i)$ of each node (patch) is updated by a message-passing algorithm that takes into consideration both the unary and the local probability distributions:
$$p_i(y_i) = \frac{1}{Z}\, \phi^{u}_{i}(y_i) \prod_{j \in N(i)} m_{ji}(y_i), \tag{3}$$
where $m_{ji}(y_i)$ are the messages from $j$ to $i$, $N(i)$ is the set of all the nodes neighbouring $i$, and each message is computed as (written here for the message from $i$ to $j$):
$$m_{ij}(y_j) \leftarrow \sum_{y_i} \phi^{u}_{i}(y_i)\, \phi^{L}_{ij}(y_i, y_j) \prod_{k \in N(i) \setminus j} m_{ki}(y_i). \tag{4}$$
Belief propagation passes messages until a convergence condition is met.
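The message updates of Eqs. 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the experiments: it assumes a single symmetric local factor shared by all edges, synchronous updates, and a fixed number of iterations instead of a convergence test. The function names and the toy inputs are hypothetical. On a tree-shaped graph the resulting beliefs coincide with the exact marginals, which the helper `exact_marginals` computes by enumeration for comparison.

```python
import itertools
import numpy as np

def belief_propagation(unary, pairwise, edges, n_iters=20):
    """Sum-product message passing in the form of Eqs. 3-4.

    unary:    (n, K) array, unary[i, y] plays the role of phi^u_i(y)
    pairwise: (K, K) symmetric array shared by all edges (assumption)
    edges:    list of undirected (i, j) pairs
    """
    unary = np.asarray(unary, dtype=float)
    n, K = unary.shape
    neighbours = {i: [] for i in range(n)}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    # msgs[(i, j)] is the message from node i to node j, a vector over y_j.
    msgs = {(i, j): np.ones(K) for i in neighbours for j in neighbours[i]}
    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            incoming = unary[i].copy()
            for k in neighbours[i]:
                if k != j:                      # product over N(i) \ j
                    incoming = incoming * msgs[(k, i)]
            m = pairwise.T @ incoming           # sum over y_i, Eq. 4
            new_msgs[(i, j)] = m / m.sum()      # rescale for stability
        msgs = new_msgs
    beliefs = unary.copy()
    for i in range(n):
        for j in neighbours[i]:
            beliefs[i] = beliefs[i] * msgs[(j, i)]   # Eq. 3
    return beliefs / beliefs.sum(axis=1, keepdims=True)

def exact_marginals(unary, pairwise, edges):
    """Brute-force marginals, only for checking tiny examples."""
    unary = np.asarray(unary, dtype=float)
    n, K = unary.shape
    marg = np.zeros((n, K))
    for ys in itertools.product(range(K), repeat=n):
        score = np.prod([unary[i, y] for i, y in enumerate(ys)])
        for i, j in edges:
            score *= pairwise[ys[i], ys[j]]
        for i, y in enumerate(ys):
            marg[i, y] += score
    return marg / marg.sum(axis=1, keepdims=True)

toy_unary = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
toy_pairwise = np.array([[2.0, 1.0], [1.0, 2.0]])   # favours equal labels
toy_edges = [(0, 1), (1, 2)]
bp_beliefs = belief_propagation(toy_unary, toy_pairwise, toy_edges)
```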
3 Proposed Approach
We show that the same maximum a posteriori (MAP) problem of Eq. 1 can be solved by applying a Sequential Indicator Simulation (SIS) adapted from Geostatistical reservoir simulation methods [7]. We call it the GeoSim technique.
To simulate the probability of a certain label $y_0$ in a patch $x_0$, for a given class $k$ from the set $K$, a certain number $N$ of random sample positions $x_\alpha$ are computed around the position $x_0$ within a radius $r$. An iterative function is applied on the image lattice $X$ until convergence is obtained:
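$$P(y_0 = k \mid X)^{(n)} = \left(P(y_0 = k \mid X)^{(n-1)}\right)^{g_p} \left(\frac{1}{Z} \sum_{\alpha=1}^{N} \sum_{j=1}^{K} P(y_\alpha = j \mid X)^{(n-1)}\, w_{jk,\alpha}\right)^{g_l} \tag{5}$$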
where $P(y_0 = k \mid X)^{(n)}$ is the probability of a given patch $x_0$ being labelled as class $k$ at a certain iteration $n$. The equation is divided into two parts: (1) the left factor, $\left(P(y_0 = k \mid X)^{(n-1)}\right)^{g_p}$, is related to the previous probability distribution and is weighted by the constant $g_p$; (2) the right factor (weighted by $g_l$) is related to the sampled positions used to add the spatial context information, with $P(y_\alpha = j \mid X)^{(n-1)}$ being the probability of a sampled position $x_\alpha$. $Z$ is the normalization factor.
The probability for the base case ($n = 0$) is the prior unary probability, obtained by using a trained classifier:
$$P(y_0 = k \mid X)^{(0)} = P_u(y_0 = k \mid X). \tag{6}$$
The function $w_{jk,\alpha}$ is a weighted transition function that is related to the probability of a patch $x_\alpha$ of class $j$ being at a distance $h = d(x_0, x_\alpha)$ from a patch $x_0$ of class $k$, so:
$$w_{jk,\alpha} = t_{jk}(h)\, u(x_0, x_\alpha), \quad j \in K,\ k \in K, \tag{7}$$
where $t_{jk}(h)$ is the transition probability function for a pair of classes over a distance $h$. A heuristic function $u(x_0, x_\alpha)$ is used to weight the transition. We computed $u(x_0, x_\alpha)$ as a weighting function that assigns larger weights to smaller distances $d(x_0, x_\alpha)$. For some Geostatistics applications the weight is related to an estimated variogram [7].
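As an illustration of how Eqs. 5 to 7 fit together, the following Python sketch applies one update to a single patch. Several choices are assumptions made only for the example: the heuristic weight is taken as $u = 1/(1+h)$, $Z$ is taken to normalize the context term over classes, the result is renormalized so that it remains a distribution, and `toy_transition` is a hypothetical two-class transition function standing in for a trained $t_{jk}(h)$.

```python
import numpy as np

def toy_transition(h, mean_len=10.0, props=(0.5, 0.5)):
    """Hypothetical two-class t_jk(h): identity at h = 0, decaying
    towards the class proportions as the distance h grows."""
    stationary = np.tile(np.asarray(props, dtype=float), (2, 1))
    return stationary + np.exp(-h / mean_len) * (np.eye(2) - stationary)

def geosim_update(p0, p_samples, dists, transition, g_p=1.5, g_l=1.0):
    """One application of Eq. 5 to a single patch x0.

    p0:         (K,) current distribution P(y0 = k | X)^(n-1)
    p_samples:  (N, K) current distributions of the N sampled patches
    dists:      (N,) distances h = d(x0, x_alpha)
    transition: callable h -> (K, K) matrix with entries t_jk(h)
    """
    p0 = np.asarray(p0, dtype=float)
    context = np.zeros_like(p0)
    for p_a, h in zip(np.asarray(p_samples, dtype=float), dists):
        u = 1.0 / (1.0 + h)          # assumed heuristic weight u(x0, x_alpha)
        w = transition(h) * u        # w[j, k] = t_jk(h) * u, Eq. 7
        context += p_a @ w           # sum over j of P(y_alpha = j) * w[j, k]
    context /= context.sum()         # 1/Z, assumed to normalize over classes
    new = p0 ** g_p * context ** g_l
    return new / new.sum()           # assumed renormalization

# An undecided patch surrounded by three patches that favour class 0.
updated = geosim_update([0.5, 0.5], [[0.9, 0.1]] * 3, [1.0, 2.0, 3.0],
                        toy_transition)
```

In this toy case the undecided patch is pulled towards the class favoured by its sampled neighbours, which is the intended effect of the context term.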
As shown by [7], the transition matrix $T$ can be modelled as an exponential function of the distance $h_\phi$ along a direction $\phi$:
$$T(h_\phi) = e^{R_\phi h_\phi} \tag{8}$$
We assume that the transition function is the same regardless of the direction $\phi$, which is valid for the cases where the objects to be classified are approximately isomorphic. Figure 1 shows an example of the matrix $T$ containing the transition function $t_{jk}(h)$, for each pair of classes, in a three-class example.
$T$ can be estimated by measuring the matrix $R$ from a set of annotated training data [7]. The rate of transition between a pair of classes $j$ and $k$ is a measure that depends on the average size of the classes and on the frequency of transition between the pair. Each element $r_{jk}$ of $R$ is computed as:
$$r_{jk} = \frac{f_{jk}}{L_j \sum_{k \neq j} f_{jk}}. \tag{9}$$
In Geostatistics, $L_j$ is computed as the mean length of a certain class. By considering the objects as isomorphic shapes, we take the mean length to be a 2D mean radius $L_j$. We approximate the shapes as circles and define each shape by a single radius parameter. With this approximation, we compute the radius of all circles and subsequently compute the mean radius $L$ and variance $V$.
The transition frequency $f_{jk}$ in Eq. 9 is computed as the number of times that class $j$ and class $k$ occur together within a tolerance radius determined by the mean radii $L_j$ and $L_k$ and the variances $V_j$ and $V_k$.
Fig. 1. Example of transition probabilities for a three-class case. Each plot shows the transition probabilities (y axis) as a function of the distance (x axis, in pixels) from each class (A, B, C) to all classes.
The transition matrix $T$ is computed from $R$ using eigenvalue analysis [1], as:
$$T(h) = e^{Rh} = \sum_{k=1}^{K} e^{\lambda_k h} Z_k, \tag{10}$$
where $\lambda_k$ and $Z_k$ denote the eigenvalues and the spectral component matrices of $R$. The spectral component matrices can be computed directly from the eigenvalues, as
$$Z_k = \frac{\prod_{m \neq k} (\lambda_m I - R)}{\prod_{m \neq k} (\lambda_m - \lambda_k)}, \quad k = 1, \ldots, K. \tag{11}$$
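The chain from measured statistics to transition probabilities (Eq. 9, then Eqs. 10 and 11) can be sketched in Python as follows. The frequency table and mean radii are invented numbers, and the function names are hypothetical. Two points are assumptions rather than statements from the text: the diagonal of $R$ is set so that each row sums to zero, as is usual for a rate matrix, and the eigenvalues of $R$ are assumed to be distinct so that the spectral components are well defined.

```python
import numpy as np

def rate_matrix(freq, mean_len):
    """Off-diagonal rates from Eq. 9.

    freq:     (K, K) transition frequencies f_jk (diagonal ignored)
    mean_len: (K,) mean radius L_j of each class
    """
    freq = np.asarray(freq, dtype=float)
    mean_len = np.asarray(mean_len, dtype=float)
    K = freq.shape[0]
    R = np.zeros((K, K))
    for j in range(K):
        total = sum(freq[j, k] for k in range(K) if k != j)
        for k in range(K):
            if k != j:
                R[j, k] = freq[j, k] / (mean_len[j] * total)
        R[j, j] = -R[j].sum()        # assumption: rows of R sum to zero
    return R

def transition_matrix(R, h):
    """T(h) = sum_k exp(lambda_k h) Z_k, Eqs. 10-11.

    Assumes the eigenvalues of R are distinct.
    """
    K = R.shape[0]
    lam = np.linalg.eigvals(R)
    eye = np.eye(K)
    T = np.zeros((K, K), dtype=complex)
    for k in range(K):
        Z = eye.astype(complex)
        for m in range(K):
            if m != k:
                # one factor of the spectral component Z_k (Eq. 11)
                Z = Z @ (lam[m] * eye - R) / (lam[m] - lam[k])
        T += np.exp(lam[k] * h) * Z
    return T.real

# Invented statistics for three classes.
toy_freq = [[0, 3, 1], [2, 0, 2], [1, 4, 0]]
toy_len = [10.0, 20.0, 5.0]
R_toy = rate_matrix(toy_freq, toy_len)
T_toy = transition_matrix(R_toy, 8.0)
```

Each row of `T_toy` is a probability distribution over the class found at distance `h` from a patch of the row's class, so `T_toy[j, k]` plays the role of $t_{jk}(h)$.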
3.1 GeoSim Algorithm
Given a trained transition probability matrix $T$ and an image with the prior unary probabilities already computed, the following algorithm computes Eq. 5.
While error(), evaluated over all $x \in X$, is greater than thresh, increase $n$ and:
– Select a position $x_0$ pseudo-randomly, with preference for positions where $\max(P(y_0 \mid X)^{(n-1)})$ is smaller;
– Sample $N$ positions $x_\alpha$ around $x_0$ within radius $r$;
– Obtain the $w_{jk}$ for each pair of positions $x_0$, $x_\alpha$, following Eq. 7;
– Compute $P(y_0 = k \mid X)^{(n)}$ for each possible class $k$ using Eq. 5.
Finally, the algorithm selects the class $k$ with maximum probability $P(y = k \mid X)$ for each position $x_i$. The error is monotonically reduced through each additional iteration. However, the convergence rate is strongly dependent on the weights $g_p$ and $g_l$. The larger the magnitude of $g_p$, the less spatial information is considered in each iteration and therefore the faster the solution converges.
Figure 2 illustrates the algorithm sequence. The uncertainty, represented by color intensity, is reduced over the iterations until a final classification is reached (Fig. 2c).
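The loop described above can be sketched in Python on a regular grid of patches. This is a hedged illustration, not the authors' implementation. The text does not define error(), so the mean of $1 - \max_k P$ over all patches is used as a stand-in; the selection probability of a patch is taken proportional to that same uncertainty; the heuristic weight is $1/(1+h)$; and `toy_transition` is a hypothetical two-class transition function. An iteration cap is added so that the sketch always terminates.

```python
import numpy as np

def toy_transition(h, mean_len=10.0):
    """Hypothetical two-class t_jk(h) with equal class proportions."""
    stationary = np.full((2, 2), 0.5)
    return stationary + np.exp(-h / mean_len) * (np.eye(2) - stationary)

def geosim(P, transition, n_samples=8, radius=3.0, g_p=1.5, g_l=1.0,
           thresh=0.05, max_iters=5000, seed=0):
    """Iterate Eq. 5 over an (H, W, K) array of unary probabilities."""
    rng = np.random.default_rng(seed)
    P = np.array(P, dtype=float)
    H, W, K = P.shape
    reach = int(radius)
    for _ in range(max_iters):
        uncertainty = 1.0 - P.max(axis=2)
        if uncertainty.mean() <= thresh:        # assumed error() measure
            break
        # Prefer positions whose maximum probability is small.
        weights = uncertainty.ravel() + 1e-12
        idx = rng.choice(H * W, p=weights / weights.sum())
        r0, c0 = divmod(int(idx), W)
        context = np.zeros(K)
        for _sample in range(n_samples):
            dr, dc = (int(v) for v in rng.integers(-reach, reach + 1, size=2))
            r1, c1 = r0 + dr, c0 + dc
            h = float(np.hypot(dr, dc))
            if h == 0.0 or h > radius or not (0 <= r1 < H and 0 <= c1 < W):
                continue                         # outside radius or image
            # w_jk = t_jk(h) * u, with assumed weight u = 1 / (1 + h)
            context += P[r1, c1] @ (transition(h) / (1.0 + h))
        if context.sum() <= 0.0:
            continue                             # no valid samples drawn
        new = P[r0, c0] ** g_p * (context / context.sum()) ** g_l
        P[r0, c0] = new / new.sum()
    return P.argmax(axis=2), P

# Every patch starts out leaning towards class 0.
unary = np.tile([0.7, 0.3], (6, 6, 1))
labels, final_P = geosim(unary, toy_transition)
```

The call returns both the hard labels and the final probabilities, so that a quenching-style perturbation could be applied to the probabilities between runs.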
3.2 Quenching Step
All results depend on the initial probability configuration $P_u(Y \mid X)$ (Eq. 6). If a patch has a considerable tendency to be assigned to a wrong class, this tendency can propagate across the map, culminating in a decrease in accuracy.
A Quenching step is employed to minimize the disagreement of the image with the measured $L$ and proportions, i.e. the prior probability of a class appearing in the training set [8].
In a conceptually similar way to simulated annealing algorithms, we induce perturbations after some iterations by changing the probability distributions of some classes. These perturbations are applied mainly to classes whose distributions do not agree with the previously trained mean radius and proportions.
Figure 2b shows a quenching iteration. In this case, the yellow class had, on the basis of the training set, an average radius of 56 pixels and a proportion of 0.05 % of the scene. In Fig. 2b, the proportions and average size were about 15 times higher. The changes were applied mainly to this class, allowing the GeoSim algorithm to propagate less information about this class.
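One possible form of such a perturbation is sketched below in Python. The text does not specify the perturbation rule, so everything here is an assumption made for illustration: only class proportions are checked (not the mean radius), a class counts as over-represented when its share of the current labels exceeds `tol` times its trained proportion, and its probabilities are then scaled by a fixed `damping` factor before renormalization. The name `quench` and all numbers are hypothetical.

```python
import numpy as np

def quench(P, target_props, tol=2.0, damping=0.5):
    """Damp classes whose current label share exceeds the trained one.

    P:            (n_patches, K) current class probabilities
    target_props: (K,) class proportions measured on the training set
    """
    P = np.array(P, dtype=float)
    K = P.shape[1]
    labels = P.argmax(axis=1)
    observed = np.bincount(labels, minlength=K) / len(labels)
    over = observed > tol * np.asarray(target_props, dtype=float)
    P[:, over] *= damping                      # perturb over-represented classes
    return P / P.sum(axis=1, keepdims=True), over

# Class 1 covers 60 % of the labels but only 10 % of the training scene.
current = np.array([[0.4, 0.6]] * 6 + [[0.7, 0.3]] * 4)
quenched, flagged = quench(current, target_props=[0.9, 0.1])
```

After the perturbation the flagged class carries less probability mass, so subsequent GeoSim iterations propagate less of it.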
3.3 Relation to Previous Works
The proposed method is closely related to the belief propagation method used with CRF. The similarity between the methods stems from the fact that both propagate context information based on prior assumptions and trained local relationships. The two main differences between GeoSim and Belief Propagation are: (1) the GeoSim method uses direct longer-range interactions between nodes, but with looser connection properties, i.e. just $N$ sampled nodes are connected. (2) The GeoSim approach also incorporates the idea that context comes not only from relations between objects but also from object size and scene proportions.
4 Experiments
4.1 Datasets
We tested the algorithms on two different seabed datasets. The datasets consist of mosaics obtained by merging several hundred digital still images.
The Redsea dataset contains images captured in very shallow waters close to the city of Eilat, as part of a survey of coral reef ecology [19]. For the classification, we considered five classes: Urchin, Branching Coral, Brain Coral, Faviid Coral and the Background. We used one mosaic of 3256 × 2939 pixels for validation and the spatial likelihood training for both CRF and GeoSim. Another mosaic of 3256 × 2939 pixels was used for testing. The mosaic was created with a resolution of 1.1 pixels/mm.
The Marker dataset was captured in the Bahamas. We divided this set into General Corals, Gorgonians (sea fans), Sand and the Background. We used one mosaic of 2592 × 3963 pixels for validation and the spatial likelihood training for both CRF and GeoSim. Another mosaic of 2592 × 3963 pixels was used for testing. The mosaic was captured at a resolution of 2.2 pixels/mm.
Fig. 2. Different iterations of the classification step for a five-class example (yellow, green, blue, magenta and black). Each color represents a different class. The less mixed the color, the higher the probability of a patch assuming a certain class, i.e. there is a class $k$ for which $P(y_i = k \mid X)^{(n)}$ is close to 1, while it is close to zero for all the other classes. (a) The initial configuration of the map, where all positions $x_i$ are equal to the unary probability distribution $P_u(y_i = k \mid X)$. (b) The map after application of the quenching step; the yellow class was reduced since it did not conform to the proportions and class size. (c) The final result after convergence (Color figure online).
4.2 Testing Configuration
The test configuration used to generate the unary probabilities is based on the framework by Shihavuddin et al. [15]. The underwater environment tends to produce color non-uniformity and subdues the overall contrast. For the cases where the images are not severely degraded, as in our test data, the application of simple methods for contrast correction and color correction is sufficient to facilitate the use of the texture and color features. The main selected features are texture based: we used a mix of Gabor filters, CLBP and GLCM [15] feature kernels.
Using this set of texture descriptors, we compute $P_u(y_i \mid X)$ based on the confidence function training explained in [2]. This is computed for each patch, where each patch is a superpixel computed with the TurboPixels algorithm [12]. The confidence curve is computed in order to produce a more meaningful unary probability distribution.
Once $P_u(y_i \mid X)$ is available, the algorithm of Sect. 3.1 can be applied to compute the results. For the weights $g_p$ and $g_l$ we used 1.5 and 1, respectively.
We compared our implementation with a CRF augmented with the Potts potential [16].
4.3 Results
Figures 4 and 3 show the results for the Redsea and the Marker datasets, respectively. Each color represents a different class of seabed object being classified, black being the background.
We can see the advantages of the more specific assumptions and richer statistical considerations of the GeoSim method as compared to the CRF approach. This is seen especially for the Marker dataset (Figs. 3a, b, d, e), where the results for GeoSim were about seven percentage points higher than for CRF. As discussed in Lucchi et al. [13], the common probabilistic graphical models normally ensure no more than local smoothness. However, we can perceive for the GeoSim method that context measures such as sizes and proportions helped to improve the results. The green area in Fig. 3a was greatly reduced in the GeoSim results (Fig. 3b). This happened because the green area did not respect the proportions and sizes, allowing an increase in classification accuracy. CRF, on the other hand, only enhanced the local smoothness, with marginal improvements in accuracy (Fig. 3d). However, the Quenching step (GeoSimQ in Fig. 3c) was not beneficial, since there was no large number of patches with labels outside the measured statistics and with high probability for a single class.
(a) 0.5275 Unary (b) 0.6121 GeoSim (c) 0.599 GeoSimQ (d) 0.5401 CRF (e) Ground Truth
Fig. 3. The final results for the Marker seabed dataset. The figures show the data classified by a color label. The dataset contains four different classes: Background (black), Gorgonians (yellow), Sand (green) and General Corals (blue). In this case, we can perceive a significant improvement of GeoSim (in (b)) when compared to the CRF model (in (d)) (Color figure online).
(a) 0.75367 Unary (b) 0.76594 GeoSim (c) 0.78449 GeoSimQ (d) 0.77697 CRF (e) Ground Truth
Fig. 4. The final results for the Redsea seabed dataset. The figures show the data classified by a color label. The dataset contains five different classes: Background (black), Urchin (yellow), Branching Corals (magenta), Brain Corals (green) and Faviid Corals (blue). We can see a better performance of the GeoSim method in (b) (Color figure online).
Table 1. Results of the average accuracy for multiple random samples of 1200 × 1200 pixels.
Algorithm          Unary     GeoSim    GeoSimQ   CRF
Accuracy Redsea    80.49 %   81.2 %    81.64 %   82.21 %
Accuracy Marker    45.17 %   54.53 %   54.49 %   46.11 %
For the Redsea dataset (Figs. 4a, c, d, e) there was less room for improvement with context information, since the unary accuracy is higher. However, the quenching step was able to improve the results. This happened because the perturbations reduced the size of the yellow class (Fig. 4c). This reduction is expected because the Urchin class is usually small (this can be seen in the Ground Truth, Fig. 4e).
In order to simulate more visual variability, we tested our algorithm with multiple patches from the mosaic datasets. We cropped 15 randomly sampled square patches of size 1200 × 1200 pixels and averaged their computed accuracy for each of the methods. Table 1 shows the results. GeoSim still gives better results than CRF on the Marker dataset. For the Redsea dataset, the differences were marginal. Again, the benefits of the quenching were seen mainly for the Redsea dataset.
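The crop-and-average protocol can be sketched in Python as follows. The accuracy measure is not defined in the text, so the fraction of matching pixels between the predicted and ground-truth label maps is assumed here; the function name, the seed handling and the small synthetic arrays are likewise hypothetical.

```python
import numpy as np

def mean_crop_accuracy(pred, truth, crop=1200, n_crops=15, seed=0):
    """Average accuracy over randomly placed square crops.

    pred, truth: (H, W) integer label maps of the same shape
    Accuracy is taken as the fraction of matching pixels (assumption).
    """
    rng = np.random.default_rng(seed)
    H, W = truth.shape
    accs = []
    for _ in range(n_crops):
        r = int(rng.integers(0, H - crop + 1))   # top-left corner of the crop
        c = int(rng.integers(0, W - crop + 1))
        p = pred[r:r + crop, c:c + crop]
        t = truth[r:r + crop, c:c + crop]
        accs.append(float((p == t).mean()))
    return float(np.mean(accs))

# Small synthetic label maps stand in for the mosaics.
toy_truth = np.zeros((10, 10), dtype=int)
toy_truth[:, 5:] = 1
toy_pred = toy_truth.copy()
toy_pred[0, 0] = 1
score = mean_crop_accuracy(toy_pred, toy_truth, crop=4, n_crops=15)
```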
5 Conclusions
In this work we presented a novel method for context-aware image classification called GeoSim. The method is inspired by the techniques of spatial uncertainty modelling. Our method is designed to work in a specific scenario that has low scale variance. Examples of applications that have these properties are oceanic, aerial and satellite mapping of natural scenes.
For these conditions, GeoSim was able to obtain significantly better results than CRF, which was only able to enforce local smoothness. It contributes to the field by being the first to combine techniques from two distinct topics, Geostatistics and Probabilistic Graphical Models, and to illustrate its benefit with respect to the state of the art.
As future work, this method will be compared with more complex forms of context addition such as the fully connected CRF [11] and auto-context [17].
Acknowledgements. The authors would like to thank the Brazilian National Agency of Petroleum, Natural Gas and Biofuels (ANP), the Funding Authority for Studies and Projects (FINEP) and the Ministry of Science and Technology (MCT) for their financial support through the Human Resources Program of ANP to the Petroleum and Gas Sector - PRH-ANP/MCT.
This paper is also a contribution of the Brazilian National Institute of Science and Technology - INCT-Mar COI funded by CNPq Grant Number 610012/2011-8.
Additional support was granted by the Spanish National Project OMNIUS (CTM2013-46718-R), and the Generalitat de Catalunya through the TECNIOspring program (TECSPR14-1-0050) to N. Gracias.
References
1. Agterberg, F.: Mathematical geology. In: General Geology. Encyclopedia of
Earth Science, pp. 573–582. Springer, US (1988). http://dx.doi.org/10.1007/0-387-30844-X_76
2. Aßfalg, J., Kriegel, H.-P., Pryakhin, A., Schubert, M.: Multi-represented classi-
fication based on confidence estimation. In: Zhou, Z.-H., Li, H., Yang, Q. (eds.)
PAKDD 2007. LNCS (LNAI), vol. 4426, pp. 23–34. Springer, Heidelberg (2007)
3. Beattie, C., Mills, B., Mayo, V.: Development drilling of the Tawila field, Yemen,
based on three-dimensional reservoir modeling and simulation. In: SPE Annual
Technical Conference, pp. 715–725 (1998)
4. Biederman, I., Mezzanotte, R.J., Rabinowitz, J.C.: Scene perception: detecting
and judging objects undergoing relational violations. Cogn. Psychol. 14(2), 143–
177 (1982)
5. Boix, X., Gonfaus, J.M., van de Weijer, J., Bagdanov, A.D., Serrat, J., Gonzàlez, J.:
Harmony potentials. Int. J. Comput. Vision 96(1), 83–102 (2012)
6. Carbonetto, P., de Freitas, N., Barnard, K.: A statistical model for general con-
textual object recognition. In: Pajdla, T., Matas, J.G. (eds.) ECCV 2004. LNCS,
vol. 3021, pp. 350–362. Springer, Heidelberg (2004)
7. Carle, S.F., Fogg, G.E.: Transition probability-based indicator geostatistics. Math.
Geol. 28(4), 453–476 (1996)
8. Deutsch, C.V., Journel, A.G., et al.: The application of simulated annealing to
stochastic reservoir modeling. SPE Adv. Technol. Ser. 2(02), 222–227 (1994)
9. Galleguillos, C., Belongie, S.: Context based object categorization: a critical survey.
Comput. Vis. Image Underst. 114(6), 712–722 (2010)
10. Grimmett, G.R.: A theorem about random fields. Bull. Lond. Math. Soc. 5(1),
81–84 (1973)
11. Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with
Gaussian edge potentials. In: Shawe-Taylor, J., Zemel, R.S., Bartlett, P.L.,
Pereira, F., Weinberger, K.Q. (eds.) Advances in Neural Information Processing
Systems, vol. 24, pp. 109–117. Curran Associates, Inc (2011)
12. Levinshtein, A., Stere, A., Kutulakos, K.N., Fleet, D.J., Dickinson, S.J.,
Siddiqi, K.: Turbopixels: fast superpixels using geometric flows. IEEE Trans. Pat-
tern Anal. Mach. Intell. 31(12), 2290–2297 (2009)
13. Lucchi, A., Li, Y., Boix, X., Smith, K., Fua, P.: Are spatial and global constraints
really necessary for segmentation? In: IEEE International Conference on Computer
Vision (ICCV), pp. 9–16. IEEE (2011)
14. Purkis, S., Vlaswinkel, B., Gracias, N.: Vertical-to-lateral transitions among Cretaceous
carbonate facies: a means to 3-D framework construction via Markov analysis.
J. Sediment. Res. 82(4), 232–243 (2012)
15. Shihavuddin, A., Gracias, N., Garcia, R., Gleason, A.C.R., Gintert, B.: Image-
based coral reef classification and thematic mapping. Remote Sens. 5(4), 1809–1841
(2013). http://www.mdpi.com/2072-4292/5/4/1809
16. Shotton, J., Winn, J., Rother, C., Criminisi, A.: Textonboost for image understand-
ing: multi-class object recognition and segmentation by jointly modeling texture,
layout, and context. Int. J. Comput. Vision 81(1), 2–23 (2009)
17. Tu, Z.: Auto-context and its application to high-level vision tasks. In: IEEE Con-
ference on Computer Vision and Pattern Recognition. CVPR 2008, pp. 1–8. IEEE
(2008)
18. Yedidia, J.S., Freeman, W.T., Weiss, Y., et al.: Generalized belief propagation.
In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information
Processing Systems, vol. 13, pp. 689–695. MIT Press (2001)
19. Zvuloni, A., Artzy-Randrup, Y., Stone, L., Kramarsky-Winter, E., Barkan, R.,
Loya, Y.: Spatio-temporal transmission patterns of black-band disease in a coral
community. PLoS One 4(4), e4993 (2009)