Geostatistics for Context-Aware Image
Classification
Felipe Codevilla1, Silvia S.C. Botelho1, Nelson Duarte1, Samuel Purkis2,
A.S.M. Shihavuddin3, Rafael Garcia3, and Nuno Gracias3
1Center of Computational Sciences (C3), Federal University of Rio Grande (FURG),
Rio Grande, Brazil
2National Coral Reef Institute, Nova Southeastern University,
Dania Beach, FL 33004, USA
3Computer Vision and Robotics Institute, Centre d'Investigació en Robòtica
Submarina, Universitat de Girona, 17003 Girona, Spain
felipe.codevilla@furg.br
Abstract. Context information is fundamental for image understanding. Many
algorithms add context information by including semantic relations among
objects, such as neighboring tendencies, relative sizes and positions. To
achieve context inclusion, popular context-aware classification methods rely
on probabilistic graphical models such as Markov Random Fields (MRF) or
Conditional Random Fields (CRF). However, recent studies showed that MRF/CRF
approaches do not perform better than a simple smoothing of the labeling
results.

The need for more context awareness has motivated the use of different
methods where the semantic relations between objects are further enforced.
With this, we found that in particular application scenarios, where some
specific assumptions can be made, the use of context relationships is far
more effective.

We propose a new method, called GeoSim, to compute the labels of mosaic
images with context label agreement. Our method trains a transition
probability model to enforce properties such as class size and proportions.
The method draws inspiration from Geostatistics, which is usually used to
model spatial uncertainties. We tested the proposed method in two different
ocean seabed classification contexts, obtaining state-of-the-art results.
Keywords: Context adding · Underwater vision · Geostatistics · Conditional
random fields
1 Introduction
The idea of context information is fundamental for image classification and
object recognition [4,9]. The contextual information is usually associated with
some object relations that are fundamental for object identification. These, called
Semantic Relations [9], are associated with object distance probability
(objects tend to appear close to each other), size (objects usually have a
predefined size) and position (objects usually have a predefined position).

© Springer International Publishing Switzerland 2015. L. Nalpantidis et al.
(Eds.): ICVS 2015, LNCS 9163, pp. 228–239, 2015. DOI: 10.1007/978-3-319-20904-3_22
A large number of recently proposed methods aim to increase classification
accuracy by using context [5,6,16] and enforcing Semantic Relations. In this
area, the use of probabilistic graphical models, such as Conditional Random
Fields (CRF) or Markov Random Fields (MRF), is a common approach to include
spatial context.

However, Lucchi et al. [13] showed that simpler global features can produce
results comparable to those achieved with complex CRF or MRF models. The need
to obtain better results than with a simple global feature motivates new
methods that enforce the Semantic Relations more directly.
In this paper, we argue that for some important application scenarios, it is
easier to enforce Semantic Relations by exploiting the particularities of
those scenarios. To demonstrate this, we present examples from underwater
seabed mapping. These scenarios have useful properties, since the images are
typically acquired by a down-looking camera at a constant distance to the
scene, and thus have similar spatial statistics over the extent of the input
images.
Considering these circumstances, we propose to use and adapt tools from the
Geostatistics field to model spatial uncertainty. Geostatistics modeling
tools are commonly used in applications such as reservoir simulation in the
hydrocarbon industry [3] and geologic mapping [14]. We combine the
Geostatistics mechanisms with the modeling used by CRF methods in order to
successively maximize context agreement in label assignment.
Our model considers spatial properties such as size, proportions and
juxtaposition tendencies by training a Markov chain transition probability
model between classes. These relations are added iteratively by randomly
sampling patches in different image neighborhoods until convergence is
obtained. Finally, we test our method on two different ocean seabed
classification images and obtain state-of-the-art results.
2 Problem Formulation
We assume an image to be represented as a vector $X = (x_1, \ldots, x_n)$,
where each $x_i$ is a patch from the image. For a given image $X$, the
objective is to obtain a vector of labelled patches $Y = (y_1, \ldots, y_n)$,
where each $y_i$ can assume a value $k \in \{1, \ldots, K\}$.

One can consider the problem of assigning a label as a probability
distribution $P(Y|X)$, where each label $y_i$ has a probability of belonging
to each possible class from the set of classes. With this, we are interested
in finding the maximum a-posteriori (MAP) solution, i.e. the set of labels
with the maximum probability given the model:

$$Y^{*} = \arg\max_{Y} P(Y|X). \qquad (1)$$

The main difficulty of the MAP formulation is the high complexity of
modelling, i.e. of learning the probability distribution, which also leads to
a high computational cost when computing $Y^{*}$ [17].
A common solution to approximate $P(Y|X)$ is the CRF model, which uses a
graph representation of the image $X$. Let $G(\upsilon, \varepsilon)$ be the
representation of the image, with $\upsilon$ the set of vertices and
$\varepsilon$ the set of edges. According to the Hammersley-Clifford theorem
[10], the probability $P(Y|X)$ can be written as a normalized exponential of
an energy function $E(X)$. The energy function of the graph $G$ is a function
of a unary factor ($\varphi^{u}_{i}$) and a local factor
($\varphi^{L}_{ij}$). Hence, we define the energy of a certain set of labels
$Y = (y_1, \ldots, y_n)$, given a model graph $G$ and sets of parameters
$\theta^{u}$ and $\theta^{l}$, as:

$$E(X|G, \theta^{u}, \theta^{l}) = w^{u} \sum_{y_i \in X}
\varphi^{u}_{i}(y_i; \theta^{u}) + w^{l} \sum_{(y_i, y_j) \in \varepsilon}
\varphi^{L}_{ij}(y_i, y_j; \theta^{l}), \qquad (2)$$

where the $\theta$ parameters are associated with the spatial likelihood
training. The optimal labelling assignment (MAP) of Eq. 2 corresponds to
minimizing the energy function, $Y^{*} = \arg\min_{Y} E(X|G, \theta^{u},
\theta^{l})$. However, direct minimization is unfeasible due to the extremely
large domain of $E(X)$ [5].
Algorithmic approximations have been proposed to tackle this problem, such as
belief propagation (BP) [18], which computes an approximation of the MAP for
a given model graph. Each node's patch label probability $p_i(y_i)$ is
updated by a message passing algorithm that takes into consideration both the
unary and local probability distributions:

$$p_i(y_i) = \frac{1}{Z}\, \varphi^{u}_{i}(y_i) \prod_{j \in N(i)}
m_{ji}(y_i), \qquad (3)$$

where $m_{ji}(y_i)$ are the messages from $j$ to $i$, $N(i)$ is the set of
all nodes neighbouring $i$, and each message is computed as:

$$m_{ij}(y_j) \propto \sum_{y_i} \varphi^{u}_{i}(y_i)\,
\varphi^{L}_{ij}(y_i, y_j) \prod_{k \in N(i) \setminus j} m_{ki}(y_i).
\qquad (4)$$

Belief propagation passes messages until a convergence condition is obtained.
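The update rules of Eqs. 3 and 4 can be sketched in a few lines of numpy.
This is an illustrative sketch on a small graph with a single pairwise
potential shared by all edges, not the configuration used in the experiments:

```python
import numpy as np

def bp_update(unary, pairwise, edges, n_iters=10):
    """Loopy sum-product belief propagation on an undirected graph.

    unary:    (n_nodes, K) array of unary potentials phi_u
    pairwise: (K, K) array of local potentials phi_L (shared by all edges)
    edges:    list of undirected edges (i, j)
    Returns per-node beliefs p_i(y_i) as in Eq. (3).
    """
    n, K = unary.shape
    # directed messages m[(j, i)] from node j to node i, initialised uniform
    msgs = {}
    nbrs = {v: [] for v in range(n)}
    for i, j in edges:
        msgs[(i, j)] = np.ones(K) / K
        msgs[(j, i)] = np.ones(K) / K
        nbrs[i].append(j)
        nbrs[j].append(i)
    for _ in range(n_iters):
        new = {}
        for (i, j) in msgs:
            # product of unary and incoming messages except the one from j (Eq. 4)
            prod = unary[i].copy()
            for k in nbrs[i]:
                if k != j:
                    prod *= msgs[(k, i)]
            m = pairwise.T @ prod        # marginalise over y_i
            new[(i, j)] = m / m.sum()    # normalise for numerical stability
        msgs = new
    beliefs = unary.copy()
    for i in range(n):
        for k in nbrs[i]:
            beliefs[i] *= msgs[(k, i)]
    beliefs /= beliefs.sum(axis=1, keepdims=True)  # the 1/Z of Eq. (3)
    return beliefs
```

On a two-node chain with an attractive (Potts-like) pairwise potential, the
confident node pulls its uncertain neighbour toward the same label, which is
exactly the local-smoothing behaviour discussed above.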
3 Proposed Approach
We show that the same maximum a-posteriori problem of Eq. 1 (MAP) can be
solved by applying a Sequential Indicator Simulation (SIS) adapted from
Geostatistical reservoir simulation methods [7]. We call this the GeoSim
technique.

To simulate the probability of a certain label $y_0$ in a patch $x_0$, for a
given class $k$ from the set $K$, a number $N$ of random sample positions
$x_\alpha$ are computed around the position $x_0$ within a radius $r$. An
iterative function is applied on the image lattice $X$ until convergence is
obtained:

$$P(y_0 = k|X)^{(n)} = \left( P(y_0 = k|X)^{(n-1)} \right)^{g_p}
\left( \frac{1}{Z} \sum_{\alpha=1}^{N} \sum_{j=1}^{K}
P(y_\alpha = j|X)^{(n-1)}\, w_{jk,\alpha} \right)^{g_l}, \qquad (5)$$
where $P(y_0 = k|X)^{(n)}$ is the probability of a given patch $x_0$ being
labelled as class $k$ at a certain iteration $n$. The equation is divided
into two parts: (1) the left factor, $(P(y_0 = k|X)^{(n-1)})^{g_p}$, is
related to the previous probability distribution and is weighted by the
constant $g_p$; (2) the right factor, weighted by $g_l$, is related to the
sampled positions used to add the spatial context information, with
$P(y_\alpha = j|X)^{(n-1)}$ the probability of a sampled position $x_\alpha$.
$Z$ is the normalization factor.

The probability for the basis ($n = 0$) is the prior unary probability,
obtained by using a trained classifier:

$$P(y_0 = k|X)^{(0)} = P_u(y_0 = k|X). \qquad (6)$$
The function $w_{jk,\alpha}$ is a weighted transition function related to the
probability of a patch $x_\alpha$ of class $j$ being at a distance
$h = d(x_0, x_\alpha)$ from a patch $x_0$ of class $k$:

$$w_{jk,\alpha} = t_{jk}(h)\, u(x_0, x_\alpha), \quad j \in K,\ k \in K,
\qquad (7)$$

where $t_{jk}(h)$ is the transition probability function for a pair of
classes over a distance $h$. A heuristic function $u(x_0, x_\alpha)$ is used
to weight the transition. We computed $u(x_0, x_\alpha)$ as a weighting
function that assigns larger weights to smaller distances $d(x_0, x_\alpha)$.
In some Geostatistics applications the weight is instead related to an
estimated variogram [7].
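The text does not fix a particular form for $u$; one minimal choice
consistent with the description (larger weight at smaller distance) is an
inverse-distance kernel. The exact kernel below is an assumption for
illustration only:

```python
def inverse_distance_weight(h, eps=1.0):
    """Heuristic weight u(x0, x_alpha) that decays with distance h.

    Any monotonically decreasing function of h would satisfy the
    description in the text; this particular form is an assumption.
    eps > 0 keeps the weight finite when a sample coincides with x0.
    """
    return 1.0 / (h + eps)
```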
As shown by [7], the transition matrix $T$ can be modelled as an exponential
function of the distance $h_\phi$ along a direction $\phi$:

$$T(h_\phi) = e^{R_\phi h_\phi}. \qquad (8)$$

We assume that the transition function is the same regardless of the
direction $\phi$, which is valid for the cases where the objects to be
classified are approximately isotropic. Figure 1 shows an example of the
matrix $T$ containing the transition function $t_{jk}(h)$ for each pair of
classes, in a three-class example.
$T$ can be estimated by measuring the matrix $R$ from a set of annotated
training data [7]. The rate of transition between a pair of classes $j$ and
$k$ is a measure proportional to the average size of the classes and to the
frequency of transition between the pair. Each element $r_{jk}$ of $R$ is
computed as:

$$r_{jk} = \frac{f_{jk}}{\bar{L}_j \sum_{k \neq j} f_{jk}}. \qquad (9)$$

In Geostatistics, $\bar{L}_j$ is computed as the mean length of a certain
class. By considering the objects as isotropic shapes, we take the mean
length to be a 2D mean radius $\bar{L}_j$. We approximate the shapes as
circles, each defined by a single radius parameter. With this approximation,
we compute the radius of all circles and subsequently compute the mean radius
$\bar{L}$ and variance $V$.

The transition frequency $f_{jk}$ in Eq. 9 is computed as the number of times
class $j$ and class $k$ co-occur within a tolerance radius determined by the
mean radii $\bar{L}_j$ and $\bar{L}_k$ and the variances $V_j$ and $V_k$.
Fig. 1. Example of transition probabilities for a three-class case. Each plot
shows the transition probabilities (y axis) as a function of the distance
(x axis, in pixels) from each class (A, B, C) to all classes.
The transition matrix $T$ is computed from $R$ using eigenvalue analysis [1]:

$$T(h) = e^{Rh} = \sum_{k=1}^{K} e^{\lambda_k h} Z_k, \qquad (10)$$

where $\lambda_k$ and $Z_k$ denote the eigenvalues and the spectral component
matrices of $R$. The spectral component matrices can be computed directly
from the eigenvalues as:

$$Z_k = \frac{\prod_{m \neq k} (\lambda_m I - R)}
{\prod_{m \neq k} (\lambda_m - \lambda_k)}, \quad k = 1, \ldots, K.
\qquad (11)$$
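Equations 8-11 can be checked numerically. The sketch below builds $R$ from
the off-diagonal rule of Eq. 9 and evaluates $T(h) = e^{Rh}$ through an
eigendecomposition, which is equivalent to the spectral form of Eqs. 10-11.
Setting the diagonal so that each row of $R$ sums to zero is our assumption,
needed to make $R$ a valid continuous-lag rate matrix; the paper states only
the off-diagonal rule:

```python
import numpy as np

def rate_matrix(freqs, mean_radii):
    """Build the transition-rate matrix R of Eq. (9).

    freqs:      (K, K) counts f_jk of class j / class k co-occurrences
    mean_radii: (K,) mean radii L_j of each class
    Off-diagonal r_jk = f_jk / (L_j * sum_{k != j} f_jk); the diagonal is
    set so each row sums to zero (our assumption, making R a rate matrix).
    """
    K = len(mean_radii)
    R = np.zeros((K, K))
    for j in range(K):
        off = freqs[j].sum() - freqs[j, j]
        if off == 0:
            continue  # isolated class: no transitions observed
        for k in range(K):
            if k != j:
                R[j, k] = freqs[j, k] / (mean_radii[j] * off)
        R[j, j] = -R[j].sum()
    return R

def transition_matrix(R, h):
    """T(h) = exp(R h), computed via the eigendecomposition of R
    (equivalent to the spectral-component form of Eqs. 10-11)."""
    lam, V = np.linalg.eig(R)
    return (V @ np.diag(np.exp(lam * h)) @ np.linalg.inv(V)).real
```

With this convention $T(0) = I$ and every row of $T(h)$ stays a probability
distribution, matching the behaviour of the curves in Fig. 1.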
3.1 GeoSim Algorithm
Given a trained transition probability matrix $T$ and an image with the prior
unary probabilities already computed, the following algorithm computes Eq. 5.

While $\mathrm{error}(x) > \mathit{thresh}$ for any $x \in X$, increment $n$
and:

1. Select a position $x_0$ pseudo-randomly, with preference for positions
   where $\max(P(y_0|X)^{(n-1)})$ is smaller;
2. Sample $N$ positions $x_\alpha$ around $x_0$ within radius $r$;
3. Obtain $w_{jk,\alpha}$ for each pair of positions $x_0, x_\alpha$,
   following Eq. 7;
4. Compute $P(y_0 = k|X)^{(n)}$ for each possible class $k$ using Eq. 5.

Finally, the algorithm selects the class $k$ with the maximum probability of
$P(y = k|X)$ for each position $x_i$. The error is monotonically reduced with
each additional iteration. However, the convergence rate is greatly dependent
on the weights $g_p$ and $g_l$: the larger the magnitude of $g_p$, the less
spatial information is considered by each iteration, and therefore the faster
the solution converges.
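A single pass of the loop above can be sketched as follows. This is an
illustrative reading of the algorithm, not the authors' implementation; in
particular, the inverse-distance form of $u$ and the deterministic choice of
the most uncertain patch are our assumptions:

```python
import numpy as np

def geosim_step(P, coords, T_of_h, r=50.0, N=8, gp=1.5, gl=1.0, rng=None):
    """One GeoSim update (Eqs. 5 and 7) on a set of patches.

    P:       (n, K) current per-patch class probabilities (row-stochastic)
    coords:  (n, 2) patch centre coordinates
    T_of_h:  callable h -> (K, K) transition matrix T(h)
    Updates the row of the selected patch in place and returns P.
    """
    rng = rng or np.random.default_rng(0)
    # prefer patches whose current max-probability is small (most uncertain)
    i0 = int(np.argmin(P.max(axis=1)))
    d = np.linalg.norm(coords - coords[i0], axis=1)
    cand = np.flatnonzero((d > 0) & (d <= r))
    if cand.size == 0:
        return P
    samples = rng.choice(cand, size=min(N, cand.size), replace=False)
    K = P.shape[1]
    ctx = np.zeros(K)
    for a in samples:
        h = d[a]
        u = 1.0 / (h + 1.0)           # heuristic distance weight u(x0, xa)
        w = T_of_h(h) * u             # w_jk of Eq. (7)
        ctx += P[a] @ w               # sum_j P(ya = j) * w_jk, for every k
    new = (P[i0] ** gp) * (ctx ** gl) # Eq. (5), before normalisation
    P[i0] = new / new.sum()           # the 1/Z factor
    return P
```

With an attractive short-range transition matrix, confident neighbours pull
an uncertain patch toward their own class, which is the behaviour visualised
in Fig. 2.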
Figure 2 illustrates the algorithm's sequence. The uncertainty, represented
by color intensity, is reduced over the iterations, reaching a final
classification (Fig. 2c).
3.2 Quenching Step
The entire result depends on the initial configuration of the probability
$P_u(Y|X)$ (Eq. 6). If there are considerable tendencies for a patch to be
assigned to a wrong class, this tendency can propagate across the map,
culminating in a decrease of accuracy.

A Quenching step is employed to minimize the disagreement of the image with
the measured $\bar{L}$ and proportions, i.e. the prior probability of a class
appearing in the training set [8].
In a way conceptually similar to simulated annealing algorithms, we induce
perturbations after some iterations by changing the probability distributions
of some classes. These perturbations are applied mainly to classes whose
distributions do not correlate with the previously trained mean radius and
proportions.

Figure 2b shows a quenching iteration. In this case, the yellow class had, on
the basis of the training set, an average radius of 56 pixels and a
proportion of 0.05 % of the scene. In Fig. 2b, the proportion and average
size were about 15 times higher. The changes were mainly applied to this
class, causing the GeoSim algorithm to propagate less information from it.
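One simple way to realize such a perturbation is to damp the probabilities of
over-represented classes toward the trained proportions. The multiplicative
damping rule below is our assumption, sketched only to make the idea
concrete; the paper follows the simulated-annealing formulation of [8]:

```python
import numpy as np

def quench(P, trained_prop, strength=0.5):
    """Quenching perturbation (a hedged sketch of Sect. 3.2, not the
    authors' code).

    Damps the probability of classes whose current map proportion exceeds
    the proportion measured on the training set, so GeoSim propagates less
    information from them.
    P:            (n, K) per-patch class probabilities
    trained_prop: (K,) class proportions measured on the training data
    """
    labels = P.argmax(axis=1)
    K = P.shape[1]
    current = np.bincount(labels, minlength=K) / len(labels)
    # damping factor < 1 only for over-represented classes
    factor = np.where(current > trained_prop,
                      (trained_prop / np.maximum(current, 1e-12)) ** strength,
                      1.0)
    P = P * factor
    return P / P.sum(axis=1, keepdims=True)  # re-normalise each patch
```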
3.3 Relation to Previous Works
The proposed method is closely related to the belief propagation method used
with CRFs. The similarity stems from the fact that both propagate context
information based on prior assumptions and trained local relationships. The
two main differences between GeoSim and belief propagation are: (1) GeoSim
uses direct longer-range interactions between nodes, but with looser
connection properties, i.e. only $N$ sampled nodes are connected; (2) GeoSim
also incorporates the idea that context comes not only from relations between
objects but also from object size and scene proportions.
4 Experiments
4.1 Datasets
We tested the algorithms on two different seabed datasets. The datasets
consist of mosaics obtained by merging several hundred digital still images.

The Redsea dataset contains images captured in very shallow waters close to
the city of Eilat, as part of a survey of coral reef ecology [19]. For the
classification, we considered five classes: Urchin, Branching Coral, Brain
Coral, Faviid Coral and the Background. We used one mosaic of 3256 × 2939
pixels for validation and for the spatial likelihood training of both CRF and
GeoSim. Another mosaic of 3256 × 2939 pixels was used for testing. The mosaic
was created at a resolution of 1.1 pixel/mm.
The Marker dataset was captured in the Bahamas. We divided this set into
General Corals, Gorgonians (sea fans), Sand and the Background. We used one
mosaic of 2592 × 3963 pixels for validation and for the spatial likelihood
training of both CRF and GeoSim. Another mosaic of 2592 × 3963 pixels was
used for testing. The mosaic was captured at a resolution of 2.2 pixels/mm.
Fig. 2. Different iterations of the classification step for a five-class
example (yellow, green, blue, magenta and black). Each color represents a
different class. The less mixed the color, the higher the probability of a
patch assuming a certain class, i.e. there is a class $k$ for which
$P(y_i = k|X)^{(n)}$ is close to 1 and close to zero for all other classes.
(a) The initial configuration of the map, where all positions $x_i$ are equal
to the unary probability distribution $P_u(y_i = k|X)$. (b) The map after
application of the quenching step; the yellow class was reduced since it did
not conform to the trained proportions and class size. (c) The final result
after convergence (Color figure online).
4.2 Testing Configuration
The test configuration used to generate the unary probabilities is based on
the framework by Shihavuddin et al. [15]. The underwater environment tends to
produce color non-uniformity and subdues the overall contrast. For cases
where the images are not severely degraded, as in our test data, the
application of simple methods for contrast and color correction is sufficient
to facilitate the use of texture and color features. The selected features
are based on a mix of texture descriptors: a combination of Gabor filter,
CLBP and GLCM [15] feature kernels.

Using this set of texture descriptors, we compute $P_u(y_i|X)$ based on the
confidence function training explained in [2]. This is computed for each
patch, where each patch is a superpixel computed with the TurboPixels
algorithm [12]. The confidence curve is computed in order to produce a more
meaningful unary probability distribution.

After obtaining $P_u(y_i|X)$, the algorithm of Sect. 3.1 can be applied to
compute the results. For the weights $g_p$ and $g_l$ we used 1.5 and 1,
respectively. We compared our implementation with a CRF augmented with the
Potts potential [16].
4.3 Results
Figures 3 and 4 show the results for the Marker and the Redsea datasets,
respectively. Each color represents a different class of seabed object being
classified, with black being the background.
We can see the advantages of the more specific assumptions and richer
statistical considerations of the GeoSim method as compared to the CRF
approach. This is seen especially for the Marker dataset (Figs. 3a, b, d, e),
where the results for GeoSim were about seven percentage points higher than
for CRF. As discussed by Lucchi et al. [13], the common probabilistic
graphical models normally assure no more than local smoothness. However, for
the GeoSim method we can perceive that the context measures such as sizes and
proportions helped to improve the results. The green area in Fig. 3a was
greatly reduced in the GeoSim result (Fig. 3b). This happened because the
green area did not respect the trained proportions and sizes, allowing an
increase in classification accuracy. The CRF, on the other hand, in fact only
enhanced the local smoothness, yielding marginal improvements in accuracy
(Fig. 3d). However, the Quenching step (GeoSimQ, Fig. 3c) was not beneficial
here, since there were few patches with high single-class probability whose
labels fell outside the measured statistics.

Fig. 3. The final results for the Marker seabed dataset: (a) 0.5275 Unary,
(b) 0.6121 GeoSim, (c) 0.599 GeoSimQ, (d) 0.5401 CRF, (e) Ground Truth. The
figures show the data classified by a color label. The dataset contains four
different classes: Background (black), Gorgonians (yellow), Sand (green) and
General Corals (blue). In this case, we can perceive a significant
improvement of GeoSim (in (b)) when compared to the CRF model (in (d))
(Color figure online).

Fig. 4. The final results for the Redsea seabed dataset: (a) 0.75367 Unary,
(b) 0.76594 GeoSim, (c) 0.78449 GeoSimQ, (d) 0.77697 CRF, (e) Ground Truth.
The figures show the data classified by a color label. The dataset contains
five different classes: Background (black), Urchin (yellow), Branching Corals
(magenta), Brain Corals (green) and Faviid Corals (blue). We can see a better
performance of the GeoSim method in (b) (Color figure online).

Table 1. Results of the average accuracy for multiple random samples of
1200 × 1200 pixels.

Algorithm        | Unary   | GeoSim  | GeoSimQ | CRF
Accuracy RedSea  | 80.49 % | 81.2 %  | 81.64 % | 82.21 %
Accuracy Marker  | 45.17 % | 54.53 % | 54.49 % | 46.11 %
For the Redsea dataset (Figs. 4a, c, d, e) there was less room for
improvement with context information, since the unary accuracy is higher.
However, the quenching step was able to improve the results, since the
perturbations reduced the size of the yellow class (Fig. 4c). This reduction
is expected because the Urchin class is usually small (this can be perceived
in the Ground Truth, Fig. 4e).

In order to simulate more visual variability, we also tested our algorithm
with multiple patches from the mosaic datasets. We cropped 15 randomly
sampled square patches of size 1200 × 1200 pixels and averaged their computed
accuracy for the different methods. Table 1 shows the results. We can still
perceive better results for the GeoSim method over the CRF on the Marker
dataset. For the Redsea dataset, the differences were marginal. Again we
perceived the benefits of the quenching mainly for the Redsea dataset.
5 Conclusions
In this work we presented a novel method for context-aware image
classification called GeoSim. The method is inspired by techniques for
spatial uncertainty modelling. Our method is designed to work in specific
scenarios that have low scale variance. Examples of applications with these
properties are oceanic, aerial and satellite mapping of natural scenes.

Under these conditions, GeoSim was able to obtain significantly better
results than the CRF, which was only able to enforce local smoothness. It
contributes to the field by being the first to combine techniques from two
distinct topics, Geostatistics and Probabilistic Graphical Models, and to
illustrate the benefit with respect to the state-of-the-art.

As future work, this method will be compared with more complex forms of
context addition, such as the fully connected CRF [11] and auto-context [17].
Acknowledgements. The authors would like to thank the Brazilian National
Agency of Petroleum, Natural Gas and Biofuels (ANP), the Funding Authority
for Studies and Projects (FINEP) and the Ministry of Science and Technology
(MCT) for their financial support through the Human Resources Program of ANP
for the Petroleum and Gas Sector - PRH-ANP/MCT.

This paper is also a contribution of the Brazilian National Institute of
Science and Technology - INCT-Mar COI, funded by CNPq Grant Number
610012/2011-8.

Additional support was granted by the Spanish National Project OMNIUS
(CTM2013-46718-R) and by the Generalitat de Catalunya through the
TECNIOspring program (TECSPR14-1-0050) to N. Gracias.
References
1. Agterberg, F.: Mathematical geology. In: General Geology. Encyclopedia of
Earth Science, pp. 573–582. Springer, US (1988).
http://dx.doi.org/10.1007/0-387-30844-X_76
2. Aßfalg, J., Kriegel, H.-P., Pryakhin, A., Schubert, M.: Multi-represented
classification based on confidence estimation. In: Zhou, Z.-H., Li, H.,
Yang, Q. (eds.) PAKDD 2007. LNCS (LNAI), vol. 4426, pp. 23–34. Springer,
Heidelberg (2007)
3. Beattie, C., Mills, B., Mayo, V.: Development drilling of the Tawila
field, Yemen, based on three-dimensional reservoir modeling and simulation.
In: SPE Annual Technical Conference, pp. 715–725 (1998)
4. Biederman, I., Mezzanotte, R.J., Rabinowitz, J.C.: Scene perception: detecting
and judging objects undergoing relational violations. Cogn. Psychol. 14(2), 143–
177 (1982)
5. Boix, X., Gonfaus, J.M., van de Weijer, J., Bagdanov, A.D., Serrat, J.,
Gonzàlez, J.: Harmony potentials. Int. J. Comput. Vision 96(1), 83–102 (2012)
6. Carbonetto, P., de Freitas, N., Barnard, K.: A statistical model for
general contextual object recognition. In: Pajdla, T., Matas, J.G. (eds.)
ECCV 2004. LNCS, vol. 3021, pp. 350–362. Springer, Heidelberg (2004)
7. Carle, S.F., Fogg, G.E.: Transition probability-based indicator geostatistics. Math.
Geol. 28(4), 453–476 (1996)
8. Deutsch, C.V., Journel, A.G., et al.: The application of simulated annealing to
stochastic reservoir modeling. SPE Adv. Technol. Ser. 2(02), 222–227 (1994)
9. Galleguillos, C., Belongie, S.: Context based object categorization: a critical survey.
Comput. Vis. Image Underst. 114(6), 712–722 (2010)
10. Grimmett, G.R.: A theorem about random fields. Bull. Lond. Math. Soc. 5(1),
81–84 (1973)
11. Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs
with Gaussian edge potentials. In: Shawe-Taylor, J., Zemel, R.S., Bartlett,
P.L., Pereira, F., Weinberger, K.Q. (eds.) Advances in Neural Information
Processing Systems, vol. 24, pp. 109–117. Curran Associates, Inc. (2011)
12. Levinshtein, A., Stere, A., Kutulakos, K.N., Fleet, D.J., Dickinson,
S.J., Siddiqi, K.: TurboPixels: fast superpixels using geometric flows. IEEE
Trans. Pattern Anal. Mach. Intell. 31(12), 2290–2297 (2009)
13. Lucchi, A., Li, Y., Boix, X., Smith, K., Fua, P.: Are spatial and global constraints
really necessary for segmentation? In: IEEE International Conference on Computer
Vision (ICCV), pp. 9–16. IEEE (2011)
14. Purkis, S., Vlaswinkel, B., Gracias, N.: Vertical-to-lateral transitions
among Cretaceous carbonate facies: a means to 3-D framework construction via
Markov analysis. J. Sediment. Res. 82(4), 232–243 (2012)
15. Shihavuddin, A., Gracias, N., Garcia, R., Gleason, A.C.R., Gintert, B.: Image-
based coral reef classification and thematic mapping. Remote Sens. 5(4), 1809–1841
(2013). http://www.mdpi.com/2072-4292/5/4/1809
16. Shotton, J., Winn, J., Rother, C., Criminisi, A.: TextonBoost for image
understanding: multi-class object recognition and segmentation by jointly
modeling texture, layout, and context. Int. J. Comput. Vision 81(1), 2–23
(2009)
17. Tu, Z.: Auto-context and its application to high-level vision tasks. In:
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008,
pp. 1–8. IEEE (2008)
18. Yedidia, J.S., Freeman, W.T., Weiss, Y., et al.: Generalized belief propagation.
In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information
Processing Systems, vol. 13, pp. 689–695. MIT Press (2001)
19. Zvuloni, A., Artzy-Randrup, Y., Stone, L., Kramarsky-Winter, E., Barkan, R.,
Loya, Y.: Spatio-temporal transmission patterns of black-band disease in a coral
community. PLoS One 4(4), e4993 (2009)
... However, in this way, we have the advantage of module based improvement and necessary adjustments. The presented method can also be deployed after necessary adjustments for other imaging modalities such as inspection images [21], [22], [23], nature images [24], [25], [2], [26], regular outdoor images [27], etc where detection and segmentation of the object of interest is the aim of the model and the annotated dataset is scarce. ...
Conference Paper
Though there had been much new deep learning based nuclei segmentation architectures available for various applications made by researchers in the last decade, most of them are specific for particular nuclei types of imaging modalities. In this work, we are proposing a step-wise pipeline for nuclei segmentation comprising Unet based difference image extraction, Faster-RCNN for nuclei detection on difference image, and SegNet for Nuclei segmentation from the detected nuclei inside bounding boxes. Our proposed algorithm provides significant improvement when it comes to Mean Average Precision vs Mask-RCNN and provides a general framework where advantages of transfer learning could be deployed easily. Due to the use of cascaded approach, each stage has a single cost function to minimize which helps to reach the global optimum with a limited number of training images and could be separately modified as necessary.
... Image restoration for participating media is not a new research problem [40]. For example, this problem is usually faced in robotics applications where artificial vision systems are often adopted as the main sensing device [13] to deal with tasks such as classification [10], mapping [6], 3D reconstruction [12], visualization [15], docking [36], tracking [17] [29] and robot localization by itself [2]. Furthermore, images taken in aerial environments with participating media, such as disaster zones (i.e., with the presence of smoke) or hazy weather can hamper the performance of navigation systems in autonomous vehicles based on vision, such as drones. ...
Article
Full-text available
Modern imaging devices can capture faithful color and characteristics of natural and man-made scenes. However, there exist conditions in which the light radiated by objects cannot reach the camera’s lens or it is naturally degraded. Thus, the resulting captured images suffer from color loss. This article addresses the problem of underwater image restoration by using an optics-based formulation to model the interaction between light and any underwater suspended particle. Our approach uses a factorial Markov random field (FMRF) to reformulate and solve the general nonlinear participating media optical model. This novel formulation also has the particularity of considering attenuation coefficients, beside global light, as to probabilistic latent variables, inferred from a single image. Due to this unique feature, our FMRF methodology for itself is enough to deal with images acquired in underwater scenes. The generality of our optical model makes it applicable in other participating media such as fog or haze, more commonly addressed in the current literature. Results have shown the capabilities to improve the degraded images using our methodology in several scenarios.
... A model inspired by Geo-statistics [14] to model spatial uncertainties has been introduced in a way to compute the labels of mosaic images with context label agreement using a transition probability model to enforce spatial properties such as class size and proportions. Zheng et al. [16] introduced a depth representation for RGB-Depth scene classification based on CNN. ...
Article
Breast histology image classification is a crucial step in the early diagnosis of breast cancer. In breast pathological diagnosis, Convolutional Neural Networks (CNNs) have demonstrated great success using digitized histology slides. However, tissue classification is still challenging due to the high visual variability of the large-sized digitized samples and the lack of contextual information. In this paper, we propose a novel CNN, called Multi-level Context and Uncertainty aware ( MCUa ) dynamic deep learning ensemble model. MCUa model consists of several multi-level context-aware models to learn the spatial dependency between image patches in a layer-wise fashion. It exploits the high sensitivity to the multi-level contextual information using an uncertainty quantification component to accomplish a novel dynamic ensemble model. MCUa model has achieved a high accuracy of 98.11% on a breast cancer histology image dataset. Experimental results show the superior effectiveness of the proposed solution compared to the state-of-the-art histology classification models.
... A model inspired by Geo-statistics [14] to model spatial uncertainties has been introduced in a way to compute the labels of mosaic images with context label agreement using a transition probability model to enforce spatial properties such as class size and proportions. Zheng et al. [16] introduced a depth representation for RGB-Depth scene classification based on CNN. ...
Preprint
Full-text available
Breast histology image classification is a crucial step in the early diagnosis of breast cancer. In breast pathological diagnosis, Convolutional Neural Networks (CNNs) have demonstrated great success using digitized histology slides. However, tissue classification is still challenging due to the high visual variability of the large-sized digitized samples and the lack of contextual information. In this paper, we propose a novel CNN, called Multi-level Context and Uncertainty aware (MCUa) dynamic deep learning ensemble model.MCUamodel consists of several multi-level context-aware models to learn the spatial dependency between image patches in a layer-wise fashion. It exploits the high sensitivity to the multi-level contextual information using an uncertainty quantification component to accomplish a novel dynamic ensemble model.MCUamodelhas achieved a high accuracy of 98.11% on a breast cancer histology image dataset. Experimental results show the superior effectiveness of the proposed solution compared to the state-of-the-art histology classification models.
... Computer vision has been utilized broadly to achieve various underwater robotics tasks such as: habitat and animal classification [1], mapping [2], 3D scene reconstruction [3], visualization [4], docking [5], tracking [6], inspection [7] and localization [8]. Nevertheless, sonar-based sensors are usually adopted in large vehicles [9] and are the usual approach to obstacle avoidance in these vehicles [10]. ...
Conference Paper
Full-text available
This paper describes a vision-based obstacle avoidance strategy using Deep Learning for Autonomous Underwater Vehicles (AUVs) equipped with a simple color monocular camera. For each input image, our method uses a deep neural network to compute a transmission map that can be understood as a relative depth map. The transmission map is estimated for each patch of the image to determine the obstacles nearby. This map enables us to identify the most appropriate Region of Interest (RoI) and to find a direction of escape. This direction allows the robot to avoid obstacles by performing a control action. We evaluate our approach on two underwater video sequences. The results show the approach is able to successfully find an RoI that avoids coral reefs, fish, the seafloor and any other object present in the scene.
... Vision-based sensors have been extensively used in many underwater robotic applications such as habitat and animal classification [2], mapping [3], 3D scene reconstruction [4], visualization [5], docking [6], tracking [7], inspection [8] and robot localization [9]. However, very few works have addressed the vision-based obstacle avoidance problem in the underwater domain, as it is usually solved with sonar-based sensors [10]. ...
Conference Paper
Full-text available
In this paper we propose a new vision-based obstacle avoidance strategy using the Underwater Dark Channel Prior (UDCP) that can be applied to any Unmanned Underwater Vehicle (UUV) equipped with a simple monocular camera and minimal on-board processing capabilities. For each incoming image, our method first computes a relative depth map to estimate the obstacles nearby. Then, the map is segmented and the most promising Region of Interest (RoI) is identified. Finally, an escape direction is computed within the RoI and a control action is performed accordingly to avoid the obstacles. We tested our approach on a video sequence in a natural environment and compared it against a state-of-the-art method, showing better performance, especially under changing lighting conditions. We also provide online results on a low-cost Remotely Operated Vehicle (ROV) in a controlled environment.
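The core UDCP cue behind these obstacle-avoidance abstracts can be sketched in a few lines: the per-pixel minimum over only the green and blue channels, taken within a local window, serves as a rough transmission / relative-depth estimate. The toy image, function name, and window radius below are illustrative assumptions, not the papers' implementation.

```python
# Sketch of the Underwater Dark Channel Prior idea: red light attenuates
# fastest underwater, so the dark channel is taken over G and B only.
# (Toy 3x3 image; real pipelines operate on full frames.)

def udcp_dark_channel(green, blue, radius=1):
    h, w = len(green), len(green[0])
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = []
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        vals.append(min(green[yy][xx], blue[yy][xx]))
            dark[y][x] = min(vals)  # windowed minimum over min(G, B)
    return dark

g = [[0.8, 0.7, 0.2], [0.9, 0.6, 0.1], [0.8, 0.5, 0.1]]
b = [[0.9, 0.8, 0.3], [0.9, 0.7, 0.2], [0.9, 0.6, 0.2]]
print(udcp_dark_channel(g, b))
```

Regions with a low dark channel are candidate obstacles nearby, which is why the strategies above search for a suitable low/high-intensity RoI before choosing an escape direction.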
... Visual quality is fundamental for several underwater robotics applications, such as habitat and animal classification [1], mapping [2], 3D scene reconstruction [3], visualization [4], docking [5], tracking [6], inspection [7] and robot localization [8]. We are particularly interested in the monitoring of coral reefs. ...
Conference Paper
Full-text available
For underwater robotics applications involving monitoring and inspection tasks, it is important to capture quality color images in real time. In this paper, we propose a statistical learning method with automatic selection of the training set for restoring the color of underwater images. Our statistical model is a Markov Random Field with Belief Propagation (MRF-BP). The quality of the results depends strongly on the trained correlations between the degraded image and its corresponding color image. However, it is not possible to have color ground truth data given the inherent conditions of underwater environments. Thus, we build a color-adaptive training set by applying a multiple color space analysis to those frames that present a high change in their distribution from the previous frame, and use only those frames for training. Experimental results on real underwater video sequences demonstrate that our approach is feasible even when visibility conditions are poor, as our method can recover and discriminate between different colors in objects that may seem similar to the human eye.
Conference Paper
Geostatistics and the theory of regionalized variables have increasingly been found useful in many applications of image processing. The semivariogram is the cornerstone of this spatial statistics framework and can be implemented as an effective feature for pattern classification. This paper introduces a kriging-based distortion approach and explores several well-known methods for pattern matching of empirical semivariograms, including spectral distortion measures, dynamic time warping, and sample entropy. The findings provide insights into the utilization of several algorithms for semivariogram-based pattern comparison under different settings. In particular, the proposed approach is useful in situations where samples are limited, which hinders many pattern classifiers that usually rely on a sufficient amount of training data.
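For reference, the empirical semivariogram this abstract builds on follows the classical estimator gamma(h) = (1 / 2N(h)) * sum of squared differences between values separated by lag h. The 1-D transect values below are made up for illustration.

```python
# Classical empirical semivariogram for a 1-D transect:
# gamma(h) = sum over pairs of (z_i - z_{i+h})^2 / (2 * N(h)).

def semivariogram(values, max_lag):
    gamma = {}
    for h in range(1, max_lag + 1):
        pairs = [(values[i] - values[i + h]) ** 2
                 for i in range(len(values) - h)]
        gamma[h] = sum(pairs) / (2.0 * len(pairs))
    return gamma

z = [1.0, 1.2, 0.9, 1.5, 2.0, 2.1, 1.8, 2.4]  # illustrative transect
print(semivariogram(z, max_lag=3))
```

A rising gamma(h) with lag indicates decreasing spatial correlation, which is exactly the signature the semivariogram-based features exploit for texture comparison.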
Article
Full-text available
This paper presents a novel image classification scheme for benthic coral reef images that can be applied to both single-image and composite mosaic datasets. The proposed method can be configured to the characteristics (e.g., the size of the dataset, number of classes, resolution of the samples, color information availability, class types, etc.) of individual datasets. The proposed method uses completed local binary pattern (CLBP), grey level co-occurrence matrix (GLCM), Gabor filter response, and opponent angle and hue channel color histograms as feature descriptors. For classification, either k-nearest neighbor (KNN), neural network (NN), support vector machine (SVM) or probability density weighted mean distance (PDWMD) is used. The combination of features and classifiers that attains the best results is presented together with guidelines for selection. The accuracy and efficiency of our proposed method are compared with other state-of-the-art techniques using three benthic and three texture datasets. The proposed method achieves the highest overall classification accuracy of any of the tested methods and has moderate execution time. Finally, the proposed classification scheme is applied to a large-scale image mosaic of the Red Sea to create a completely classified thematic map of the reef benthos.
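One of the texture descriptors listed above, the grey level co-occurrence matrix (GLCM), is simple to sketch: count how often grey level i occurs at a fixed offset from grey level j. This is a pure-Python toy with a made-up 3x3 patch; production pipelines would use an optimized library implementation.

```python
# Grey level co-occurrence matrix for one offset (dx, dy):
# m[i][j] counts pixel pairs where level i is followed by level j.

def glcm(image, levels, dx=1, dy=0):
    m = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                m[image[y][x]][image[yy][xx]] += 1
    return m

patch = [[0, 0, 1],
         [0, 1, 1],
         [2, 2, 1]]  # illustrative quantized grey levels
print(glcm(patch, levels=3))
```

Statistics of this matrix (contrast, homogeneity, entropy, etc.) are what actually feed the classifiers; several offsets and angles are typically accumulated.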
Article
Full-text available
The search for, and extraction of, hydrocarbons in carbonate rocks demands a thorough understanding of their depositional anatomy. The complexity of carbonate systems, however, hinders detailed direct characterization of their volumetric heterogeneity. Information with which to construct a reservoir model must therefore be gathered from wells or outcrops transecting the sequence of interest. Most wells (particularly exploration wells) are vertical, presenting a problem for geostatistical modeling. While understanding vertical stratal stacking is straightforward, it is difficult to obtain lateral facies information. Though in some situations outcrop surfaces, seismic data, and horizontal wells may somewhat mitigate this bias, the likelihood remains that the lateral dimension of a buried system will be vastly undersampled with respect to the vertical. However, through the principle of Walther's Law (Walther 1894) or due to the geometry of basinward-inclined beds, comparable facies frequencies and transition probabilities may link vertical and lateral stratal arrangements, the implication being that a reservoir model, competent at least in terms of transition statistics, could be built from information harvested down-core. Taking an interpreted outcrop panel from Lewis Canyon (Albian, Pecos River, Texas), we use Markov chains to first ascertain that vertical and lateral stratal ordering is nonrandom. Second, we show lithofacies transition probabilities in the outcrop to be interchangeable between the vertical and lateral directions. The work concludes by demonstrating the utility of an existing 3-D Markov random field simulation to volumetrically model the Lewis Canyon outcrop on the basis of vertical facies transition tendencies. Statistical interrogation of the 3-D model output reveals the simulation to contain realistic facies associations compared to the outcrop.
This suggests that the reconstruction process, based on Markov chains, produces a useful representation of 3-D heterogeneity in this Lower Cretaceous carbonate succession. Markov random field simulation might provide an important tool for prediction and simulation of subsurface carbonate reservoirs.
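The transition-statistics reconstruction described above can be illustrated in 1-D: estimate a transition matrix from an observed "vertical" facies log, then simulate a synthetic column with the same transition tendencies. This is a toy stand-in for the paper's 3-D Markov random field simulation, and the three facies codes below are invented for illustration.

```python
import random

# Estimate a lag-1 Markov transition matrix from a facies log, then
# simulate a synthetic column honoring those transition tendencies.

def transition_matrix(log, n):
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(log, log[1:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]

def simulate(matrix, start, length, seed=0):
    rng = random.Random(seed)
    col = [start]
    for _ in range(length - 1):
        nxt = rng.choices(range(len(matrix)), weights=matrix[col[-1]])[0]
        col.append(nxt)
    return col

observed = [0, 0, 1, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2, 2, 0]  # toy facies log
P = transition_matrix(observed, 3)
print(simulate(P, start=0, length=20))
```

The synthetic column reproduces the observed juxtapositioning tendencies (e.g., facies 1 tends to follow facies 0 here) even though the realization itself is random.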
Article
Full-text available
The hierarchical conditional random field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model, since multiple classes can reasonably be expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders the underlying optimization problem tractable. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010 and MSRC-21.
Article
Full-text available
Traditionally, spatial continuity models for indicator variables are developed by empirical curve fitting to the sample indicator (cross-)variogram. However, geologic data may be too sparse to permit a purely empirical approach, particularly in application to the subsurface. Techniques for model synthesis that integrate hard data and conceptual models are therefore needed. Interpretability is crucial. Compared with the indicator (cross-)variogram or indicator (cross-)covariance, the transition probability is more interpretable. Information on proportion, mean length, and juxtapositioning relates directly to the transition probability, and asymmetry can be considered. Furthermore, the transition probability elucidates order relation conditions and readily formulates the indicator (co)kriging equations.
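The interpretability claim can be made concrete with a worked 1-D example: for a lag-1 indicator chain, the diagonal transition probability p_kk ties to the mean run length of class k via mean length ~ 1 / (1 - p_kk), and class proportions fall out of the same counts. The two-facies column below is illustrative.

```python
# Worked link between transition probabilities, class proportions and
# mean (run) lengths for a 1-D indicator sequence.

def chain_stats(column, n_classes):
    total_from = [0] * n_classes  # transitions leaving class k
    self_to = [0] * n_classes     # self-transitions k -> k
    for a, b in zip(column, column[1:]):
        total_from[a] += 1
        if a == b:
            self_to[a] += 1
    props = [column.count(k) / len(column) for k in range(n_classes)]
    p_diag = [self_to[k] / total_from[k] for k in range(n_classes)]
    return props, p_diag

col = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]  # e.g. mud (0) / sand (1)
props, p_diag = chain_stats(col, 2)
for k in range(2):
    # proportion, p_kk, and the implied mean run length 1 / (1 - p_kk)
    print(k, props[k], p_diag[k], 1.0 / (1.0 - p_diag[k]))
```

Here both classes have p_kk = 0.6, implying a mean run length of 2.5 units, which matches the complete runs in the column; this is the kind of directly interpretable quantity the abstract contrasts with variogram parameters.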
Article
Canadian Petroleum Ltd. and partners in the Yemen Masila Block have successfully used detailed three-dimensional reservoir modeling and reservoir simulation to optimize the development of the larger oilfields in the Masila area. The models were used to predict reservoir performance and plan additional development drilling, which subsequently demonstrated that the models accurately predicted drilling results. The main producing horizon in the Masila area is the Cretaceous Upper Qishn formation, a clastic-dominated transgressive depositional sequence with fluvial sediments at the base, tidal-dominated estuarine sediments in the middle, and marine shoals at the top. This variable array of facies presents modeling challenges, but the resulting heterogeneous models provide a realistic representation of actual reservoir characteristics. This paper describes the approach used to stochastically distribute both facies bodies and petrophysical parameters, and to upscale the model for reservoir simulation, while preserving the complex reservoir description. The Tawila field was the first Masila field to have wells drilled on the basis of the modeling effort, with very encouraging results. For these new well locations, the model successfully predicted both reservoir development and oil-water contact movements resulting from production from existing wells. This paper presents key conclusions and predictions from the modeling and reservoir simulation, and compares them to the results from subsequent drilling. As a result of the successful development drilling, these models are now an integral part of reservoir management and development planning for all Masila fields.
Article
This paper describes the use of simulated annealing to construct numerical reservoir models that honor many different types of data. Stochastic reservoir models must honor as much data as possible to be reliable numerical models of the reservoir under study. Traditional stochastic modeling techniques are ill-suited to reproduce complex geological/morphological patterns. The simulated annealing technique offers promise as a complementary tool to incorporate such information. Stochastic reservoir modeling is becoming commonly used to describe and visualize reservoir heterogeneities [1-5]. It involves the generation of 3-D images of the reservoir lithofacies and rock properties that, ideally, would honor all available data (core measurements, well logs, seismic and geological interpretations, analog outcrops, well test interpretations, etc.). Potentially, there are a large number of plausible realizations that honor such data. A few realizations are retained and processed through a flow simulator with the envisaged production scheme. The resulting distribution of important production response variables can then be used for decision making [6-9]. There is, however, no single stochastic modeling algorithm that can simultaneously honor all types of available information. Some algorithms are well suited for discrete or categorical information such as lithofacies types [4-5]; others are suited for information carried by continuous variables like porosity, saturation, and permeability [2-3]. Certain information, like production data or effective properties derived from well tests, cannot be easily incorporated into the reservoir model. Almost always, a stochastic reservoir modeling exercise will involve a hybrid technique combining the best features of a number of available algorithms. Simulated annealing is an algorithm initially developed for the solution of combinatorial optimization problems.
The type of problem typically considered involves finding the optimum ordering of a system with a large number of components. An optimum ordering is one that minimizes some global cost or objective function. In the context of stochastic reservoir modeling, the components could be reservoir attribute values like porosity defined for blocks of constant size. The cost or objective function could be a measure of how closely the ordering (spatial arrangement of the block porosity values) reproduces the pattern of spatial correlation (variogram) inferred from an outcrop study. Finding an optimum ordering is equivalent to generating a numerical model. Recent interest in using the simulated annealing technique for reservoir characterization was triggered by a paper written by C.L. Farmer [10]. The technique capitalizes on two new ideas. First, the imaging problem is set up as an optimization problem. Second, the optimization problem is solved with simulated annealing. This formalism allows accounting for diverse types of information by building objective functions more complex than merely identifying a variogram model. For example, one part of the objective function may call for matching a variogram model, while a second part would be dedicated to matching a non-aligned 3- or 4-point statistic representative of a curvilinear feature of facies heterogeneities, e.g., crescent-shaped eolian dunes. Reasonable care must be taken to limit the complexity of the objective function; otherwise, the computational effort may become too large to obtain a solution in a practical amount of time. Posing stochastic simulation as an optimization problem calls first for a translation of the desired geological, statistical, and engineering properties of the reservoir model into some numerical quantities. Next, reference properties and corresponding numerical quantities must be established from data and/or control patterns.
An objective function is defined as a weighted sum of differences between the properties of any simulated image and the previous reference values. The optimization problem consists of lowering the objective function enough so that the image has all or most of the desired properties. The solution of such an optimization problem is sometimes possible using the simulated annealing technique [10]. The central idea behind simulated annealing is an analogy with thermodynamics, specifically with the way liquids freeze and crystallize, or metals cool and anneal. At high temperatures the molecules can move freely. As the temperature is slowly lowered, the molecules line up in crystals, which represent the minimum energy state for the system. Metropolis and his coworkers [11] developed the idea of numerically simulating molecular behavior. From concepts developed in thermodynamics and statistical physics, it is known that a system will change from a configuration of energy E1 to a configuration of energy E2 with probability p = min{1, exp(-(E2 - E1)/T)}. The system will always change if E2 is less than E1 (i.e., a favorable step will always be taken); however, it may sometimes take an unfavorable step. The application of this probability distribution in the numerical simulation of systems composed of many parts has come to be known as the Metropolis algorithm. More generally, any optimization procedure that draws upon the thermodynamic analogy of annealing is known as simulated annealing. In the early 1980s, Kirkpatrick et al. [12] and, independently, Cerny [13] extended these concepts to combinatorial optimization, i.e., they formulated an analogy between the objective function and the free energy of a thermodynamic system [14-15]. A control parameter, analogous to temperature, is used to control the iterative optimization algorithm until a state with a low objective function (energy) is reached.
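The Metropolis acceptance rule at the heart of this abstract can be sketched with a tiny 1-D example: perturb a categorical model by swapping two cells and accept with probability min(1, exp(-(E2 - E1)/T)). The objective used here, a single count of equal-valued neighbor pairs, is a made-up stand-in for the variogram-matching objectives the paper discusses; the cooling schedule and all parameters are illustrative.

```python
import math
import random

# Toy simulated annealing: rearrange a binary sequence so that the number
# of equal-valued neighbor pairs matches a target two-point statistic.

def energy(seq, target_like_pairs):
    like = sum(1 for a, b in zip(seq, seq[1:]) if a == b)
    return (like - target_like_pairs) ** 2

def anneal(seq, target, steps=5000, t0=2.0, seed=1):
    rng = random.Random(seed)
    seq = list(seq)
    e = energy(seq, target)
    for step in range(steps):
        t = t0 * (1.0 - step / steps) + 1e-9   # linear cooling schedule
        i, j = rng.randrange(len(seq)), rng.randrange(len(seq))
        seq[i], seq[j] = seq[j], seq[i]        # propose a swap
        e2 = energy(seq, target)
        if e2 <= e or rng.random() < math.exp(-(e2 - e) / t):
            e = e2                             # accept (possibly uphill)
        else:
            seq[i], seq[j] = seq[j], seq[i]    # reject: undo the swap
    return seq, e

start = [0, 1] * 10  # alternating sequence: zero like-neighbor pairs
model, final_e = anneal(start, target=15)
print(model, final_e)
```

Swapping cells preserves the facies proportions exactly while the Metropolis rule drives the two-point statistic toward its target, which is the same division of labor (hard constraints by construction, soft ones via the objective) used in the reservoir models above.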
Article
The goal of object categorization is to locate and identify instances of an object category within an image. Recognizing an object in an image is difficult when images include occlusion, poor quality, noise or background clutter, and this task becomes even more challenging when many objects are present in the same scene. Several models for object categorization use appearance and context information from objects to improve recognition accuracy. Appearance information, based on visual cues, can successfully identify object classes up to a certain extent. Context information, based on the interaction among objects in the scene or global scene statistics, can help successfully disambiguate appearance inputs in recognition tasks. In this work we address the problem of incorporating different types of contextual information for robust object categorization in computer vision. We review different ways of using contextual information in the field of object categorization, considering the most common levels of extraction of context and the different levels of contextual interactions. We also examine common machine learning models that integrate context information into object recognition frameworks and discuss scalability, optimizations and possible future approaches.