Conference PaperPDF Available

A distance measure for structural descriptions using circular arcs as primitives

Authors:

Abstract

This paper proposes a structural description scheme using circular arcs as primitives. On this scheme, a metric for defining a distance between pairs of circular arcs and relations among them, is introduced and its main properties are discussed. This metric is based on a set of perceptive criteria which allow to increase its effectiveness in application domains characterized by high variability in the shape of the visual patterns. The whole approach is general enough to be satisfactorily used in a wide class of applications. The metric has been validated by employing it in a nearest neighbour classifier, which has been used for automatic recognition of handwritten digits extracted from a standard character database
A
Distance Measure
for
Structural Descriptions Using
Circular
Arcs
as
Primitives
C.
De Stefano,
P.
Foggia, F. Tortorella,
M.
Vento
Dipartimento di Informatica e Sistemistica
Universita degli
S
tudi
di
Napoli “Federico
11”
Via Claudio
2
1,1801
25
Napoli (Italy)
E-mail:
{
destefan, foggia, tortorel,
vento]@nadis.dis.unina.it
Abstract
This paper proposes a structural description scheme
using circular arcs
as
primitives. On this scheme, a
metric for defining a distance between pairs
of
circular
arcs and relations among them, is introduced and its
main properties are discussed. This metric is based on a
set
of
perceptive criteria which allow
to
increase its
effectiveness in application domains characterized by
high variability in the shape
of
the visual patterns. The
whole approach is general enough
to
be satisfactorily
used in a wide class
of
applications. The metric
has
been
validated by employing
it
in
a Nearest Neighbour
Classifier, which has been used for automatic recognition
of
handwritten digits extracted from a standard character
database.
1.
Introduction
In most of methods employed
in
system for
recognizing real images, structural descriptions are used
for representing a scene and/or the objects contained in it.
Attributed Relational Graphs
(ARGs)
are recognized as a
powerful
data
structure for supporting structural
descriptions of visual patterns, and have
been
widely used
in many applicative contexts
[1,2].
In
an Attributed
Relational Graph the nodes correspond
to
the primitives
of the structural description and are labeled by the
attributes characterizing the primitives themselves;
similarly, the branches represent the relations among
primitives.
It is well known that, in real applications,
deformations
on
samples, due to noise or distortion, may
cause an input sample to be different from all the class
prototypes: these distortions can affect
both
the structural
primitives and their relations. For this reason, the
recognition process cannot be carried out by conventional
graph isomorphisms, even though the input pattem
is
deformed very slightly and can be easily recognized by a
human. To overcome these limitations, several
approaches have been proposed which determine a
distance measure between smctural descriptions
supported by ARGs. In particular, in
[3]
the distance
measure is based
on
the evaluation of the minimum
number of transformations
to
apply
to
one graph to obtain
the other one. All the mentioned approaches, however,
focus their attention
on
the reduction of the
computational cost
of
the distance calculation, which
involves a
NP
complexity process. A problem that these
methods do not attempt to resolve is the definition of a
distance between two given primitives and between
two
relations. In fact, all the cited papers assume that
distances
between
primitives and relations are given, and
use them to define a distance
between
whole objects. It
is
worth noting that the definition of a distance for
primitives and their relations is not,
in
general, an easy
task; its difficulty, of course, may
vary
in
dependence of
the used primitives. An example of such definition is
given in
[4],
in which a distance between polygonal
curves is proposed.
In this paper we propose a structural description
scheme using second order primitives,
in
particular
circular arcs. Two distance functions are introduced for
measuring the dissimilarity between two arcs and
two
relations, respectively; on the basis of these functions, a
distance between structural descriptions having circular
arcs as primitives
is
finally proposed. The effectiveness
of the distance has been experimentally evaluated by
using the description scheme for representing handwritten
characters that are then recognized by means of a
NN
classifier.
2.
The proposed method
In
defining a distance D(x,y) between structural
descriptions, an obvious aim is to template, in some way,
the human capability of perceiving the similarity between
two
different
objects,
so
as to provide small (large) values
when
comparing alike (different) pattems. However, this
property is very hard to achieve since it involves a deep
1015-4651/96 $5.00
0
1996
IEEE
Proceedings
of
ICPR
’96
290
knowledge of the perception processes performed by a
human when he/slie compares two shapes. With a more
realistic approach, our aim is
to
propose a scheme
in
which
the distance between structural descriptions
is
continuous
with
respect
to
the
perception:
in other words,
when the shape of one object is changed, the distance
variation should
be
consistent with the perceived entity
of
the modification.
To
obtain the aibove requirements, the distance D(x,y)
should be formulated on the basis of a careful definition
of both component attributes and distance functions
which measure the dissimilarity between such
components.
More
precisely,
on
one side the auributes
should comply
wifh
the shape features a
human
perceives
as
significant and, on the other side, should be able
to
follow,
as
much
as
possible in
a
continuous way, the
variations a given pattern could exhibit.
A
similar
approach should
lbe
adopted also for the design of the
distance functions. In the next paragraphs we will address
these topics with reference to a structural description
scheme using circiular
arcs
as primitives.
2.1
The
description attributes
From a geometric point of view, a circular arc can
be
completely identified
in
the (x,y) plane through
5
parameters (i.e., the coordinates of the center, the radius,
the starting and the ending angles). However, for our
aims,
it
is not important to exactly reconstruct the
original shape, but
to
select the shape attributes which,
from a perceptual point of view, seem to
be
the most
significant for distinguishing a given shape from similar
ones.
Furthermore, the set
of
chosen attributes should
exhibit a certain degree of
orthogonality,
i.e. the
characteristics described
by
one attribute should not
affect other attributes; this is essential to enhance
as
much as possible the descriptive power of the attributes.
Three attribules which comply with the outlined
requirements are the
dimension,
the
form
(if
the
arc
is
elongated
or
bent)l,
and the
orientation.
We assume as;
dimension
of
an
arc the length of the
longest side of ithe rectangle enclosing the arc. This
choice is based on the fact that the longest distance
between
two points of the primitive is perceived as more
significant
than
other dimensional parameters.
It
is easy
to verify that this length equals the length of the chord for
arcs whose subtending angle is smaller than
180'.
otherwise
it
is equal to the diameter of the arc. Actually,
in order
to
obtain a scale-invariant parameter, the
dimension is normalized with respect to the longest side
of the rectangle enclosing the whole pattern.
The parameter adopted to estimate the
form
of
an
angle is the
span,
i.e. the measure
of
the angle subtending
the arc itself:
it
takes into account that an arc can assume
different configurations ranging from the straight
segment
to
the closed circle, and allows to estimate the
similarity between arcs, without being affected by the
different sizes. The span for a straight segment is
assumed
to
be
0".
Finally,
the
Orientation
of
an
arc
is
measured by
means of the angle existing between a reference
axis
and
the vector orthogonal
to
the chord of the arc and having
the same direction of the concavity.
A
null orientation is
assigned
to
a closed circle (i.e. an arc having
a
span equal
to
360").
In fig.
1
it is shown an example of a circular arc
together with the considered attributes.
I.
span
orientation
Fig
1:
An
example
of
a circular arc together with
its
attributes.
In
the following, we will denote with dim@), span@)
and orient@), respectively, the dimension, span and
orientation of a given
arc
p.
Let
us
now
examine the attributes to adopt for
describing the relations existing among the primitives.
These attributes should allow
to
highlight the
way
in
which primitives
are
connected to
each
other, which
represents a really distinctive feature for a given shape.
We adopt as attribute characterizing the relation between
two circular arcs the position of the contact point,
measured along each arc: denoting with
p
and
q
two
connected circular arcs, the relation
R
between them can
be
expressed by the 4-tuple:
where
RXi
and
RYj
represent the coordinates of the
contact point on the primitive
i;
these coordinates are
evaluated
with
respect to a reference system having axes
parallel to
x
and
y
axes and origin located in the center of
the bounding box of the primitive
i
(i.e., the smallest
rectangle containing the primitive
i
and having sides
parallel to
x
and
y
axes). Moreover, choosing
as
reference
unit
on each axis the length of the projection of the
primitive on such axis,
RXi
and
RYi
assume values
in
the
range
[-OS,
0.51
(see figure
2).
R@,q)
=
(mp,
RYp,
RXq,
RYq)
291
-4.3
A
A
-I-
-e
-
-0.5
-‘Y?
,
,
RY2
-
-
0.5
RXI
=
0.25
RYI
=
0.15
RX2
=
-0.26
RY2
=
-0.1
Fig.
2:
Definition
of
relation attributes
2.2
Distance between attributes
Once
the attributes describing primitives and relations
have
been
defined, the next step is to formalize a distance
Dp
between primitives and a distance
DR
between
relations, both based
on
the attributes described before.
As
regards the primitive distance
Dp,
an effective
measure of the dissimilarity between two arcs
p
and
p’
can
be
obtained by evaluating
in
which way
we
should
warp
p
to make it identical to
p’.
It can be simply shown
that a generic transformation can be viewed as composed
by successive variations, each one involving one of the
primitive attributes. In this way, we can define the
distance
Dp(p,p’)
as:
(1)
where
Wd,
ws
and
w,
are
the
costs
for an unitary variation
of the dimension, span and orientation, respectively.
Expression
(l),
however, does not completely meet
the requirements the distance between primitives should
comply; actually, other important aspects should be taken
into account when
Dp
is evaluated.
To
better introduce
this topic, let
us
consider a suitable representation in
which the dimension, the span and the orientation are
respectively mapped
on
the radius, the latitude and the
longitude of a spherical coordinate system; the scale is
chosen
in
such a way that arcs with span equal
to
0”
(i.e.,
straight segments) are placed
on
the equator, while a
closed circle is mapped on a pole
(see
fig.
3).
Although
only the “northern” hemisphere could be sufficient for
representing the arcs through their attributes, we assume
that the span can have negative values,
too,
corresponding to a rotation of
180”
for the arc. This
implies that a given arc has two representations, placed at
antipodal points
on
the sphere:
for
example, Lhe arc with
dimension
1,
span
30”
and orientation
70”
is represented
by two points the
first
of
which is
on
the “northem”
hemisphere with spherical coordinates
(1,
30°,
70”),
while the second belongs to the “southern” hemisphere
with spherical coordinates
(1,
-30”,
250’).
The reason for
this assumption will be
clcar
shortly.
Dp(p,
p’)
=
Wd
Adim
+
w,
Aspan
+
w,
Aorient
2
Fig.
3:
The spherical coordinate
system.
In the spherical system adopted it can be easily
verified that every transformation is described by means
of a path connecting two points. More precisely,
in
a first
phase, we consider only the variation of dimension: this
is equivalent to move between two spheres, respectively
associated to the initial and final dimension, without
changing the other coordinates. Successively, the
variations in span and orientation
are
accomplished
by
moving
on
the
surface of the second sphere: it
is
worth
noting that,
in
this case, several paths can
be
found with
consequently different results on the evaluation of the
distance. It is then necessary to define how to choose the
most suitable sequence of variations, in
the
following
denoted as
route.
A
first kind of sequence consists in
modifying the orientation and then the span of the first
arc
to
match the parameters of the second arc: the
corresponding route
runs
through the hemisphere
on
which the starting point is placed and, for this reason,
will be called
hemispherical route.
The relative contribution to
the
whole distance is
given by:
(2)
where
w,
and
w,
are the relative costs.
The hemispherical route does not give adequate
results in case of arcs having very small span values,
similar dimension and opposite orientation.
In
fact,
though
they
are
very
similar
from
a
perceptual point
of
view, the difference in orientation reaches its maximum
and the corresponding hemispherical route covers half a
parallel.
In
these cases it is more convenient to consider,
for the first arc, the second representative point which is
closer to the second arc and then gives a more reliable
estimate. The route corresponding to this kind
of
transformation is called
equatorial,
since the path joining
DHEM
=
w,
I
Aspan
I
+
w,
I
Aorient
I
92
the considered points crosses the equator; the contribution
DEQU
to
the whole distance is given by the same
expression
(2).
prawided that the representative point on
the southem hemiqphere is used for the first arc.
Another case in which the hemispherical route fails
happens when the ispans of the two
arcs
are
very
close to
360":
the difference in orientation becomes negligible
since the two
arcs
are perceived
as
two broken
circumferences and then considered very similar. The
relative distance
is
measured along the
polar
route.
According to this rloute, the distance between the two arcs
is given by the expression:
DPOL
=
wp
(360"
-
I
spanil
+
360"
-
I
span21)
(3)
Due
to
its particular meaning, the polar route is used
only when both
span1
and
span2
are
greater than a fixed
threshold.
At this point we can formally define the distance
between two arcs
p
and
p'
by means of the following
expression:
where the last
term
in parentheses is actually considered
only when the spans of the two
arcs
satisfy the relative
condition.
Now, let us examine the formulation of the distance
between relations. Let
us
call
p
and
q
(p'
and
q')
two
connected primitives in the first (second) object;
the
distance
DR
between the relations
r=R(p,q)
and
r'=R(p',q')
can be expressed as:
DR(r, r')
=
w1 D1
+
w2
02
where
wl,w2
2
0
and
w1
+
w2
=
1;
D1
(02)
represents the
variation of the contact point relative to the primitives
p
and
p'
(qandq':). The coefficients
w1
and
w2
are
computed in such a way
to
assign a greater weight
to
the
variation of the coatact point relative to primitive having
larger dimension. This choice
is
related
to
the
consideration that a variation of the contact point along
the greater arc is generally perceived as a more
significant change of the shape.
Ddp,
P')
=
Wd
Mim
+
min
(DHEM
,
DEQU DPOL)
(4)
(5)
As
regards D1 and
D2,
their expressions are:
D1
=
wlX
I
RXp
-
RXp*
I
+
Wly
I
RYp
-
RYp.
I
02
=
wzX
I
RX,
-
AXq*
I
+
~2~
I
RYq
-
RYq.
I
(6)
(6')
where
Wix,
Wiy
2
Cl
and
Wix
+
Wiy
=
1;
they are defined in
such a way
that
the component corresponding to the
larger bounding
box
dimension is given a greater weight.
It should be noted that the latter two definitions are
not applicable
if
one of the two primitives is a straight
segment and is ncirmal to one of the axes:
in
this case, a
null
value is conventionally assigned to the coordinate of
the contact point allong that axis.
With
the given definitions,
DR
satisfies the following
properties:
a)
it assumes value
equal
to
zero if the contact point
between the primitives is located
in
the same position;
b)
it
exhibits a continuous trend as the position of the
contact point varies;
c)
it assumes the value
1
as
maximum value;
6)
it
evaluates separately the contribution relative
to
the
variation of the contact point for each couple
of
primitive.
2.3
The proposed distance measure
Now, let us consider the structural descriptions of two
objects
S
and
S',
and denote with
pi
and
rj
(pi'
and
r,')
the
i-th primitive and the j-th relation in
S (S*).
Starting from
the
definitions given in the previous paragraph, the
distance between
S
and
S'
is:
where
wp
and
WR
are two weight coefficients whose
values are fixed according to the purposes
of
the
particular application at hand. Expression
(7)
applies if it
is possible to find a correspondence for all the
components of
S
and
S'.
Since this
is
not always true, we
have
to
consider
also
the primitives and the relations for
which a correspondence
has
not
been
established: this
task
is
not
very
simple because
the
contribution given by
these components should be evaluated in dependence
of
the applicative context we refer to.
A
more general
formulation, which rakes into account the lacking
components, is given by the following expression:
W,
S')
=
WP
Zi
Dp
(pi
Pi')
+
WR
xj
DR (rj
,
rj')
(7)
D(S,
S')
=
WP
Ci
Dp
(pi
,Pi')
+
WR
Cj
DR (rj
*
rj')
+
[&I
%
h)
+
Zk
%
h')
1
+
WR
[&
LR
km)
+
&I
LR
trn')
1
(8)
where with
ph
h')
and
r,,, (r"')
are the primitives and the
relations of
S
(S')
that have no corresponding component
in the other object;
yg
and
WLR
are determined in
a
way
analogous to
wp
and
WR.
Lp
(LR)
evaluates the
contribution provided
by
a lacking primitive (relation)
as
a
function of its attributes. Obviously, their expressions
are dependent
on
the specific application.
3.
Experimental results and conclusions
Experimental results refer to the problem of automatic
recognition of unconstrained handwritten digits. This
applicative domain is particularly interesting both for the
high shape variability among samples belonging to a
class, and for the presence of samples belonging to
different classes that exhibit a high shape similarity. The
used handwritten digits have
been
extracted from the
ETLl
database (distributed
by
the Japanese Technical
Committee
for
Optical Character Recognition) and
preliminarily submitted to a filtering and a binarization
293
process. Characters are then represented
as
sets
of
circular arcs and described by Attributed Relational
Graphs according to a method whose details can be found
in
[5].
To
experimentally validate the proposed distance
measure, we used some parameters characterizing the
distribution of the samples with respect
to
the distance.
If
we define D’(C,C’) as the mean value of the distance
between
samples of the
class
C and their nearest
neighbours belonging to the class C’, the used evaluation
parameters are:
Dl (C) =D*(C,C);
DB
(C)
=
14Nc
-
1)
&*,
c*+c
D*(C,C’);
DN
(C)
=
mint,
c.+c
D*(C,C’);
where
Nc
is the number
of
the considered classes.
DI
gives an indication of the variability inside a class, while
DB
and
DN
allow us to measure the possibility of
confusion respectively with all the other classes and with
the nearest one.
With reference to a test-set composed of about
loo0
samples, we have computed the quantities
01,
DE,
DN
and the relative standard deviations. From the analysis of
the obtained results (cfr. fig.
4),
it can be
seen
that the
classes are acceptably separated from each other, and a
certain degree of overlapping
is
present only between
classes that are morphologically very similar.
Another interesting evaluation parameter for a
distance measure is the recognition rate obtained by using
a
NN
classifier. The classifier has been tested using a set
of about
loo0
randomly chosen prototypes and a different
test-set of about
1500
samples.
In
fig.
5
the classification
results for each class and the overall percentage of correct
classification (equal to
91.3%)
are shown.
It
should
be
noted that this is a
good
classification rate considering
that we have used unconstrained handwritten characters
with no contextual information.
eferences
[l]
M.A. Eshera.
K.S.
Fu
“An
Image Understanding System
using Attributed Symbolic Representation and Inexact
Graph Matching”
IEEE
Trans. on Pattern Analysis and
Machine Intelligence,
Vol. PAMI-8,
No.
5,
pp.
604-617.
1986.
J.
Racha and
T.
Pavlidis.
“A
shape analysis model with
applications
to
a character recognition system”,
IEEE
Trans. Pattern Analysis
and
Machine Intelligence,
Vol.
16,
no.
4.
pp. 393404.1994.
M.A. Eshera.
K.S.
Fu
“A Graph Distance Measure
for
Image Analysis”,
IEEE
Trans.
on
Systems,
Man
and
Cybernetics,
Vol.
SMC-14,
No.
3,
pp. 398-408. 1984.
[2]
[3]
[4] E.M. Arkin, L.P. Chew, D.P. Huttenlocher,
K.
Kedem,
J.S.B.
Mitchell,
“An
Efficiently Computable Metric for
Comparing Polygonal Shapes”,
IEEE
Trans. on Pattern
Analysis and Machine Intelligence,
Vol. 13.
No.
3,
L.P. Cordella. C.
De
Stefan0 and M. Vento, “A Neural
Network Classifier for OCR using Structural
Descriptions”,
Machine Vision
and
Applicatiom,
Vol. 8,
pp. 209-215,
1991.
[5]
n.
5,
pp. 336-342,
1995.
0.7
0.6
0.5
0.4
0.3
02
0.1
0
10’1
2
3
4
5
6
7
8
8
-0.1
Fig.
4:
Histograms of the quantities
DI
,
Ds
,
DN
relative to each class. The standard
deviation
of
the corresponding quantity
is added to each bar.
100%
95%
. . .
-
-.
.
90%
85%
80%
75% 0123456709
classes
Fig.
5:
Recognition rate and error rate for each
class. The dotted line represents the
recognition rate on the whole test set.
294
... By coupling a description method with a classi"er paradigm, we have built-up "ve di!erent experts: three of them employ the MLP classi"er with the BM, HA and MA descriptions (let us denote these experts with MLP- BM, MLP-HA and MLP-MA); the fourth is constituted by the LVQ classi"er with the BM description (LVQ- BM), and the last expert is a nearest-neighbor classi"er with the ARG description (NN-ARG). In this last case, we have used a metric de"ned in the ARG space [21]. All the tests were performed on the NIST database 19 [22] . ...
Article
In the present paper we propose a method for determining the best trade-o! between error rate and reject rate for a multi-expert system (MES) using the Bayesian combining rule. The method is based on the estimation of the reliability of each classi"cation act and on the evaluation of the convenience of rejecting the input sample when the reliability is under a threshold, evaluated on the basis of the requirements of the application domain. The adaptability to the given domain represents an original feature since, till now, the problem of de"ning a reject rule for an MES has not been systematically introduced, and the few existing proposals seldom take into account the requirements of the domain. The method has been widely tested with reference to the recognition of handwritten characters coming from a standard database. The results are also compared with those provided by employing the well-known Chow's rule.
... We have considered five different experts: three of them employ the MLP classifier with the BM, HA and MA descriptions (let us denote these experts with MLP-BM, MLP-HA and MLP-MA); the fourth is constituted by the LVQ classifier with the BM description (LVQ-BM), and the last expert is a Nearest Neighbor classifier with the ARG description (NN-ARG). In this last case, we have used a metric defined in the ARG space [17]. All the tests were performed on the NIST database 19 [18]. ...
Chapter
Full-text available
Recently, in the framework of Pattern Recognition, methods for combining several experts (Multi-Expert Systems, MES) in order to improve the recognition performance, have been widely investigated. A main problem of MES is that the combining rule should be able to take the right classification decision even when the experts disagree. Anyway, in critical cases, a reject decision is convenient to reduce the risk of an error. Up to now, the problem of defining a reject rule for a MES has not been systematically explored. We propose a method for determining the best trade-off between error rate and reject rate depending on the considered application domain, i.e. by taking into account the costs attributed, for the specific application, to misclassifications, rejects and correct classifications. Even though the method has general validity, in this paper its application to a MES using the Bayesian combining rule is presented.
... We have considered five different experts: three of them employ the MLP classifier with the BM, HA and MA descriptions (let us denote these experts with MLP-BM, MLP-HA and MLP-MA); the fourth is constituted by the LVQ classifier with the BM description (LVQ-BM), and the last expert is a Nearest Neighbor classifier with the ARG description (NN-ARG). In this last case, we have used a metric defined in the ARG space [17]. All the tests were performed on the NIST database 19 [18]. ...
Chapter
In the last years, great attention has been devoted to multiple classifier systems. The implementation of such a system implies the definition of a rule (combining rule) for determining the most likely class, on the basis of the class attributed by each single expert. The availability of a criterion to evaluate the credibility of the decision taken by a classifier can be profitable in order to implement the combining rule. We propose a method that, after defining the reliability of a classification on the basis of information directly derived from the output of the classifier, uses this information in the context of a combining rule. The results obtained by combining four handwritten character on the basis of classification reliability are compared with those obtained by using three different combining criteria. Tests have been performed using a standard handwritten character database.
Article
Recognition systems based on a combination of different experts have been widely investigated in the recent past. General criteria for improving the performance of such systems are based on estimating the reliability associated with the decision of each expert, so as to suitably weight its response in the combination phase. According to the methods proposed to-date, when the expert assigns a sample to a class, the reliability of such a decision is estimated on the basis of the recognition rate obtained by the expert on the chosen class during the training phase. As a consequence, the same reliability value is associated with every decision attributing a sample to a same class, even though it seems reasonable to take into account its dependence on the quality of the specific sample. We propose a method for estimating the reliability of each single recognition act of an expert on the basis of information directly derived from its output. In this way, the reliability value of a decision is more properly estimated, thus allowing a more precise weighting during the combination phase. The definition of the reliability parameters for widely used classification paradigms is discussed, together with the combining rules employing them for weighting the expert opinions. The results obtained by combining four experts in order to recognise handwritten numerals from a standard character database are presented. Comparison with classical combining rules is also reported, and the advantages of the proposed approach outlined.
Article
This paper presents a bibliography of nearly 2150 references related to computer vision and image analysis, arranged by subject matter. The topics covered include computational techniques; feature detection and segmentation; image and scene analysis; two-dimensional shape; pattern; color and texture; matching and stereo; three-dimensional recovery and analysis; three-dimensional shape; and motion. A few references are also given on related topics, including geometry and graphics, compression and processing, sensors and optics, visual perception, neural networks, artificial intelligence and pattern recognition, as well as on applications.
Article
Attributed relational graphs (ARGs) have shown superior qualities when used for image representation and analysis in computer vision systems. A new, efficient approach for calculating a global distance measure between attributed relational graphs is proposed, and its applications in computer vision are discussed. The distance measure is calculated by a global optimization algorithm that is shown to be very efficient for this problem. The approach shows good results for practical size ARGs. The technique is also suitable for parallel processing implementation.
Article
This paper presents a powerful image understanding system that utilizes a semantic-syntactic (or attributed-synibolic) representation scheme in the form of attributed relational graphs (ARG's) for comprehending the global information contents of images. Nodes in the ARG represent the global image features, while the relations between those features are represented by attributed branches between their corresponding nodes. The extraction of ARG representation from images is achieved by a multilayer graph transducer scheme. This scheme is basically a rule-based system that uses a combination of model-driven and data-driven concepts in performing a hierarchical symbolic mapping of the image information content from the spatial-domain representation into a global representation. Further analysis and inter-pretation of the imagery data is performed on the extracted ARG representation. A distance measure between images is defined in terms of the distance between their respective ARG representations. The distance between two ARG's and the inexact matching of their respective components are calculated by an efficient dynamic programming technique. The system handles noise, distortion, and ambiguity in real-world images by two means, namely, through modeling and embedding them into the transducer's mapping rules, as well as through the appropriate cost of error-transformation for the inexact matching of the ARG image representation. Two illustrative experiments are presented to demonstrate some capabilities of the proposed system. Experiment I deals with locating objects in multiobject scenes, while Experiment II is concerned with target detection in SAR images.
Article
We present a method for character recognition especially designed for the case in which the shapes of characters belonging to the same class vary greatly, as it happens with unconstrained hand-printed characters and omnifont printed characters. The most distinctive feature of the method is the use of a special kind of structural description of character shape in connection with a neural network classifier. An original technique is used to achieve the best trade-off between reject and misclassification rates. Experimental results on databases of both hand-printed and printed characters are illustrated.
Article
A method for the recognition of multifont printed characters is proposed, giving emphasis to the identification of structural descriptions of character shapes using prototypes. Noise and shape variations are modeled as series of transformations from groups of features in the data to features in each prototype. Thus, the method manages systematically the relative distortion between a candidate shape and its prototype, accomplishing robustness to noise with less than two prototypes per class, on average. The method uses a flexible matching between components and a flexible grouping of the individual components to be matched. A number of shape transformations are defined, including filling of gaps, so that the method handles broken characters. Also, a measure of the amount of distortion that these transformations cause is given. Classification of character shapes is defined as a minimization problem among the possible transformations that map an input shape into prototypical shapes. Some tests with hand-printed numerals confirmed the method's high robustness level
Article
A method for comparing polygons that is a metric, invariant under translation, rotation, and change of scale, reasonably easy to compute, and intuitive is presented. The method is based on the L <sub>2 </sub> distance between the turning functions of the two polygons. It works for both convex and nonconvex polygons and runs in time O ( mn log mn ), where m is the number of vertices in one polygon and n is the number of vertices in the other. Some examples showing that the method produces answers that are intuitively reasonable are presented