Multi-level Training of
Binary Morphological Operators
Nina S. T. Hirata
Abstract—The design of binary morphological operators that are translation-invariant and locally defined by a finite neighborhood window corresponds to the problem of designing Boolean functions. As in any supervised classification problem, morphological operators designed from training samples also suffer from overfitting. Large neighborhoods tend to lead to performance degradation of the designed operator. This work proposes a multi-level design approach to deal with the issue of designing operators based on large neighborhoods. The main idea is inspired by stacked generalization (a multi-level classifier design approach) and consists in, at each training level, combining the outcomes of the operators of the previous level. The final operator is a multi-level operator that ultimately depends on a larger neighborhood than that of the individual operators that have been combined. Experimental results show that two-level operators obtained by combining operators designed on subwindows of a large window consistently outperform the single-level operators designed on the full window. They also show that iterating two-level operators is an effective multi-level approach to obtain better results.
Index Terms—Image processing, pattern recognition, machine learning, classifier design and evaluation, morphological operator,
Boolean function, image operator learning, multi-level training, stacked generalization.
1 INTRODUCTION
Morphological operators are nonlinear signal and image processing tools with applications in a variety of fields such as biological and biomedical image processing, geoscience, remote sensing, industrial systems, and document processing, among others [1], [2], [3]. Many of
these operators such as erosions, dilations, openings and
closings, are parameterized by subsets called structuring
elements, used to locally probe the input images. The
output of the operator at each location depends on the
relationship between the structuring element and the
image [2], [3], [4], [5].
Morphological image operators are usually designed
on a trial and error basis by composing several sim-
pler operators, each one with an appropriate structur-
ing element. Design success depends on the expertise
of the designer. Another design approach consists of
procedures based on training techniques [6], [7], [8], [9],
[10], [11]. Pairs of training images, consisting of an image
before processing and its respective desired processing
result, are used to estimate the parameters of an operator.
Specifically, some representation for the operators is
assumed, and the training process adjusts the parameters
in order to find, among the operators that comply with
N. S. T. Hirata is with the Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, Brazil.
E-mail: nina@ime.usp.br
© 2007 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. This is the accepted version; the published version DOI is 10.1109/TPAMI.2008.118.
the adopted representation, one that optimizes some
performance criterion. Several optimization algorithms
are used, such as linear programming [12], [13], genetic
algorithms [7], [14], [15], decision trees [16], adaptive
algorithms [17], among others.
In this work, the representation considered is the
canonical decomposition. According to the canonical de-
composition theorem, any translation-invariant operator
can be expressed uniquely as a supremum of a set
of interval operators [18]. By imposing local definition by a neighborhood (expressed in terms of a window W), it can be shown that all structuring elements of the interval operators of a translation-invariant image operator are subsets of W [19], [20]. These operators are called W-operators and are locally characterized; that is, the output at any location is determined by a function that depends solely on the pixel values in the neighborhood W. Hence, the problem of designing a W-operator can be seen as a problem of designing simple pattern classifiers (classifiers that map patterns within W into an appropriate output pixel value). In binary morphology, W-operators are equivalent to the class of Boolean functions on |W| (the cardinality of the set W) variables [21], [22].
By considering a sufficiently large neighborhood,
it is possible, at least theoretically, to represent any
translation-invariant operator via W-operators [22]. In an
ideal setting, a sufficiently large window (say as large as
the largest objects in the images) should be considered
and training techniques could be used to obtain an
operator from training data. However, it is a well known
fact that due to overfitting the performance of learning
algorithms degrades as the dimension of the patterns
increases [23], [24]. This situation, in the context of binary morphological image operator design, is illustrated in Fig. 1.

[Fig. 1. The U-shaped error curve indicates overfitting: MAE versus window size (in pixels), for 1, 2 and 5 training images.]

The horizontal axis represents the window size (in pixels) while the vertical axis represents the mean absolute error (MAE) computed on an independent test set.
Each curve corresponds to the mean error of ten training
runs, using different sets with the same number of
training images. The graph shows that, as the dimension
(window size) increases, the error initially drops very
fast, reaches a minimum point, and starts to increase
slowly. Such behavior, which results in a U-shaped error curve, occurs irrespective of the amount of training data in practical situations (finite sample). Standard deviation is
shown for the upper and lower curves.
Large error for small windows is due to their inability
to distinguish larger patterns while for the large win-
dows it is due to insufficiency of training data. Thus,
for a fixed amount of training data, simply increasing
window size does not work as a way to decrease the
error. A possible solution would be to increase the amount of training data until a satisfactory error rate is reached. However, a linear increase in the amount of training data does not result in a linear increase in performance. Moreover, training data may not be readily available and may require significant additional effort to be prepared, or even involve costs that cannot be neglected. Another
issue to be considered is the fact that computational cost
tends to increase not only with window size, but also
with the amount of training data.
This work reports investigations concerned with ob-
taining, from a given fixed amount of training data, a
morphological image operator with better performance
than the one that corresponds to the optimal window
(minimum point) in the U-shaped error curve. The study
is restricted to binary morphological image operators.
The training algorithm used is Boolean function mini-
mization, described in [25]. It should be noted that while
this work addresses the design of translation-invariant
morphological operators that are locally defined within
a neighborhood window, most of the works cited above
are restricted to specific subclasses of these operators.
Attempts to deal with training data limitation include
use of prior knowledge such as knowledge on im-
age operator properties (for instance, anti-extensiveness
or increasingness [13]) and knowledge on the data
model [26], and iterative design techniques [9]. The use
of knowledge on properties of the desired operator may, however, lead to overly complex training algorithms, as is the case of stack filter design, in which a large number of constraints must be satisfied [12], [13]. On the other hand, an image model is seldom known, and modeling image data is not a trivial task. As for the iterative design approach [9], which results in sequentially composed operators, it does show improvement of the results with respect to the single non-composed operators. In this work, a generalization of the iterative approach, namely a multi-level training approach inspired by classifier combination techniques [27], is proposed. More specifically, the proposed model is based on stacked generalization, introduced by Wolpert [28]. The proposed model
is flexible enough to accommodate several multi-level
operator composition architectures.
Some preliminary results related to this approach have
been presented in [29]. This paper formalizes the idea by proposing a training model and discusses the relation
of the proposed model to some previously known ap-
proaches. In addition, new application examples are
presented.
The paper is organized as follows. In Section 2, some
definitions and notations are introduced, and the rela-
tionship between W-operators and Boolean functions is
recalled. Since the interest is in designing morphological
operators, the notion of optimal mean absolute error
(MAE) W-operators and a basic training methodology
used to estimate optimal MAE morphological operators
from training data are described. In Section 3, a multi-
level training approach is introduced by first considering
two-level training and then generalizing it to multiple
levels of training. Its relation to stacked generalization
as well as the fact that it generalizes the iterative design
approach are discussed in this section. In Section 4,
experimental settings and several results that show the
superiority of multi-level over a single-level operator
design, both in terms of error and computation time,
are presented. Experimental results show that the two-
level approach consistently gives better results than
single-level operators. This section also includes some
examples of multi-level training. Images from several
application examples are presented in the Appendix.
Section 5 presents some concluding remarks and future
research steps of this work.
2 BACKGROUND
This section initially recalls the equivalence between
translation-invariant binary morphological operators
and Boolean functions. This equivalence, together with some statistical assumptions, allows us to model the problem of designing morphological operators as a problem of designing Boolean functions. A basic design procedure for mean absolute error minimization, first described in [8] and improved in subsequent years, will be presented in order to provide a thorough overview of the current state of the art on this subject.
2.1 Binary W-operators
Let E = Z². Binary images defined on E are mappings of the form f : E → {0, 1}. A binary image defined on E can be equivalently represented by a subset S ⊆ E. The collection of all binary images defined on E (all subsets of E) will be denoted by P(E). Binary images will be represented by subsets S ∈ P(E), as they usually are treated in mathematical morphology. When convenient, the functional notation S(x) = 1, to indicate that x belongs to S (x is a foreground pixel), and S(x) = 0 otherwise (x is a background pixel), will be used. Elements of E are denoted by lowercase letters such as x and z. The translation of an image S ∈ P(E) by a vector z is defined by S_z = {x + z | x ∈ S}, where + is the usual vector addition in E.

A binary image operator is a mapping of the form Ψ : P(E) → P(E). Ψ is translation-invariant if, for any S ∈ P(E) and z ∈ E, [Ψ(S)]_z = Ψ(S_z). Ψ is locally defined with respect to a non-empty window W ⊆ E if z ∈ Ψ(S) ⟺ z ∈ Ψ(S ∩ W_z) for any S ∈ P(E) and z ∈ E.

Operators Ψ that are both translation-invariant and locally defined are called W-operators and can be characterized by a local function ψ : P(W) → {0, 1} as follows:

\[ z \in \Psi(S) \iff \psi(S_{-z} \cap W) = 1. \tag{1} \]

By assigning a binary variable x_i to each point w_i ∈ W, and setting x_i = 1 if and only if w_i ∈ S_{-z} ∩ W, the function ψ can be seen as a Boolean function (BF) on n = |W| variables.

This establishes the relationship between binary morphological operators and BFs and reduces the problem of designing binary W-operators to the problem of designing BFs.
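To make this correspondence concrete, the following sketch (Python; an illustration, not the implementation used in this work) applies a W-operator characterized by a Boolean function ψ to a binary image by sliding the window over it. The window offsets, the 0/1 array representation and the example function erosion_like are illustrative assumptions.

```python
import numpy as np

# A minimal sketch of applying a W-operator defined by a Boolean function psi.
# The window is given as a list of (row, col) offsets relative to the origin.
def apply_w_operator(image, window, psi):
    """image: 2D numpy array with values in {0, 1}; psi: maps a tuple of |W| bits to 0 or 1."""
    h, w = image.shape
    out = np.zeros_like(image)
    drs = [dr for dr, _ in window]
    dcs = [dc for _, dc in window]
    # visit only locations where the translated window fits inside the image domain
    for r in range(max(0, -min(drs)), h - max(0, max(drs))):
        for c in range(max(0, -min(dcs)), w - max(0, max(dcs))):
            pattern = tuple(image[r + dr, c + dc] for dr, dc in window)
            out[r, c] = psi(pattern)
    return out

# Example: a 5-point cross window (similar to the window of Fig. 2) and an
# erosion-like Boolean function (output 1 only if all window pixels are 1).
cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
erosion_like = lambda x: int(all(x))
```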
2.2 Mean Absolute Error Optimality
Let (S, I) denote a random process of jointly stationary image processes S and I, with realizations (S, I), where S corresponds to an observed image (i.e., an image to be processed) and I corresponds to the respective ideal outcome (i.e., the respective desired processing result). The mean absolute error (MAE) of a W-operator Ψ, characterized by a BF ψ, with respect to the process (S, I), is defined as the expected value of the absolute difference between Ψ(S) and I at an arbitrary location z, i.e.,

\[ \mathrm{MAE}\langle\Psi\rangle = E\big[\,|\psi(S_{-z} \cap W) - I(z)|\,\big]. \tag{2} \]

The process S_{-z} ∩ W is a random set, which can be thought of as a random vector X_z by associating each element of S_{-z} ∩ W to a component of X_z. Similarly, the value of a given pixel z in I can be thought of as a realization of a random variable y_z. Due to stationarity, z may be dropped from X_z and from y_z, resulting in a process (X, y) that locally characterizes (S, I). The joint distribution of (X, y) will be denoted P(X, y).

With these assumptions, MAE⟨Ψ⟩ can be expressed as

\[ \mathrm{MAE}\langle\Psi\rangle = E\big[\,|\psi(X) - y|\,\big], \tag{3} \]

with respect to the joint distribution P(X, y).

It can be shown that the optimal MAE operator [30] is the one characterized by the function defined, for any X ⊆ W, by

\[ \psi(X) = \begin{cases} 1, & \text{if } P(X, 1) > P(X, 0), \\ 0, & \text{if } P(X, 0) > P(X, 1), \\ 0 \text{ or } 1, & \text{if } P(X, 0) = P(X, 1). \end{cases} \tag{4} \]

This equation defines a BF that can be straightforwardly expressed in its canonical sum-of-products form.
2.3 Basic training method
Although the MAE-optimal operator can be characterized in terms of the joint distribution of the input-output processes, such a distribution is usually not known. Thus, a natural solution is to estimate these probabilities from training images and to use in Eq. 4 the estimated probabilities instead of the true ones.
Let (S_i, I_i) denote pairs of training images and W a non-empty window. An example of a pair of training images and a window is shown in Fig. 2.

[Fig. 2. Example of (a) a pair of training images (S, I), and (b) a neighborhood window W. Image size is 64 × 64. x_i, i = 1, ..., 5, indicate the binary variables assigned to the points of W.]

The training methodology used in this work is composed of three steps, described next. Each step is illustrated for the example given in Fig. 2.
1) Slide W over each input image S_i and record the pair ((S_i)_{-z} ∩ W, I_i(z)) for each location z (except those at which the window is not entirely inside the image domain), and count the occurrences of each pair (X, y) ∈ P(W) × {0, 1}. This step yields an estimate P̂(X, y) of P(X, y). For the above example, 3844 pairs of the form ((S_i)_{-z} ∩ W, I_i(z)) are observed (one-pixel-wide margins on all sides of the image are not considered), resulting in the following frequency table:
[Frequency table for the example of Fig. 2: each window pattern X observed in the training images, together with the number of times it was observed with y = 0 and with y = 1 (3844 observations in total).]
2) For each pattern X observed in step 1, decide ψ̂(X) = 1 if P̂(X, y = 1) > P̂(X, y = 0) and ψ̂(X) = 0 otherwise. Had all possible patterns been observed, the BF corresponding to the optimal MAE W-operator would be completely defined. In practice, not all patterns are observed. [For the above example, the elements for which ψ̂(X) = 1 and those for which ψ̂(X) = 0 are shown graphically in the omitted figure; the remaining elements have not been observed in the training images.]
3) In order to obtain an operator that will be able
to classify patterns that have not been observed
during training, a generalization (or training) al-
gorithm must be applied. In this work, an algo-
rithm for the minimization of incompletely speci-
fied BFs [25] is used. The minimization procedure
results in a BF whose values for the observed
patterns are exactly as defined in step 2. The val-
ues for the non-observed patterns (don’t cares in
the terminology of switching functions [31]) are
determined by the minimization algorithm. Details
of the algorithm used in this work can be found
in [25].
After minimization, for the above example, the resulting BF is ψ(x_1, x_2, x_3, x_4, x_5) = x_3x_5 + x_3x_4 + x_3x_2 + x_3x_1. [The four product terms correspond to intervals, shown graphically in the omitted figure.]
Figure 3 shows the result obtained by applying the
trained operator on a test image.
[Fig. 3. Application of the operator learned from the examples of Fig. 2: test image (left) and respective result (right).]

From a pattern recognition point of view, the design process above can be seen as a procedure for determining a binary classifier, where the patterns are those observed through W in the input images and the corresponding class is the pixel value (y ∈ {0, 1}) in the ideal output image. Thus, although this work uses BF minimization as a learning model, any other model, such as neural networks or support vector machines, could be used. Another comment is that step 2 of the procedure described above could simply be ignored. By doing that, one may have a set of examples with conflicting class labels. As long as the learning algorithm is able to deal with that, there are no further difficulties. The advantage of using BF minimization is that the resulting product terms correspond to interval operators, which can be readily interpreted morphologically [21], [32].
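As a concrete illustration of steps 1 and 2 (and of the classifier view just described), the sketch below estimates P̂(X, y) by sliding the window over training pairs and takes the plug-in decision of Eq. 4. It is a simplified stand-in, not the ISI minimization algorithm of [25]: step 3 (generalization to unobserved patterns) is replaced here by a default output of 0. The helper apply_w_operator from the previous sketch can be used to apply the resulting function.

```python
from collections import defaultdict

# Sketch of steps 1 and 2: frequency estimation and the plug-in MAE-optimal decision.
def train_plugin_bf(pairs, window):
    """pairs: list of (S, I) 2D 0/1 numpy arrays; window: list of (dr, dc) offsets."""
    counts = defaultdict(lambda: [0, 0])              # pattern X -> [count(y=0), count(y=1)]
    drs = [dr for dr, _ in window]
    dcs = [dc for _, dc in window]
    for S, I in pairs:
        h, w = S.shape
        for r in range(max(0, -min(drs)), h - max(0, max(drs))):
            for c in range(max(0, -min(dcs)), w - max(0, max(dcs))):
                X = tuple(S[r + dr, c + dc] for dr, dc in window)
                counts[X][int(I[r, c])] += 1          # step 1: estimate of P(X, y)
    decided = {X: int(n1 > n0) for X, (n0, n1) in counts.items()}  # step 2 (ties -> 0)
    return lambda X: decided.get(X, 0)                # unobserved patterns default to 0
```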
Moreover, for the design of image operators, the space
of operators (class of classifiers) can be constrained based
on prior knowledge about properties of the desired
operator. For instance, in the case of binary operators, a very useful constraint is to consider anti-extensive operators, that is, operators such that Ψ(S) ⊆ S. This property guarantees that the resulting image is a subset of the input image. In this case, in step 1 of the procedure described above, one only needs to slide the window over the foreground pixels of the input image and, in step 3, the minimization procedure can safely assume ψ(X) = 0 for all elements X in the interval [∅, W \ {o}] (i.e., all patterns that do not contain the origin o). As a consequence, the number of patterns considered in the minimization process, as well as the overall processing cost, may be significantly reduced. Another example is the class of increasing operators, i.e., those such that S₁ ⊆ S₂ implies Ψ(S₁) ⊆ Ψ(S₂). In this case, it can be shown that the characteristic BFs are positive [30], [33].
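Under the same illustrative assumptions as the previous sketches, the anti-extensive constraint can be imposed as below: training patterns are collected only at foreground pixels, and any pattern whose origin component is 0 is mapped to 0, which guarantees Ψ(S) ⊆ S. The position of the origin within the window list is an assumption of this sketch.

```python
from collections import defaultdict

# Sketch of training under the anti-extensive constraint (Psi(S) is a subset of S).
def train_antiextensive_bf(pairs, window, origin_index=0):
    """Assumes window[origin_index] == (0, 0)."""
    counts = defaultdict(lambda: [0, 0])
    drs = [dr for dr, _ in window]
    dcs = [dc for _, dc in window]
    for S, I in pairs:
        h, w = S.shape
        for r in range(max(0, -min(drs)), h - max(0, max(drs))):
            for c in range(max(0, -min(dcs)), w - max(0, max(dcs))):
                if S[r, c] == 0:
                    continue                          # slide only over foreground pixels
                X = tuple(S[r + dr, c + dc] for dr, dc in window)
                counts[X][int(I[r, c])] += 1
    decided = {X: int(n1 > n0) for X, (n0, n1) in counts.items()}
    # patterns that do not contain the origin are forced to 0, enforcing anti-extensiveness
    return lambda X: decided.get(X, 0) if X[origin_index] == 1 else 0
```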
As mentioned in the introduction, if we gradually in-
crease the window size, the operators designed on large
windows tend to present poorer performance in terms of
MAE than those designed on moderate size windows.
This is a common phenomenon known as overfitting
(also strongly related to the phenomenon known as
the curse of dimensionality). Therefore, designing im-
age operators based on the procedure described above
requires balancing window size and increase in MAE:
an arbitrarily large window results in larger MAE due
to generalization error while a too small window results
in larger MAE simply because no better MAE can be
achieved due to its inability to discriminate larger pat-
terns.
Finding the best window for a given fixed amount
of training data is still an open problem. Some works
have proposed heuristics to treat this problem [34]. In
the pattern recognition field, this corresponds to the
feature selection problem [35]. Even supposing we are
able to compute the best window for a given amount
of training data, there are situations in which such a
window is not large enough to reach an acceptable
MAE. In these cases, in order to reduce MAE, the only
possibility is to consider a larger space of operators,
one in which operators are based on larger windows. If
the above procedure is used with larger windows, MAE
will be larger due to generalization error (since we are
considering that the amount of training data remains
the same). Thus, the main question now is whether it
is possible to, using the same amount of training data,
design an image operator with smaller empirical MAE.
3 MULTI-LEVEL TRAINING OF MORPHOLOGICAL OPERATORS
The idea of multi-level training is inspired by classifier combination approaches. Distinct operators based on different windows are designed, and then another operator that combines their outcomes is designed. The main question is whether the combination is able to produce better results than any of the individual operators, and also than the operator trained with the window that corresponds to the union of all individual windows.
3.1 The proposed model
To introduce notation, we start by explaining how a two-level approach could be modeled. In the first level, n_1 operators, each one based on its own window W_i, are trained. In the second level, the outcomes of the first-level operators are combined to train a level-2 operator. The resulting operator is a two-level operator. An example is shown in Fig. 4. Three sub-windows W_1, W_2 and W_3 are considered, and three level-1 operators, denoted respectively Ψ^{(1)}_1, Ψ^{(1)}_2 and Ψ^{(1)}_3, are designed based on each of the sub-windows. Realizations of the output image processes Ψ^{(1)}_i(S), i = 1, 2, 3, are the images that will be used for the training of the level-2 operator. Notice that, in this particular case, the union of the three sub-windows is equal to a larger window W. Therefore, by combining the outputs of Ψ^{(1)}_1, Ψ^{(1)}_2 and Ψ^{(1)}_3, the level-2 operator Ψ^{(2)} indirectly makes use of information under W in the input image process S.

[Fig. 4. A two-level operator. Level-1 operators Ψ^{(1)}_1, Ψ^{(1)}_2 and Ψ^{(1)}_3 are based on windows W_1, W_2 and W_3, respectively. Although depicted separately, the input image to all three level-1 operators is the same; what differs is the neighborhood taken as input (shown in dark gray) by each operator. The level-2 operator Ψ^{(2)} takes as input three pixels, one from each output Ψ^{(1)}_i(S), i = 1, 2, 3 (shown in dark gray), of the level-1 operators.]

In the schema shown in Fig. 4, exactly one pixel, the one at the same location as the target pixel, is taken from each level-1 operator output in the second-level training. However, in a more general setting, more than one pixel of each of the level-1 outcomes could be considered. Moreover, some pixels of the original input S could also be considered in the level-2 operators. These considerations lead to the general multi-level training approach described next; a small training sketch for the two-level case of Fig. 4 is given below.
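The sketch below is a minimal illustration of this two-level scheme, reusing the illustrative helpers apply_w_operator and train_plugin_bf from the sketches in Section 2 (all names are assumptions of the sketch, not the implementation used in this work): level-1 operators are trained on sub-windows with one part of the data, and the level-2 operator is trained on single pixels of their outputs with the remaining part.

```python
import numpy as np
from collections import defaultdict

# Sketch of the two-level scheme of Fig. 4, built on the Section 2 sketches.
def train_two_level(level1_pairs, level2_pairs, subwindows):
    # Level 1: one plug-in Boolean function per sub-window.
    level1_bfs = [train_plugin_bf(level1_pairs, w) for w in subwindows]

    def level1_outputs(S):
        # apply every level-1 operator to the same input image; stack the results
        return np.stack([apply_w_operator(S, w, psi)
                         for w, psi in zip(subwindows, level1_bfs)], axis=-1)

    # Level 2: the pattern at a pixel is the vector of level-1 outputs at that
    # same pixel (i.e., W^(2)_1(S^(1)_i) = {o}), trained on held-out image pairs.
    counts = defaultdict(lambda: [0, 0])
    for S, I in level2_pairs:
        stack = level1_outputs(S)
        for r in range(stack.shape[0]):
            for c in range(stack.shape[1]):
                counts[tuple(stack[r, c])][int(I[r, c])] += 1
    level2_bf = {X: int(n1 > n0) for X, (n0, n1) in counts.items()}

    def apply(S):
        stack = level1_outputs(S)
        out = np.zeros(S.shape, dtype=int)
        for r in range(stack.shape[0]):
            for c in range(stack.shape[1]):
                out[r, c] = level2_bf.get(tuple(stack[r, c]), 0)
        return out

    return apply
```

Splitting the training images between the two levels, as done here, mirrors the protocol of Section 4.1, where one part of the training data is used for the level-1 operators and the remainder for the second level.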
3.1.1 Model flexibility
In general, the input to an operator at any level may come from the outputs of any previous levels, including the initial input data. Therefore, we denote by n_l the number of operators at level l and by Ψ^{(l)}_i the i-th operator at that level, i = 1, 2, ..., n_l. The input of Ψ^{(l)}_i is defined by a set of windows W^{(l)}_i = {W^{(l)}_i(S^{(t)}_j) | 0 ≤ t < l and, for each t, 1 ≤ j ≤ n_t}. A window W^{(l)}_i(S^{(t)}_j) indicates that a neighborhood of the j-th output of level t is part of the input of Ψ^{(l)}_i. Output of level 0, as in W^{(l)}_i(S^{(0)}_1), corresponds to the original input image.
Figure 5 shows a diagram representation of a general three-level operator. At level 1, there are 3 operators, Ψ^{(1)}_1, Ψ^{(1)}_2 and Ψ^{(1)}_3. They are based, respectively, on windows W^{(1)}_1(S^{(0)}_1), W^{(1)}_2(S^{(0)}_1) and W^{(1)}_3(S^{(0)}_1). Window W^{(1)}_1(S^{(0)}_1) indicates that part of the input of Ψ^{(1)}_1 comes from output 1 of level 0 (that is, the original input data S^{(0)}_1). At the second level, there are two operators, Ψ^{(2)}_1 and Ψ^{(2)}_2. The former is assigned four windows, namely W^{(2)}_1(S^{(0)}_1), W^{(2)}_1(S^{(1)}_1), W^{(2)}_1(S^{(1)}_2) and W^{(2)}_1(S^{(1)}_3). That means that part of the input data comes from S^{(0)}_1, part from S^{(1)}_1, part from S^{(1)}_2 and part from S^{(1)}_3. The input size of Ψ^{(2)}_1 is given by |W^{(2)}_1(S^{(0)}_1)| + |W^{(2)}_1(S^{(1)}_1)| + |W^{(2)}_1(S^{(1)}_2)| + |W^{(2)}_1(S^{(1)}_3)|. At the third level, parts of every output of the previous levels, plus the original input, can be taken as input. The window size of an operator Ψ^{(l)}_i is given by

\[ \sum_{k=0}^{l-1} \sum_{j=1}^{n_k} \left| W^{(l)}_i\!\left(S^{(k)}_j\right) \right|. \tag{5} \]

From a practical point of view, most of the windows involved in the schema should be empty or very small.
Otherwise, a large pattern would be formed to train the higher-level operators, and that may result in overfitting.

[Fig. 5. A schema for three-level learning of image operators: one input process, three level-1 operators, two level-2 operators, and one level-3 operator. The output is the process represented by S^{(3)}_1.]

[Fig. 6. The result of the operator at any target pixel depends on a neighborhood larger than the ones defined by the individual windows.]
3.1.2 Window support
Since multi-level operators are sequential compositions of operators, they are based on windows that are usually larger than the windows of their components. To understand how large the window of the final operator is, recall that the Minkowski sum of two sets A and B is given by

\[ A \oplus B = \{x + y \mid x \in A \text{ and } y \in B\}. \tag{6} \]

To simplify notation, the one-dimensional domain example shown in Fig. 6 is considered. Suppose the target location is x = x_0. There are three level-1 operators, characterized respectively by BFs ψ^{(1)}_1, ψ^{(1)}_2 and ψ^{(1)}_3. Their windows, W^{(1)}_1, W^{(1)}_2 and W^{(1)}_3, are all 3-point, with origin at the rightmost, center and leftmost pixels, respectively. The second-level operator Ψ^{(2)} takes one pixel of each of the level-1 operators' outcomes. Thus, we have Ψ = Ψ^{(2)}(Ψ^{(1)}_1, Ψ^{(1)}_2, Ψ^{(1)}_3), i.e.,

\[ [\Psi(S)](x) = \psi^{(2)}\big([\Psi^{(1)}_1(S)](x),\, [\Psi^{(1)}_2(S)](x),\, [\Psi^{(1)}_3(S)](x)\big) = \psi^{(2)}(y_1, y_2, y_3), \tag{7} \]

where y_1 = ψ^{(1)}_1(S(x_{-2}), S(x_{-1}), S(x_0)), y_2 = ψ^{(1)}_2(S(x_{-1}), S(x_0), S(x_1)), and y_3 = ψ^{(1)}_3(S(x_0), S(x_1), S(x_2)). Therefore, the result of the final operator at location x depends on the set of pixels of S at {x_{-2}, x_{-1}, x_0, x_1, x_2} (pixel x plus two neighboring pixels on both the left and the right side of x), which can be expressed via Minkowski sums as ({x_{-2}, x_{-1}, x_0} ⊕ {0}) ∪ ({x_{-1}, x_0, x_1} ⊕ {0}) ∪ ({x_0, x_1, x_2} ⊕ {0}). Thus, although each level-1 operator depends on a 3-point window, and the second operator depends on one pixel of each level-1 operator's outcome, the composed final operator Ψ depends (indirectly) on a 5-point window.

When the level-2 operator considers pixels beyond the one at the target location, a larger window dependence results. In Fig. 7, an example similar to the one in Fig. 6 is shown.

[Fig. 7. The result of the operator at any target pixel depends on a neighborhood larger than the ones defined by the individual windows.]

The level-2 operator Ψ^{(2)} takes as input the pixels at x_{-1} from S^{(1)}_1, at x_0 from S^{(1)}_2, and at x_1 from S^{(1)}_3. The dependence can be expressed as

\[ [\Psi(S)](x) = \psi^{(2)}\big([\Psi^{(1)}_1(S)](x_{-1}),\, [\Psi^{(1)}_2(S)](x_0),\, [\Psi^{(1)}_3(S)](x_1)\big) = \psi^{(2)}(y_1, y_2, y_3), \tag{8} \]

where y_1 = ψ^{(1)}_1(S(x_{-3}), S(x_{-2}), S(x_{-1})), y_2 = ψ^{(1)}_2(S(x_{-1}), S(x_0), S(x_1)), and y_3 = ψ^{(1)}_3(S(x_1), S(x_2), S(x_3)). In this case, the dependence is given by the 7-point neighborhood {x_{-3}, x_{-2}, x_{-1}, x_0, x_1, x_2, x_3}, which can be expressed as ({x_{-2}, x_{-1}, x_0} ⊕ {-1}) ∪ ({x_{-1}, x_0, x_1} ⊕ {0}) ∪ ({x_0, x_1, x_2} ⊕ {1}).
We define the window support of an operator with respect to an input process as the input-image pixel neighborhood that is taken into consideration by the operator (directly or indirectly) to produce the output pixel value at any arbitrary location. Supposing only one input process S^{(0)}_1, level-1 operators depend solely on the initial input data S^{(0)}_1. Thus, the window support of any level-1 operator Ψ^{(1)}_i is the window W^{(1)}_i = W^{(1)}_i(S^{(0)}_1). For operators of level 2, input may come from outputs of level-1 operators as well as from the initial input S^{(0)}_1. Thus, the window support of a level-2 operator Ψ^{(2)}_i is W^{(2)}_i = ⋃_{j=1}^{n_1} (W^{(1)}_j ⊕ W^{(2)}_i(S^{(1)}_j)) ∪ W^{(2)}_i(S^{(0)}_1).

In the following, we present a recurrent formula that describes the window support W^{(l)}_i of the i-th operator at level l, with respect to an input process S^{(0)}_a, a = 1, ..., n_0:

\[ W^{(1)}_i = W^{(1)}_i\big(S^{(0)}_a\big), \quad i = 1, 2, \ldots, n_1, \]
\[ W^{(l)}_i = W^{(l)}_i\big(S^{(0)}_a\big) \cup \bigcup_{k=1}^{l-1} \bigcup_{j=1}^{n_k} \Big[ W^{(k)}_j \oplus W^{(l)}_i\big(S^{(k)}_j\big) \Big] \tag{9} \]

for any l > 1 and 1 ≤ i ≤ n_l.
As an example, let us compute the window support of the operator given by the architecture shown in Fig. 7. We know that W^{(1)}_j = W^{(1)}_j(S^{(0)}_1). More precisely, W^{(1)}_1(S^{(0)}_1) = {-2, -1, 0} (origin at the right extreme), W^{(1)}_2(S^{(0)}_1) = {-1, 0, +1} (origin at the center) and W^{(1)}_3(S^{(0)}_1) = {0, +1, +2} (origin at the left extreme). Thus, since l = 2 and n_1 = 3 (and W^{(2)}_1(S^{(0)}_1) = ∅ in this architecture),

\[
\begin{aligned}
W^{(2)}_1 &= W^{(2)}_1\big(S^{(0)}_1\big) \cup \bigcup_{k=1}^{1}\bigcup_{j=1}^{n_k}\Big[W^{(k)}_j \oplus W^{(2)}_1\big(S^{(k)}_j\big)\Big] \\
          &= W^{(2)}_1\big(S^{(0)}_1\big) \cup \bigcup_{j=1}^{3}\Big[W^{(1)}_j \oplus W^{(2)}_1\big(S^{(1)}_j\big)\Big] \\
          &= W^{(2)}_1\big(S^{(0)}_1\big) \cup \Big[W^{(1)}_1 \oplus W^{(2)}_1\big(S^{(1)}_1\big)\Big] \cup \Big[W^{(1)}_2 \oplus W^{(2)}_1\big(S^{(1)}_2\big)\Big] \cup \Big[W^{(1)}_3 \oplus W^{(2)}_1\big(S^{(1)}_3\big)\Big] \\
          &= (\{-2,-1,0\} \oplus \{-1\}) \cup (\{-1,0,+1\} \oplus \{0\}) \cup (\{0,+1,+2\} \oplus \{+1\}) \\
          &= \{-3,-2,-1\} \cup \{-1,0,+1\} \cup \{+1,+2,+3\} \\
          &= \{-3,-2,-1,0,+1,+2,+3\},
\end{aligned}
\]

which is the 7-point window centered at the origin.
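The recurrence of Eq. 9 can also be evaluated with plain set operations; the sketch below (an illustration, not part of the original work) reproduces the Fig. 7 architecture with 1-D windows represented as sets of integer offsets.

```python
# Sketch of the window support recurrence (Eq. 9) using Minkowski sums of
# windows represented as Python sets of integer offsets.
def minkowski_sum(a, b):
    return {x + y for x in a for y in b}

def window_support(component_windows, supports_prev):
    """component_windows: dict (level, j) -> window W^(l)_i(S^(level)_j), with
    level 0 denoting the original input; supports_prev: dict (level, j) ->
    window support of operator j at that level."""
    support = set(component_windows.get((0, 1), set()))          # part taken from S^(0)_1
    for (k, j), w in component_windows.items():
        if k >= 1:
            support |= minkowski_sum(supports_prev[(k, j)], w)   # W^(k)_j (+) W^(l)_i(S^(k)_j)
    return support

# Fig. 7 example: three 3-point level-1 windows with different origins, and a
# level-2 operator taking one pixel at offsets -1, 0, +1 from their outputs.
level1_supports = {(1, 1): {-2, -1, 0}, (1, 2): {-1, 0, 1}, (1, 3): {0, 1, 2}}
level2_components = {(1, 1): {-1}, (1, 2): {0}, (1, 3): {1}}
print(sorted(window_support(level2_components, level1_supports)))
# -> [-3, -2, -1, 0, 1, 2, 3], the 7-point window computed in the text
```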
3.2 Relation to Stacked Generalization
In the field of machine learning, a multi-level training
approach known as stacked generalization was proposed
by Wolpert [28]. Multiple levels of training are performed, such that at the initial level some classifiers are obtained from the training data and at the other levels they are obtained from the outputs of the classifiers of the previous levels, possibly together with part of the original input data. While usual approaches consider simple
combination strategies like majority vote, in stacked
generalization the proposal is to perform another level of
training, precisely to learn how to combine the outcomes
of the classifiers of the previous levels.
Despite the similarities of the multi-level design proposed here with stacked generalization, one difference should be pointed out. Classifiers usually generate only one output, namely the class to be assigned to the input pattern. Therefore, when combining the outputs of the previous levels to form the patterns for the current level, at most one outcome from each classifier of the previous levels can be taken. With images, since the outcome of a previous-level operator is an image, information from neighboring pixels can also be taken, and thus richer combinations are possible (at the expense of generating higher-dimensional patterns).
3.3 Relation to iterative design
In iterative training [9], a sequence of operators that aim to successively refine the previous result is designed as follows. Suppose the initial training data set is given by pairs of images of the form (S_i, I_i) (which are realizations of the random processes S^{(0)} and I). In the first level of training, a W^{(1)}-operator Ψ^{(1)} is obtained in such a way as to minimize E[|Ψ^{(1)}(S^{(0)}) - I|], the MAE between the transformed image and its respective ideal image. In the second level of training, pairs of the form (Ψ^{(1)}(S_j), I_j) are considered for training. After k levels of training, the final operator consists of the composition Ψ(S) = Ψ^{(k)}(Ψ^{(k-1)}( ... (Ψ^{(2)}(Ψ^{(1)}(S))) ... )).

This is a particular case of the proposed schema, with only one operator per level. For three levels, the general model shown in Fig. 5 reduces to the one shown in Fig. 8.

[Fig. 8. Example of the specialization of the general schema of Fig. 5 to the iterative case, with only one operator per level: S^{(0)} → Ψ^{(1)} → S^{(1)} → Ψ^{(2)} → S^{(2)} → Ψ^{(3)} → S^{(3)}, with windows W^{(1)}, W^{(2)} and W^{(3)}.]

Considering that W^{(1)} = W^{(1)}(S^{(0)}), and since n_l = 1 for all l, the window support of any operator Ψ^{(l)} reduces to

\[ W^{(l)} = W^{(l-1)} \oplus W^{(l)}\big(S^{(l-1)}\big). \tag{10} \]

Expanding it, we have that W^{(k)} = W^{(1)} ⊕ W^{(2)} ⊕ ... ⊕ W^{(k)}. Notice that the union ⋃_{k=1}^{l-1} in the original recurrent formula becomes redundant here because W^{(k)} ⊆ W^{(k+1)} for any k.
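As a concrete instance of this expansion, two iterations of operators defined on 9 × 9 windows centered at the origin yield a window support of (9 × 9) ⊕ (9 × 9), i.e., a 17 × 17 window; this is precisely the support of the operators compared in Table 3 of Section 4.2.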
4 MODEL EVALUATION
The model presented in the previous section is flexible enough to allow several variations in the training architecture, namely:
• the number of levels of training,
• the number of classifiers in each level, and
• the inputs of each classifier (window sizes and respective training images).
Some of the possibilities have been experimentally
evaluated and the main results are reported in this
section. In this work, the number of input processes has been fixed to one (i.e., n_0 = 1).
4.1 Simple two-level operators
In order to check whether combination (i.e., two-level
operators) provides advantages over single-level oper-
ators, several experiments with distinct data sets have
been carried out. The training architectures used in these
experiments are shown in Fig. 9. For each experiment,
a window W and n_1 subwindows W^{(1)}_1, ..., W^{(1)}_{n_1} of W have been selected. A window W known from previous experience to be one that yields good results for the single-level operator has been chosen whenever such information was available. A single-level operator has been designed with respect to W. The n_1 level-1 operators of the two-level operator have been designed with respect to the subwindows W^{(1)}_i and then combined by the level-2 operator, which takes one pixel from each of the outcomes of the level-1 operators (i.e., W^{(2)}_1(S^{(1)}_i) = {o}, i = 1, ..., n_1).

[Fig. 9. Basic training architecture: one-level operator (left) and two-level operator with n_1 level-1 operators Ψ^{(1)}_1, ..., Ψ^{(1)}_{n_1} (right).]
For each data set, a fixed number of pairs of training
images and an independent set of test images have been
considered. For the training of single-level operators,
all training images were used, while for the training of
the two-level operators one part has been used for the
level-1 operators and the remainder for the second-level
operators. The same test images have been used for both
cases.
Table 1 describes the data sets used in these exper-
iments. Its first column presents a brief description of
the data set, in terms of the processing task. The second
column presents the total number of training images, followed by how they were distributed between the two levels of training. For instance, in the first row, 8 (5:3) means that a total of 8 images were used for the training of the one-level operator, while for the two-level operator 5 images were used in the level-1 training and 3 in the second-level training. The third column of the table indicates how many images were used for testing. The images used are not necessarily of the same size, but all images in a data set have been obtained from a common context using a common acquisition procedure (scanning parameters, thresholding parameters, etc.).
TABLE 1
Data sets used for training and estimation of MAE of two-level operators.

Description | # Training images | # Test images
A. Functional diagrams (circular object segmentation) | 8 (5:3) | 10
A′. Functional diagrams (dashed box segmentation) | 8 (5:3) | 10
A″. Functional diagrams (character segmentation) | 8 (5:3) | 10
B. Texture segmentation | 3 (2:1) | 2
C. Character segmentation | 10 (6:4) | 10
D. Text segmentation (magazine pages) | 5 (3:2) | 5
E. Text segmentation (book pages) | 5 (3:2) | 5
F. Boolean noise filtering | 5 (3:2) | 5
A summary of the results obtained for the different data sets is presented in Table 2. Each experiment is identified according to the respective data set used. For instance, Experiments C1 and C2 refer to data set C, and the indices 1 and 2 indicate experiments with distinct windows W. Column |W| indicates the size of W, while column n_1 indicates the number of level-1 operators used in the two-level operator. Training time is given in seconds, and MAE corresponds to the empirical MAE on the test images (relative number of pixels in the absolute difference between the operator result and the expected ideal image, averaged over the total of test images). Missing values in the training time and MAE fields of the one-level operators indicate that their training time had far exceeded the training time of the corresponding two-level operators when their execution was aborted. The subwindows used in each of the experiments are shown in Fig. 10.

With the exception of Experiment D1, for the cases in which both single-level and two-level operators have been designed, the latter present better performance both in terms of MAE and training time. Experiment D1, compared to Experiment D2, indicates that, if the subwindows are not large enough, two-level operators do not have better performance than single-level ones in terms of MAE. By using larger subwindows (Experiment D2), a two-level operator with better MAE is obtained, while the training time for the single-level operator on the corresponding support window proved to be prohibitively large.
TABLE 2
Training time (in seconds) and empirical MAE for different experiments (MAE averaged over the total of test images). Training time based on a CPU AMD Athlon 64 X2 4200 2.2 GHz, with 3 GB RAM.

Exp. | |W| | n_1 | #train. pixels | #test pixels | One-level train. time | One-level MAE | Two-level train. time | Two-level MAE
A  | 9 × 9   | 6 | 85967   | 102883  | 2701   | 0.015 | 1288  | 0.008
A′ | 25 × 25 | 6 | 85940   | 102883  | -      | -     | 420   | 0.025
A″ | 9 × 9   | 6 | 85967   | 102883  | 12615  | 0.06  | 6883  | 0.04
B  | 9 × 9   | 8 | 230533  | 104399  | 677476 | 0.07  | 169   | 0.04
C1 | 9 × 7   | 5 | 193445  | 197458  | 207    | 0.009 | 195   | 0.006
C2 | 11 × 9  | 5 | 193319  | 197474  | 33190  | 0.010 | 839   | 0.004
D1 | 9 × 7   | 5 | 1049540 | 783834  | 45518  | 0.040 | 20888 | 0.046
D2 | 11 × 11 | 7 | 1047219 | 783834  | -      | -     | 62118 | 0.031
E  | 11 × 11 | 7 | 176368  | 493755  | -      | -     | 48760 | 0.004
F  | 9 × 9   | 5 | 1270080 | 1260020 | 9828   | 0.006 | 662   | 0.003

[Fig. 10. Subwindows (shown as black circles) used in the level-1 operators in the experiments described in Table 2 (panels for A/A″, C1/D1, A′, C2, D2/E, B and F). The square at the center indicates the origin.]
Some test images and respective one-level and two-
level operator results are shown in the appendix.
To understand how level-2 operators combine their input data, Experiment D2 has been examined. The corresponding BF is composed of sixteen product terms (intervals) of the form [A, 1111111], A ∈ {0, 1}^7, where each of the seven components comes from the output of one of the level-1 operators. The extremities A are: 1111100, 1111001, 1110101, 1101101, 1101101, 0111101, 1110010, 1100011, 1010110, 1001011, 0111010, 0110110, 0101110, 0101011, 0011110, 0010111. A careful analysis shows that, for a pixel to be classified as 1 in the output, it must receive at least four votes from the level-1 operators. However, this condition is not sufficient; there are cases in which five votes are necessary. A curious fact is that the second operator in the first level seems to be a key element in determining when four or five votes are necessary. The effect of the level-1 operators can be seen in Fig. 11.

[Fig. 11. Experiment D2: input image, detail of the superposition of the level-1 operator results (the darkness of a pixel indicates the number of operators that assigned output 1 to that pixel), and resulting output image, respectively.]
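The vote-counting observation can be checked directly from the listed extremities (a small verification sketch, using the extremities exactly as listed above, including the repeated entry): ψ(X) = 1 exactly when X covers some extremity componentwise, so enumerating all 2^7 input patterns gives the minimum number of votes required and shows that four votes are not always sufficient.

```python
from itertools import product

# Extremities of the Experiment D2 level-2 BF, as listed in the text above.
extremities = ["1111100", "1111001", "1110101", "1101101", "1101101", "0111101",
               "1110010", "1100011", "1010110", "1001011", "0111010", "0110110",
               "0101110", "0101011", "0011110", "0010111"]

def psi2(x):  # x: tuple of 7 bits, one vote per level-1 operator
    return int(any(all(b >= int(a) for b, a in zip(x, A)) for A in extremities))

votes_when_one = [sum(x) for x in product((0, 1), repeat=7) if psi2(x)]
print(min(votes_when_one))                       # minimum number of votes for output 1 (4)
print(any(sum(x) == 4 and psi2(x) == 0           # some 4-vote patterns are still mapped to 0,
          for x in product((0, 1), repeat=7)))   # so four votes are not sufficient (True)
```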
4.2 Other architectures
According to the model, to combine previous-level operators, more than one pixel value from each of the previous-level results can be taken. The examples presented above may give the false impression that taking only one pixel from each image is the way the combination should be done. The following example shows a situation in which taking two pixels, instead of only one, from each input image results in better performance.

Experiment B has been repeated taking two pixels, instead of only one, from each output of the eight level-1 operators. The two-point windows W^{(2)}_1(S^{(1)}_i), i = 1, 2, ..., 8, are shown in Fig. 12.

[Fig. 12. Training architecture variation: windows used in the second level to combine the eight level-1 operator outputs.]

This experiment was repeated nine times, considering different partitions of the five images into subsets of three training images and two test images. Training considering one pixel from each input image in the second level resulted in operators with performance 30% worse on average than the corresponding operators that considered two pixels from each input image. The smallest difference against the one-pixel operator was 7.2%, while the largest reached 118%.
Another variation that has been tested is iterative training, as described in Section 3.3. In general, the first iteration produces a significant MAE drop, and then the decrease in MAE tends to diminish after the second iteration, starting to oscillate or even increase slightly as the number of iterations increases. Figure 13 shows an example of a sequence of images obtained by increasing the number of iterations, using windows of size 9×9, 5×9, 5×3 and 3×3, respectively. It can be seen that at each iteration the resulting image approximates the ideal one. In general, the quality of convergence depends on the amount of training data and on the sequence of windows used. See more details, for example, in [9]. A typical MAE curve as the number of iterations increases is shown in Fig. 14. The bold line is the average MAE and the light gray lines correspond to the MAE with respect to 10 test images for the above example.
[Fig. 13. From top to bottom and left to right: input and ideal image, followed by the outputs of four iterations with windows of size 9×9, 5×9, 5×3 and 3×3, respectively.]

[Fig. 14. MAE evolution through iterations (iteration number versus MAE): average MAE (in bold) and MAE with respect to 10 test images for the example illustrated in Fig. 13.]

As another way to explore the possibility of variations in the training architecture provided by the model, rather than simply iterating operators sequentially, it is possible to iterate two-level operators. A concrete example of such an architecture is shown in Fig. 15. This is a four-level operator that can also be understood as a two-level iteration of two-level operators. The window support of the level-4 operator Ψ^{(4)} is (9 × 9) ⊕ (9 × 9). In order to compare its performance with an operator of equivalent window support, an iterative design with two levels, both iterations on a 9 × 9 window, has been carried out. Table 3 shows the performance of these operators. This experiment was repeated three times, using different distributions of the training data between the levels. In the three cases, the result obtained was consistent with the one presented in Table 3, that is, the iteration of two-level operators performed better than the iteration of single-level operators.

[Fig. 15. Four-level architecture: n_1 = 6, n_2 = 1, n_3 = 5, and n_4 = 1. The windows of the level-1 operators are the same as in Experiment A; the level-2 operator takes one pixel from each of the level-1 outcomes; the windows of the level-3 operators are rectangles of size 5×9 or 9×5 within the 9×9 window; the level-4 operator takes one pixel from each of the level-3 outcomes.]

TABLE 3
Performance of a four-level operator (Ψ) and of a two-iteration operator (Φ = Φ^{(2)} ∘ Φ^{(1)}), both with window support 17 × 17, on data set A (8 training and 10 test images).

Operator | Error pixels
Φ^{(1)} | 1760
Φ = Φ^{(2)} ∘ Φ^{(1)} | 628
Ψ^{(1)}_i, i = 1, 2, ..., 6 | 2110 ± 110
Ψ^{(2)} | 677
Ψ^{(3)}_i, i = 1, 2, ..., 5 | 409 ± 30
Ψ^{(4)} | 294
5 CONCLUDING REMARKS
A model for multi-level training of morphological operators based on large neighborhood windows has been proposed. Experimental results show that two-level operators consistently outperform single-level operators, both in terms of MAE and processing time. They also show that multi-level training, by iterating two-level operators, is an effective way of obtaining better results than the usual iterative design techniques.
In order to understand why combining several smaller
window operators results in better performance than just
training an operator on a large window, it is convenient
to look back to the U-shaped error curve presented in
the introduction of this work. According to the typ-
ical behavior, those error curves present a relatively
flat minimum region, corresponding to the windows of
operators with best error performance in test images. If
one uses these windows for the level-1 operators, then
performance similar to the best obtained by a single-level
operator is guaranteed because one could just choose
the level-1 operator with best performance as the output
of the level-2 operator. Thus, it is reasonable to expect
that, instead of just choosing the output of one of the
level-1 operators, if one decides the output based on a
second level of training (from the outputs of the level-1
operators), results should be no worse.
In all experiments reported in this work, a subset of windows for the level-1 operators that resulted in two-level operators with improved performance was found without great effort. The effectiveness of two-level and iterated two-level operators shows the usefulness of the proposed model and justifies further investigation of issues related to the choice of the training architecture, namely the choices concerning the number of levels of training, the number of operators in each level, and their respective windows and training data. With regard to the choice of windows, so far experimental results show that taking just one pixel value from each of the previous-level operator results is often a good choice. Although there exist cases in which taking more pixel values is better, too many pixels should not be considered, mainly when the amount of training data is limited, because that may lead to overfitting. In practice, there
seems to be a tradeoff between the number of pixels
to be taken from each of the outputs of the previous
levels and the number of operators to be combined.
With respect to the number of levels, in the iterative
design approach one should stop iterating as soon as
the error stops decreasing. However, that may depend
on the windows used in each iteration. In the current
state, fully or partially automating choices concerning
all these issues may be considered a real challenge.
Another issue to be investigated is whether additional input processes that do not necessarily reflect geometry and topology (shape features that are captured by windows), but other features like color, texture, or even geometrical and topological features not easily captured by the windows (such as area, size, presence or absence of holes in a component, etc.), could improve the performance of the multi-level operator.
The proposed model may be used to process high-
resolution images by considering windows that have the
effect of sub-sampling the images in different ways. It is
one of our next aims to relate the proposed approach to
the multi-resolution design of morphological operators.
Extension of the proposed approach for the design of
gray-scale morphological operators is also possible. In
the gray-scale case the effects of overfitting are more crit-
ical than in the binary case. Thus, it is expected that good
performance improvement would be achieved by multi-
level operators. However, designing gray-scale operators
is computationally hard and demands a larger amount
of training data. Thus, before tackling multi-level design, single-level design needs further experimentation.
Finally, from a more formal point of view, this ap-
proach and others that consider combination of oper-
ators may be framed in the context of function decom-
position. Given a discrete function, decomposing it as
a composition of functions that depend on a smaller
number of inputs each is a classical problem. In the
context of morphological operator design, an interesting
question is to find out which operators can be obtained by a given training architecture or, conversely, given a class of image operators, to find out whether there is a multi-level training architecture that corresponds to that class.
APPENDIX
RESULTS FOR TEST IMAGES
Test images and the respective results obtained by the designed operators are presented for some of the experiments described in Section 4 (Figures 16 to 21). These and additional images can be found at the web site http://www.vision.ime.usp.br/nonlinear/multilevel.
ACKNOWLEDGMENTS
This work has been supported by FAPESP through pro-
cess 2004/11586-7. N. S. T. Hirata is partially supported
by CNPq, Brazil, under grant 312482/2006-0.
REFERENCES
[1] G. Matheron, Random Sets and Integral Geometry. John Wiley, 1975.
[2] J. Serra, Image Analysis and Mathematical Morphology. Academic
Press, 1982.
[3] P. Soille, Morphological Image Analysis, 2nd ed. Berlin: Springer-
Verlag, 2003.
[4] R. M. Haralick, S. R. Sternberg, and X. Zhuang, “Image Analysis
Using Mathematical Morphology,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. PAMI-9, no. 4, pp. 532–550,
July 1987.
[5] E. R. Dougherty and R. A. Lotufo, Hands-on Morphological Image
Processing. SPIE Press, 2003.
[6] I. Tăbuş, D. Petrescu, and M. Gabbouj, "A Training Framework for Stack and Boolean Filtering: Fast Optimal Design Procedures and Robustness Case Study," IEEE Transactions on Image Processing, vol. 5, no. 6, pp. 809–826, June 1996.
[7] N. R. Harvey and S. Marshall, “The Use of Genetic Algorithms in
Morphological Filter Design,” Signal Processing: Image Communica-
tion, vol. 8, no. 1, pp. 55–71, January 1996.
[8] J. Barrera, E. R. Dougherty, and N. S. Tomita, “Automatic Program-
ming of Binary Morphological Machines by Design of Statistically
Optimal Operators in the Context of Computational Learning
Theory,” Electronic Imaging, vol. 6, no. 1, pp. 54–67, January 1997.
[9] N. S. T. Hirata, E. R. Dougherty, and J. Barrera, “Iterative Design
of Morphological Binary Image Operators,” Optical Engineering,
vol. 39, no. 12, pp. 3106–3123, December 2000.
[10] R. Hirata Jr., M. Brun, J. Barrera, and E. R. Dougherty, “Mul-
tiresolution Design of Aperture Operators,” Journal of Mathematical
Imaging and Vision, vol. 6, no. 3, pp. 199–222, 2002.
[11] J. Yoo, K. L. Fong, J.-J. Huang, E. J. Coyle, and G. B. Adams III,
“A Fast Algorithm for Designing Stack Filters,” IEEE Transactions
on Image Processing, vol. 8, no. 8, pp. 1014–1028, August 1999.
[12] E. J. Coyle and J.-H. Lin, “Stack Filters and the Mean Absolute
Error Criterion,” IEEE Transactions on Acoustics, Speech and Signal
Processing, vol. 36, no. 8, pp. 1244–1254, August 1988.
[13] D. Dellamonica Jr., P. J. S. Silva, C. Humes Jr., N. S. T. Hirata,
and J. Barrera, “An Exact Algorithm for Optimal MAE Stack Filter
Design,” IEEE Transactions on Image Processing, vol. 16, no. 2, pp.
453–462, 2007.
[14] I. Yoda, K. Yamamoto, and H. Yamada, “Automatic Acquisition
of Hierarchical Mathematical Morphology Procedures by Genetic
Algorithms,” Image and Vision Computing, vol. 17, no. 10, pp. 749–
760, August 1999.
[15] M. I. Quintana, R. Poli, and E. Claridge, “Morphological al-
gorithm design for binary images using genetic programming,”
Genetic Programming and Evolvable Machines, vol. 7, no. 1, pp. 81–
102, 2006.
[16] R. Hirata Jr., E. R. Dougherty, and J. Barrera, “Aperture Filters,”
Signal Processing, vol. 80, no. 4, pp. 697–721, April 2000.
[17] P. Salembier, “Structuring element adaptation for morphological
filters,” Visual Communication and Image Representation, vol. 3, no. 2,
pp. 115–136, 1992.
[18] G. J. F. Banon and J. Barrera, “Decomposition of Mappings
between Complete Lattices by Mathematical Morphology, Part I.
General Lattices,” Signal Processing, vol. 30, pp. 299–327, 1993.
[19] H. J. A. M. Heijmans, Morphological Image Operators. Boston:
Academic Press, 1994.
[20] J. Barrera, R. Terada, R. Hirata Jr, and N. S. T. Hirata, “Automatic
Programming of Morphological Machines by PAC Learning,” Fun-
damenta Informaticae, vol. 41, no. 1-2, pp. 229–258, January 2000.
[21] G. J. F. Banon and J. Barrera, “Minimal Representations for
Translation-Invariant Set Mappings by Mathematical Morphol-
ogy,” SIAM J. Applied Mathematics, vol. 51, no. 6, pp. 1782–1798,
December 1991.
[22] J. Barrera and G. P. Salas, “Set Operations on Closed Intervals and
Their Applications to the Automatic Programming of Morpholog-
ical Machines,” Electronic Imaging, vol. 5, no. 3, pp. 335–352, July
1996.
[23] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical
Learning. Springer-Verlag, 2001.
[24] T. M. Mitchell, Machine Learning, ser. McGraw-Hill Series in
Computer Science. McGraw-Hill, 1997.
[25] N. S. T. Hirata, J. Barrera, R. Terada, and E. R. Dougherty, “The
Incremental Splitting of Intervals Algorithm for the Design of
Binary Image Operators,” in Proceedings of the 6th International
Symposium: ISMM 2002, H. Talbot and R. Beare, Eds., 2002, pp.
219–228.
[26] E. R. Dougherty and J. Barrera, “Prior information in the design of
optimal binary filters,” in International Symposium on Mathematical
Morphology, ser. Mathematical Morphology and its applications to
Image and Signal Processing, 1998, pp. 259–266.
[27] L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algo-
rithms. Wiley, 2004.
[28] D. H. Wolpert, “Stacked generalization,” Neural Networks, vol. 5,
pp. 241–259, 1992.
[29] N. S. T. Hirata, “Binary image operator design based on stacked
generalization,” in Proceedings of the SIBGRAPI 2005, A. C. Frery
and M. A. F. Rodrigues, Eds., 2005, pp. 63–70.
[30] E. R. Dougherty, “Optimal Mean-Square N-Observation Digital
Morphological Filters I. Optimal Binary Filters,” CVGIP: Image
Understanding, vol. 55, no. 1, pp. 36–54, January 1992.
[31] F. J. Hill and G. R. Peterson, Computer Aided Logical Design with
Emphasis on VLSI, 4th ed. John Wiley & Sons, 1993.
[32] E. R. Dougherty and J. Barrera, “Logical Image Operators,” in
Nonlinear Filters for Image Processing, E. R. Dougherty and J. T.
Astola, Eds. Bellingham: SPIE and IEEE Press, 1999, pp. 1–60.
[33] N. S. T. Hirata, E. R. Dougherty, and J. Barrera, “A Switching
Algorithm for Design of Optimal Increasing Binary Filters Over
Large Windows,” Pattern Recognition, vol. 33, no. 6, pp. 1059–1081,
June 2000.
[34] D. C. Martins Jr., R. M. Cesar Jr., and J. Barrera, “W-operator
window design by minimization of mean conditional entropy,”
Pattern Analysis and Applications, vol. 9, pp. 139–153, 2006.
[35] A. Jain and D. Zongker, “Feature Selection: Evaluation, Ap-
plication, and Small Sample Performance,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 19, no. 2, pp. 153–158,
February 1997.
Fig. 16. Experiment A (circular element recognition from images scanned at 100 dpi from a book): test, ideal, one-level operator, and two-level operator images, from top to bottom and left to right, respectively.
Fig. 17. Experiment B (map region extraction): test and two-level operator images, respectively.

Fig. 18. Experiment C (character recognition): test, C1 one-level operator, C1 two-level operator, and C2 two-level operator images, from top to bottom and left to right, respectively.
Nina S. T. Hirata received the PhD degree in Computer Science from the University of São Paulo, Brazil, in 2000. She is a professor of computer science at the same university. Her current research interests include nonlinear image processing, machine learning applied to image operator design, multiple classifier systems, interactive image segmentation, and handwriting recognition.
Fig. 19. Experiment D (magazine page text segmentation): test, D1 one-level operator, D1 two-level operator, and D2 two-level operator images, from top to bottom and left to right, respectively.
Fig. 20. Experiment E (book page text segmentation): test and two-level operator images, respectively.
Fig. 21. Experiment F (Boolean noise filtering): test, ideal, one-level operator, and two-level operator images, from top to bottom and left to right, respectively. The images consist of simulated Boolean squares with Boolean noise, both uniformly distributed over the image domain. The squares have sizes that follow a normal distribution, while the noise components are subsets of the 3×3 square, with sizes varying uniformly from 2 to 5 pixels.
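For reference, the following Python sketch illustrates how test images matching the Fig. 21 description could be simulated. It is not the code used in the paper: the image size, the number of squares and noise components, and the mean and standard deviation of the square sizes are illustrative assumptions, since these values are not reported here.

# Minimal sketch (assumed parameters) of the Boolean model described in Fig. 21.
import numpy as np

rng = np.random.default_rng(0)

def simulate_boolean_image(shape=(256, 256), n_squares=30, n_noise=150,
                           size_mean=8.0, size_std=2.0):
    """Return (noisy, ideal) binary images following the Fig. 21 description.
    shape, n_squares, n_noise, size_mean and size_std are assumed values."""
    ideal = np.zeros(shape, dtype=np.uint8)

    # Boolean squares: side lengths drawn from a normal distribution
    # (rounded, at least 1), top-left corners uniform over the image domain.
    for _ in range(n_squares):
        side = max(1, int(round(rng.normal(size_mean, size_std))))
        side = min(side, min(shape) - 1)
        r = rng.integers(0, shape[0] - side)
        c = rng.integers(0, shape[1] - side)
        ideal[r:r + side, c:c + side] = 1

    # Boolean noise: each component is a random subset of a 3x3 window
    # containing 2 to 5 pixels, placed uniformly over the image domain.
    noisy = ideal.copy()
    for _ in range(n_noise):
        k = rng.integers(2, 6)                       # 2, 3, 4 or 5 pixels
        cells = rng.choice(9, size=k, replace=False)
        r = rng.integers(0, shape[0] - 3)
        c = rng.integers(0, shape[1] - 3)
        for cell in cells:
            noisy[r + cell // 3, c + cell % 3] = 1

    return noisy, ideal

noisy, ideal = simulate_boolean_image()

Under this reading, the noisy realization plays the role of the test input and the noise-free image the role of the ideal output shown in Fig. 21.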