FLoSS: Facility Location for Subspace Segmentation
Nevena Lazic
nevena@comm.utoronto.ca
Inmar Givoni
inmar@psi.utoronto.ca
Brendan Frey
frey@psi.utoronto.ca
Parham Aarabi
parham@ecf.utoronto.ca
University of Toronto, Dept. of Electrical and Computer Engineering
10 King's College Road, Toronto, ON, Canada, M5S 3G4
Abstract
Subspace segmentation is the task of segmenting data
lying on multiple linear subspaces. Its applications in
computer vision include motion segmentation in video,
structure-from-motion, and image clustering. In this work,
we describe a novel approach for subspace segmentation
that uses probabilistic inference via a message-passing al-
gorithm.
We cast the subspace segmentation problem as that of
choosing the best subset of linear subspaces from a set of
candidate subspaces constructed from the data. Under this
formulation, subspace segmentation reduces to facility lo-
cation, a well studied operational research problem. Ap-
proximate solutions to this NP-hard optimization problem
can be found by performing maximum-a-posteriori (MAP)
inference in a probabilistic graphical model. We describe
the graphical model and a message-passing inference algo-
rithm.
We demonstrate the performance of Facility Location for
Subspace Segmentation, or FLoSS, on synthetic data as well
as on 3D multi-body video motion segmentation from point
correspondences.
1. Introduction
Many statistical models used for data analysis in vision
assume that high-dimensional input data has an intrinsic
low-dimensional representation. Furthermore, many such
models assume the data can be well approximated as ly-
ing on a linear subspace. For instance, principal component
analysis (PCA) [13], independent component analysis [12],
factor analysis [10], and nonnegative matrix factorization
[16] are all highly popular methods that attempt to recover a
low dimensional linear representation of the data. Although
the linearity assumption is often inaccurate, it nevertheless
turns out to be a reasonable and useful approximation in
many cases [25,26,30]. Even in non-linear dimensionality
reduction, many methods assume that data is locally linear,
and can be represented as some configuration of local linear
subspaces [19,31].
In subspace segmentation, the underlying assumption is
that the data is composed of points lying on several distinct
linear subspaces, not necessarily of the same intrinsic di-
mension. The goal of subspace segmentation is to recover
the underlying subspaces and to assign the data points to
one subspace each. The toy data sets shown in Fig. 1 illustrate this idea. Thus, subspace segmentation is a more
flexible model compared to the single linear subspace rep-
resentation, but it still retains some of the computationally
favorable properties of linear subspace models.
Subspace segmentation has a variety of computer vision
applications. One example is clustering images of differ-
ent objects under varying illumination. It has been shown
in [11] that a set of images of a Lambertian object un-
der varying lighting conditions forms a convex polyhedral
cone in the image space, which is well-approximated by a
low dimensional subspace. As images of different objects
lie on different subspaces, subspace segmentation can be
used for clustering images. Another application is in 3D
multi-body video motion segmentation from point corre-
spondences. Given the image coordinates of several keypoints lying on a rigid object undergoing motion over $F$ video frames, it can be shown that the vectors (of length $2F$)
of stacked point coordinates lie on a linear subspace of di-
mension 2, 3 or 4 [24,29]. When there are several moving
objects in the video, with tracked keypoints on each object,
the motion segmentation task is to cluster these points - an-
other instance of subspace segmentation.
In this work, we describe a novel subspace segmentation
method called Facility Location for Subspace Segmentation
(FLoSS), where we formulate subspace segmentation as an
instance of the facility location problem - a classical NP-
hard problem in combinatorial optimization and operational research [17].

Figure 1. Examples of data lying on multiple linear subspaces

The facility location problem can be stated as
follows: given a set of customers, a set of potential facili-
ties that can be opened to serve customers, the cost of open-
ing each facility, and the distances between customers and
facilities, open a subset of facilities and assign customers
to one facility each, such that the sum of facility costs and
customer-facility distances is minimized.
FLoSS formulates subspace segmentation as facility lo-
cation by first constructing a large initial set of candidate
subspaces (or facilities), and assigning costs to them that
are based on their complexity (dimensionality). The candidate subspaces are initialized from the data by randomly selecting $D$-tuples of linearly independent points, where $D$ ranges from 2 up to the original data dimension. Each data $D$-tuple defines a linear subspace of dimension $(D-1)$.
Given the normal distances of data points to candidate sub-
spaces and the costs for utilizing subspaces, subspace seg-
mentation is framed as finding the optimal subset of sub-
spaces that best explains the data, i.e. the subset that mini-
mizes the subspace costs and the point-subspace distances -
an instance of facility location.
We find approximate facility location solutions by us-
ing a message-passing algorithm on a factor-graph repre-
sentation of the problem. The approach is closely related to
Affinity Propagation [8], an exemplar-based clustering al-
gorithm.
2. Previous work
There exist numerous notable subspace segmentation algorithms, with different underlying approaches to the problem. When the number of subspaces is unknown, a
sensible approach is to search for them one at a time, and
select the one that represents a large number of points well
at each pass. One such algorithm is random sample con-
sensus (RANSAC) [6,23,28], a generic algorithm for out-
lier detection. RANSAC fits a $(D-1)$-dimensional subspace by iteratively (1) constructing a basis from $D$ randomly sampled points, (2) computing the normal distance
from all points to this subspace, and (3) labeling those above
some distance threshold as outliers. This is repeated until a
specified number of inliers is reached, or a sufficient num-
ber of points have been sampled. Multiple subspaces are
found iteratively, by removing the inliers from the previ-
ous step and repeating. A similar idea - that of iteratively
searching for a subspace with the most inliers - is used by
Da Silva et al. [3]. They formulate this task as an uncon-
strained, but non-convex optimization problem, with im-
proved efficiency over RANSAC. Neither method provides
a direct way of estimating subspace dimensionalities. One
proposed solution is to start with the highest-dimensional
model, and recursively check each found solution for lower-
dimensional models [2]. An alternative is to simultaneously
apply the algorithm on multiple hypotheses and use model
selection [7,20].
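To make the sequential strategy concrete, here is a minimal NumPy sketch of RANSAC-style subspace fitting as described above; the function names, the fixed inlier threshold, and the treatment of subspaces as affine flats through the sampled points are our illustrative choices, not details taken from the cited papers.

```python
import numpy as np

def flat_from_points(P):
    """Orthonormal basis of the (D-1)-dim affine flat through the D rows of P."""
    base = P[0]
    Q, _ = np.linalg.qr((P[1:] - base).T)   # shape: (ambient_dim, D-1)
    return base, Q

def normal_dist2(X, base, Q):
    """Squared normal distance of each row of X to the flat (base, Q)."""
    Y = X - base
    return (Y ** 2).sum(1) - ((Y @ Q) ** 2).sum(1)

def sequential_ransac(X, D, thresh, k, n_trials=500, seed=0):
    """Fit k (D-1)-dimensional flats one at a time, removing the inliers
    found in each pass and repeating, as in sequential RANSAC."""
    rng = np.random.default_rng(seed)
    labels = -np.ones(len(X), dtype=int)    # -1 marks points never claimed
    active = np.arange(len(X))
    for s in range(k):
        if len(active) < D:
            break
        best_inl = np.zeros(len(active), dtype=bool)
        for _ in range(n_trials):
            idx = rng.choice(len(active), size=D, replace=False)
            base, Q = flat_from_points(X[active][idx])
            inl = normal_dist2(X[active], base, Q) < thresh
            if inl.sum() > best_inl.sum():
                best_inl = inl
        labels[active[best_inl]] = s
        active = active[~best_inl]
    return labels
```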
When the number of subspaces and their dimensionali-
ties are specified, it is more intuitive to determine all sub-
spaces at once. One approach is to iterate between assigning
points to their nearest subspaces, and re-estimating the sub-
space bases from the assigned points. k-subspaces [11], an
extension of the k-means algorithm, iterates between mak-
ing hard assignments of points to subspaces based on min-
imal point-subspace normal distance, and re-computing the
subspace bases using PCA. Mixture of pPCA (mpPCA) [22]
makes this process probabilistic by using latent variables to
indicate the assignment of each point to one of $k$ probabilistic PCA models. The model parameters and the prob-
ability distribution over the latent variables are estimated
iteratively, using the Expectation Maximization (EM) algo-
rithm [4]. Both methods can be sensitive to initialization
and local optima.
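For reference, a compact sketch of the k-subspaces alternation in the same NumPy style; we assume linear subspaces through the origin and a common dimension $d$ for simplicity, and all names are illustrative.

```python
import numpy as np

def k_subspaces(X, k, d, n_iters=50, seed=0):
    """Alternate between hard point-to-subspace assignments and PCA
    re-fits of the k subspace bases (each of dimension d). X is N x dim."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(X))
    for _ in range(n_iters):
        bases = []
        for s in range(k):
            pts = X[labels == s]
            if len(pts) < d:                 # degenerate cluster: re-seed it
                pts = X[rng.choice(len(X), size=d, replace=False)]
            _, _, Vt = np.linalg.svd(pts, full_matrices=False)
            bases.append(Vt[:d].T)           # top-d principal directions
        # Assign each point to the subspace with the smallest normal distance.
        dist2 = np.stack([(X**2).sum(1) - ((X @ B)**2).sum(1) for B in bases], 1)
        new_labels = dist2.argmin(1)
        if np.array_equal(new_labels, labels):
            break                            # reached a fixed point
        labels = new_labels
    return labels
```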
Another possible approach, when the subspace number
and dimensionality are available, is to construct the solution
algebraically. Generalized PCA (GPCA) [27] represents a union of $k$ subspaces embedded in $\mathbb{R}^D$ by a set of homogeneous polynomials of degree $k$ in $D$ variables. The polynomial coefficients can be estimated linearly from the data. The complexity of GPCA scales as $k^D$, and the number of data points needed to estimate the polynomials is exponential in $k$; hence, it is only practical for a small number of low-
dimensional subspaces. When the number of subspaces is
unavailable, the authors determine it by estimating the rank
of a matrix. A recursive approach similar to [2] can be used
when subspace dimensionalities are unknown.
Subspace separation (SS) [14] is also an algebraic ap-
proach. It relies on the observation that when the sub-
spaces are linearly independent and noise-free, it is possi-
ble to compute a binary data interaction matrix, indicating
whether two points lie on the same subspace or not. Noise
is addressed by using model selection to decide whether to
merge subspaces.
Overall, none of the methods provide an effective way of
estimating the number of subspaces and their dimensional-
ities. However, there exist applications in which subspace
structures are known beforehand, the most notable being
motion segmentation. In motion segmentation, the subspace
dimensionalities are 2, 3, or 4, and the possible challenges
posed by the data are well understood. Indeed, many sub-
space segmentation methods were actually designed as mo-
tion segmentation algorithms [14,21,29].
The multi-stage learning (MSL) algorithm of [21] for
motion segmentation refines the subspace segmentation re-
sults of SS using three stages of mpPCA of increasing com-
plexity, each corresponding to a different type of motion.
The simplest mpPCA model is initialized using SS, and the
results at each stage are used to initialize the next stage.
In this way, MSL accounts for the cases where SS fails,
namely, when the subspaces are co-dependent. This can oc-
cur frequently in motion data, especially when the motion
of the points is in part due to a moving camera.
Another multi-body motion segmentation method is lo-
cal subspace affinity (LSA) [29]. It is an algebraic method
that first projects points onto the first $R$ principal components and then onto a hypersphere $S^{R-1}$. A local subspace is fit around each point and its $k$ nearest neighbors. The
points are then clustered using spectral clustering [18] with
pairwise similarities computed using angles between the lo-
cal subspaces. Misclassification can occur near the inter-
section of two subspaces (as the nearest neighbors lie on
different subspaces), or when the nearest neighbors do not
span the selected subspace. Model selection is used to se-
lect appropriate subspace dimensionality, which is 2, 3 or
4.

In comparison to previous methods, FLoSS is essentially an iterative method that constructs subspaces from randomly sampled $D$-tuples of data, similarly to RANSAC. However, it considers all constructed subspaces (of possibly different dimensionalities) simultaneously, and selects all $k$ subspaces at once. Its complexity does not depend on the dimensionality of the original data; however, it increases with the total number of subspaces provided at initialization. FLoSS requires the subspace dimensionalities or their range as inputs, and discovers the number of subspaces automatically.
3. A graphical model for subspace segmentation
3.1. Problem setup
Given $N$ data points, we begin by creating a large set of $M$ candidate subspaces by randomly drawing sets of $D \ll N$ linearly independent points for different values of $D$, where each set defines a $(D-1)$-dimensional candidate subspace. We evaluate $d_{nm}$, the squared normal distance from point $n$ to candidate subspace $m$, for $m = 1, \ldots, M$. In addition, we associate a cost $c_m$ with each subspace $m$, which is set to be the sum of all pairwise distances between the points defining the subspace. The purpose of assigning costs is to prevent overfitting the data with very high-dimensional subspaces or with too many subspaces. Using the sum of all pairwise distances is a sensible way of setting costs, since it assigns lower costs to lower-dimensional subspaces and to subspaces generated by points that are close to one another.
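As an illustration, the candidate pool and the inputs to the optimization (the distances $d_{nm}$ and the costs $c_m$) could be built along the following lines; this is a sketch under our reading of the construction, with candidate subspaces treated as affine flats through the sampled points, and all names are ours.

```python
import numpy as np
from itertools import combinations

def build_candidates(X, dims, n_per_dim, seed=0):
    """Sample candidate subspaces: for each D in dims, draw n_per_dim
    random D-tuples of points; each tuple spans a (D-1)-dimensional flat.
    Returns d (N x M squared normal distances) and costs c (length M)."""
    rng = np.random.default_rng(seed)
    N = len(X)
    d_cols, costs = [], []
    for D in dims:
        for _ in range(n_per_dim):
            P = X[rng.choice(N, size=D, replace=False)]
            base = P[0]
            Q, _ = np.linalg.qr((P[1:] - base).T)      # orthonormal basis
            Y = X - base
            d_cols.append((Y**2).sum(1) - ((Y @ Q)**2).sum(1))
            # Cost c_m: sum of pairwise distances between the defining points.
            costs.append(sum(np.linalg.norm(P[i] - P[j])
                             for i, j in combinations(range(D), 2)))
    return np.stack(d_cols, 1), np.array(costs)
```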
Having generated the set of candidate subspaces, we would now like to select a subset $\mathcal{M} \subseteq \{1, \ldots, M\}$ of subspaces and associate each point with one subspace in $\mathcal{M}$, such that the sum of the normal distances from points to their assigned subspaces plus the total cost of the subspaces in $\mathcal{M}$ is minimized. This optimization problem can be seen as an instance of facility location (FL), a well studied NP-hard problem. In FL, given a set of potential facilities (subspaces, in our case), the goal is to select an optimal subset of facilities and assign customers (data points) to one facility each, such that the sum of facility costs and the distances between customers and their assigned facilities is minimized.

Let $x_{nm}$ be a binary indicator variable, equal to 1 if point $n$ is assigned to subspace $m$ and 0 otherwise, and let $\mathbf{x} = \{x_{11}, \ldots, x_{NM}\}$ be the vector of all the $x_{nm}$ variables. The FL optimization problem can be stated as:

$$\min_{\mathbf{x}} \; \sum_{m} \sum_{n} d_{nm} x_{nm} + \sum_{m \in \mathcal{M}} c_m \qquad (1)$$

subject to

$$\sum_{m} x_{nm} = 1 \quad \forall n \qquad (2)$$

$$\mathcal{M} = \{\, m \mid \sum_{n} x_{nm} > 0 \,\} \qquad (3)$$

$$x_{nm} \in \{0, 1\} \qquad (4)$$

Constraint (2) ensures that each point is assigned to exactly one subspace.
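For a given hard assignment, objective (1) under constraints (2)-(4) reduces to a simple sum; the following sketch (illustrative names) evaluates it.

```python
import numpy as np

def fl_objective(assign, d, c):
    """Objective (1): assign[n] is the subspace serving point n, so
    constraints (2) and (4) hold by construction; d is N x M, c is M."""
    data_term = d[np.arange(len(assign)), assign].sum()  # sum_nm d_nm x_nm
    open_cost = c[np.unique(assign)].sum()   # c_m over opened subspaces (3)
    return data_term + open_cost
```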
3.2. Factor-graph representation and the max-sum algorithm
The FL problem can be described in terms of a probabilistic graphical model where each $x_{nm}$ is treated as a hidden binary random variable. The graphical model for the
problem is shown in Fig. 2, where we have used a factor-
graph notation [15]. Recall that a factor-graph is a bipartite
graph consisting of variable nodes and factor nodes. The
factor nodes evaluate potential functions over the variable
nodes they are connected to. The probability distribution
described by the graph is proportional to the product of the
factor potentials. The potential functions we associate with
the graphical model incorporate facility costs and customer-
facility distances, and enforce the constraint on the $x_{nm}$ variables.
Given the following functions,

$$h_{nm}(x_{nm}) = -d_{nm} x_{nm} \qquad (5)$$

$$f_m(x_{1m}, \ldots, x_{Nm}) = \begin{cases} -c_m, & \sum_n x_{nm} > 0 \\ 0, & \text{otherwise,} \end{cases} \qquad (6)$$

$$g_n(x_{n1}, \ldots, x_{nM}) = \begin{cases} 0, & \sum_m x_{nm} = 1 \\ -\infty, & \text{otherwise,} \end{cases} \qquad (7)$$

the joint probability distribution can be written as

$$p(\mathbf{x}) \propto \prod_{m,n} \exp\big(h_{nm}(x_{nm})\big) \times \prod_{m} \exp\big(f_m(x_{1m}, \ldots, x_{Nm})\big) \times \prod_{n} \exp\big(g_n(x_{n1}, \ldots, x_{nM})\big) \qquad (8)$$
Figure 2. Factor graph representation of $p(\mathbf{x})$
The functions $h_{nm}$ and $f_m$ account for distances and costs, respectively, while the functions $g_n$ enforce the constraint that each point is assigned to exactly one subspace. Maximizing $\log p(\mathbf{x})$ corresponds to the FL optimization problem stated in Equation 1. Finding a solution to the
FL problem is carried out by finding maximum-a-posteriori (MAP) estimates for the $x_{nm}$ using the max-product (belief propagation) algorithm [15]. Max-product is a local mes-
sage passing algorithm known to converge to the MAP val-
ues of the variables on cycle-free graphs, and empirically
observed to give good results on graphs with cycles, such as the one described here. For notational convenience as well as
computational stability, we use the log-domain version of
the algorithm, max-sum. The general form of the max-sum
local messages between a function $f$ and a variable $x$ is [1]:

$$\mu_{f \to x}(x) = \max_{x_1, \ldots, x_K} \Big[ f(x, x_1, \ldots, x_K) + \sum_{x_i \in \mathrm{ne}(f) \setminus x} \mu_{x_i \to f}(x_i) \Big] \qquad (11)–(12)$$

$$\mu_{x \to f}(x) = \sum_{f_l \in \mathrm{ne}(x) \setminus f} \mu_{f_l \to x}(x) \qquad (13)$$
The messages are passed iteratively from functions to variables and from variables to functions, and the algorithm is said to converge once the message values no longer change. The final variable assignment is based on the sum of all incoming messages to a variable:

$$x^* = \arg\max_{x} \sum_{f_l \in \mathrm{ne}(x)} \mu_{f_l \to x}(x) \qquad (14)$$
When the messages are functions of binary random variables (as in the factor graph in Fig. 2), they are of length two. In practice, however, it suffices to pass only the difference between the two values, $\mu = \mu(1) - \mu(0)$. For the graph of Fig. 2, it can be shown that these messages are:

$$\mu_{h_{nm} \to x_{nm}} = -d_{nm} \qquad (15)$$

$$\mu_{x_{nm} \to g_n} = \mu_{f_m \to x_{nm}} - d_{nm} \qquad (16)$$

$$\mu_{x_{nm} \to f_m} = \mu_{g_n \to x_{nm}} - d_{nm} \qquad (17)$$

$$\mu_{g_n \to x_{nm}} = -\max_{l \neq m} \mu_{x_{nl} \to g_n} \qquad (18)$$

$$\mu_{f_m \to x_{nm}} = \min\Big[0,\; -c_m + \sum_{l \neq n} \max(0, \mu_{x_{lm} \to f_m})\Big] \qquad (19)$$

To find the MAP assignment, we only need to compute the messages received by each $x_{nm}$. By substituting in, we can reduce the message updates to only these two types:

$$\eta_{nm} \equiv \mu_{g_n \to x_{nm}} = -\max_{l \neq m}\,(-d_{nl} + \alpha_{nl}) \qquad (20)$$

$$\alpha_{nm} \equiv \mu_{f_m \to x_{nm}} = \min\Big[0,\; -c_m + \sum_{l \neq n} \max(0, \eta_{lm} - d_{lm})\Big] \qquad (21)$$

We also note that all the $\eta_{nm}$ and $\alpha_{nm}$ message updates can be done in parallel, i.e. without looping over $m$ and $n$. At convergence, we calculate the variable assignments as

$$x^*_{nm} = \begin{cases} 1, & \eta_{nm} + \alpha_{nm} - d_{nm} > 0 \\ 0, & \text{otherwise.} \end{cases} \qquad (22)$$
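A vectorized NumPy sketch of updates (20)-(22) is given below; the damping of successive messages is a standard stabilizer for max-sum on loopy graphs and is our addition, not part of the derivation above.

```python
import numpy as np

def floss_messages(d, c, n_iters=200, damping=0.5):
    """Parallel eta/alpha updates of Eqs. (20)-(21) and the assignment
    rule (22). d is the N x M matrix of squared normal distances and
    c is the length-M vector of subspace costs."""
    N, M = d.shape
    eta = np.zeros((N, M))
    alpha = np.zeros((N, M))
    for _ in range(n_iters):
        # eta_nm = -max_{l != m} (alpha_nl - d_nl): for each row, the
        # excluded maximum equals the row maximum everywhere except at
        # the argmax column, where it is the second-largest value.
        v = alpha - d
        order = np.argsort(v, axis=1)
        first, second = order[:, -1], order[:, -2]
        eta_new = -v[np.arange(N), first][:, None].repeat(M, axis=1)
        eta_new[np.arange(N), first] = -v[np.arange(N), second]
        # alpha_nm = min(0, -c_m + sum_{l != n} max(0, eta_lm - d_lm))
        pos = np.maximum(0.0, eta_new - d)
        alpha_new = np.minimum(0.0, -c + pos.sum(0)[None, :] - pos)
        eta = damping * eta + (1.0 - damping) * eta_new
        alpha = damping * alpha + (1.0 - damping) * alpha_new
    x = (eta + alpha - d) > 0   # Eq. (22), per-variable indicator
    # First positive indicator per point (ties/empty rows need care).
    return x.argmax(1), x
```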
3.3. Relationship to affinity propagation
Affinity propagation (AP) [8] is an exemplar-based clus-
tering algorithm that selects data exemplars using local mes-
sage passing. It has been shown in [9] that affinity propaga-
tion can be represented using a factor graph that is similar
to Fig. 2, having binary random variables indicating the
membership of points to clusters. The differences between AP and FL are that (1) in AP, the available cluster centers are data points (as opposed to general facilities), and (2) there is an additional constraint: if a point $i$ chooses $j$ as its exemplar, $j$ must also choose itself as its exemplar. A somewhat different derivation of facility location as an instance of affinity propagation has been applied in the past to computational biology problems [5].
4. Experiments
We evaluate the performance of FLoSS on both syn-
thetic and real data sets. We use synthetic data to illustrate
and compare the performance of the subspace segmentation
methods FLoSS, RANSAC, mpPCA and GPCA on differ-
ent types of data. We then apply FLoSS to rigid body mo-
tion segmentation from video, and compare it both to the
above mentioned subspace segmentation methods, as well
as to the MSL and LSA motion segmentation algorithms.
We note that the number of subspaces $k$ is not an input to FLoSS. This is due to the FL formulation, where the number of subspaces is automatically determined as a trade-off between distances and costs. The value of $k$ can be controlled indirectly by changing the value of the costs $c_m$; in general, lower costs will result in selecting a larger number of subspaces from the set of potential subspaces. To compare FLoSS to methods that require the number of underlying subspaces $k$ to be specified, we use the following procedure to adjust the costs so that FLoSS finds exactly $k$ subspaces (a sketch of the merge step follows the list):

- If $k_{FL} < k$, change the costs to $c'_m = 0.75\,c_m$ and run again.
- If $k_{FL} > k$, iteratively merge subspaces until $k_{FL} = k$. To decide which subspaces to merge, we form a $k_{FL} \times k_{FL}$ matrix $F$ whose entry $F_{ij}$ is the average normal distance of the points assigned to subspace $i$ from subspace $j$: $F_{ij} = \sum_n x_{ni} d_{nj} / \sum_n x_{ni}$. We find the smallest $F_{ij}$ with $i \neq j$ and merge the corresponding subspaces.
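The merge step can be implemented directly from the definition of $F$; in the sketch below, `labels` and `d` are the FLoSS point assignments and the distance matrix from earlier, and the names are ours.

```python
import numpy as np

def merge_to_k(labels, d, k):
    """Merge FLoSS subspaces until only k remain: F[a, b] is the average
    normal distance of points assigned to subspace subs[a] from subspace
    subs[b]; repeatedly merge the smallest off-diagonal entry."""
    labels = labels.copy()
    subs = list(np.unique(labels))
    while len(subs) > k:
        F = np.full((len(subs), len(subs)), np.inf)
        for a, i in enumerate(subs):
            pts = labels == i
            for b, j in enumerate(subs):
                if a != b:
                    F[a, b] = d[pts, j].mean()
        a, b = np.unravel_index(np.argmin(F), F.shape)
        labels[labels == subs[a]] = subs[b]   # fold subspace i into subspace j
        subs.pop(a)
    return labels
```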
4.1. Synthetic data
We first investigate the case of subspaces of the same
dimensionality. We generate several synthetic data sets by
sampling data points from planes in $\mathbb{R}^3$, and adding orthogonal Gaussian noise with variance equal to 5% of the data variance. For all algorithms, we specify the number of subspaces $k$ and their dimensionality. The segmentation results are
shown in Fig. 3, where the colors indicate subspace mem-
bership.
In general, RANSAC does not give very good results,
and its performance mainly depends on the number of it-
erations. mpPCA performs well on most data sets. How-
ever, as it is geared towards modeling mixtures of linear
segments rather than infinite subspaces, it may assign two
disjoint pieces of the same subspace to different mixture
components, as illustrated in the top row of Fig. 3. In addi-
tion, mpPCA can have difficulties distinguishing linear seg-
ments that overlap close to their means, as is the case for the
data shown in the middle row of Fig. 3. Although GPCA
gives good results on a variety of subspace configurations,
its performance degrades as the number of subspaces $k$ increases, since the number of data points needed to estimate the subspaces is exponential in $k$. This explains its poor perfor-
mance on the 4-plane data set in the bottom row of Fig. 3.
GPCA can also be susceptible to noise; in fact, as noted in
[27], it is suboptimal compared to the other algorithms in
the Gaussian noise case when k > 1.
FLoSS gives very good results on the example configu-
rations. It treats subspaces as infinite, and its performance is
not affected by disconnected segments of the same subspace
or the point of intersection of several subspaces. Increasing
the number of subspaces $k$ does not degrade its performance either, although higher values of $k$ may require using more
facilities at initialization.
We illustrate a case where FLoSS may fail using a more
challenging data set, shown in Fig. 4. The data set contains
a plane and two coplanar lines, at two levels of noise: 1% and 5% of the data variance. We use a fixed dimensionality of 2 for mpPCA, RANSAC and GPCA¹, and initialize FLoSS with both 1D and 2D subspaces.
On this data set, only mpPCA and GPCA correctly iden-
tify the subspaces, and only in the low-noise case. FLoSS,
on the other hand, groups the two lines into one plane
at both noise levels. In general, FLoSS prefers lower-
dimensional subspaces through lower costs. However, having several densely sampled $D$-dimensional subspaces embedded in a $(D+1)$-dimensional subspace may offset the cost difference, causing FL to choose the $(D+1)$-dimensional subspace. As the structure of the subspaces is un-
known in general, it is difficult to set facility costs so as to
prevent this; a possible remedy could be the recursive ap-
proach of [2].
¹Although it is possible to specify different dimensionalities for GPCA, we found that a fixed dimensionality gives better results using the code available at http://perception.csl.uiuc.edu/gpca/

Figure 3. Comparison of different algorithms on data sets consisting of planes: (a) RANSAC, (b) mpPCA, (c) GPCA, and (d) FLoSS

Figure 4. Mixed-dimensionality subspaces, two noise levels: $\sigma^2 = 0.01$ (top row) and $\sigma^2 = 0.05$ (bottom row). (a) RANSAC, (b) mpPCA, (c) GPCA, and (d) FLoSS

4.2. 3D motion segmentation

The 3D motion segmentation of points lying on rigidly moving objects can be shown to correspond to segmentation of linear subspaces [24, 29]. Briefly, let $\{w_{fp} \in \mathbb{R}^2\}$, for $f = 1, \ldots, F$ and $p = 1, \ldots, P$, be the image projections of $P$ 3D points $\{X_p \in \mathbb{P}^3\}_{p=1,\ldots,P}$ lying on a rigidly moving object, over $F$ frames of a rigidly moving camera. Under the affine projection model, $w_{fp} = A_f X_p$, where $A_f \in \mathbb{R}^{2 \times 4}$ is the affine camera matrix at frame $f$. Let $W \in \mathbb{R}^{2F \times P}$ be the matrix whose columns are the 2D point trajectories. Then,

$$W_{2F \times P} = \begin{bmatrix} A_1 \\ \vdots \\ A_F \end{bmatrix}_{2F \times 4} \begin{bmatrix} X_1 & \cdots & X_P \end{bmatrix}_{4 \times P} \qquad (23)$$

Therefore, the trajectories are embedded in a subspace of dimension $\mathrm{rank}(W)$ that can be either 2, 3 or 4, depending on the type of motion. When the points lie on multiple moving objects, the trajectories lie on multiple linear subspaces of $\mathbb{R}^{2F}$; this observation is the basis of most rigid-body 3D motion segmentation algorithms.
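The rank bound in (23) is easy to verify numerically; the following toy check with random affine cameras and a single rigid point cloud is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
F, P = 10, 30                                   # frames, tracked points
Xh = np.vstack([rng.normal(size=(3, P)),        # 3D points, homogeneous coords
                np.ones((1, P))])
A = rng.normal(size=(F, 2, 4))                  # one affine camera per frame
W = np.vstack([A[f] @ Xh for f in range(F)])    # 2F x P trajectory matrix
print(W.shape, np.linalg.matrix_rank(W))        # rank is at most 4, per (23)
```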
Figure 5. Example frames with keypoints (left) and trajectories
(right) of checkerboard, traffic, and articulated motion sequences
from the Hopkins155 database. The keypoint colors denote hand-labeled objects.
A benchmark database for multi-body motion seg-
mentation from point correspondences is the Hopkins155
database [24]. The database contains 50 video sequences of indoor and outdoor scenes, each containing two or three motions. Additionally, each of the 35 three-motion videos is split into $\binom{3}{2} = 3$ groups containing only two of the three motions, resulting in a total of 155 sequences. The data contains sub-
spaces of different dimensionalities. The three video types
that make up the database are:
- Checkerboard: 104 video sequences with 2 or 3 checkerboard-pattern objects. The camera undergoes rotation, translation, or both.
- Traffic: 38 sequences of outdoor traffic scenes, taken by a moving hand-held camera.
- Articulated and non-rigid sequences: 13 video sequences of motions constrained by joints, and non-rigid motions.
Example frames from the three types of video sequences
are shown in Fig. 5.
We used the Hopkins155 database to evaluate the mo-
tion segmentation performance of the subspace segmenta-
tion models specified in Section 4.1, as well as that of two
motion segmentation algorithms: LSA and MSL. The num-
ber of objects in each sequence was specified for all algo-
rithms.
Except for FLoSS and mpPCA, the reported results were obtained from [24], where the following settings were used: GPCA was run on the first 5 principal components of the data matrix $W$, and LSA was run on the first $k$ principal components, where $k$ was the number of objects present. For RANSAC, the dimension of all subspaces was assumed to be 4; the algorithm was run 1000 times on each sequence, and the average results were recorded. mpPCA was run on the first 12 principal components of $W$, and the subspace dimensionality was set to 4. FLoSS was also run on the first 12 principal components of $W$, and initialized with random subsets of 3, 4 and 5 points (corresponding to subspaces of dimension 2, 3, and 4).
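A sketch of this preprocessing step, under our reading (the exact centering and normalization used in the experiments are not spelled out):

```python
import numpy as np

def project_pcs(W, r=12):
    """Project the columns (point trajectories) of the 2F x P matrix W
    onto its first r principal components before running FLoSS."""
    Wc = W - W.mean(axis=1, keepdims=True)       # center each coordinate
    U, _, _ = np.linalg.svd(Wc, full_matrices=False)
    return U[:, :r].T @ Wc                       # r x P reduced trajectories
```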
The segmentation errors, calculated as the percentage of
misclassified points, are summarized in Tables 1 and 2. We note that no single method outperforms all the others on all data
sets. While GPCA achieves very good results for the 2 ob-
jects data, it performs poorly for the 3 objects data. As for
the motion segmentation algorithms, LSA performs well,
although inconsistently; while it is one of the best meth-
ods for the checkerboard sequences, it has the worst per-
formance on traffic. MSL also performs well overall, no-
tably better than mpPCA. Recall that MSL consists of three
stages of mpPCA, initialized using the subspace separation
algorithm, and adapted to different types of motion includ-
ing degenerate. The large gap in the performance of the
two methods is an indication of the sensitivity of mpPCA to
initialization and variable subspace dimensionality.
FLoSS outperforms all other methods on the traffic se-
quences, and achieves comparable results on the checker-
board and articulated motion sequences. The FLoSS er-
ror median is typically low; however, some large errors do
occur, most frequently as a consequence of choosing the
wrong subspace dimensionality. This is illustrated in Fig. 6, which shows the first 3 principal components of the data corresponding to the checkerboard sequence² shown in Fig.
5. Here, instead of a higher-dimensional subspace, FLoSS
chooses two lower-dimensional subspaces embedded in it.
GPCA and LSA correctly group the two embedded sub-
spaces. On the other hand, FLoSS outperforms other meth-
ods on data that contains two disjoint parts of the same sub-
space, such as the data shown in Fig. 7, corresponding to
the traffic sequence³ shown in Fig. 5. In this case, LSA fails
due to the non-local structure, and GPCA fails because very
few points lie on two of the three groups. Such cases oc-
²The sequence 1RT2RTCRT_B
³The sequence cars2_07
cur more frequently in traffic data when a large number of
keypoints are detected on disjoint pieces of the background
(due to, for example, trees and grass), in contrast to only
a few keypoints per car. In comparison to the other methods that are not specific to motion segmentation (RANSAC, mpPCA, and GPCA), FLoSS is either better (the traffic and articulated motion data for 3 objects) or performs very close to the best method (GPCA for the 2-object checkerboard and articulated motion data, mpPCA for the 3-object checkerboard data).
Figure 6. Checkerboard sequence, first 3 principal components. (a) Ground truth, (b) FLoSS, (c) GPCA, and (d) LSA
Figure 7. Traffic sequence, first 3 principal components. (a) Ground truth, (b) FLoSS, (c) GPCA, and (d) LSA
5. Conclusions and future work
We described a new subspace segmentation method
that discovers linear subspaces in data using a message
passing algorithm. We demonstrated its advantages over
other methods on synthetic geometrical data, and evaluated
its performance on multi-body motion segmentation from
video.
The presented framework for subspace segmentation
suggests numerous future work directions. We have de-
scribed a way of finding an approximate facility location
solution using the max-sum algorithm, and formulated sub-
space segmentation as an instance of facility location. The
same approach could be applied to any other task that can
be formulated as facility location, with different distances
and costs. In addition, it is possible to adopt alternatives to the suggested method for discovering a candidate set of subspaces, such as an iterative refinement procedure that re-samples candidate subspaces using, for instance, PCA on the points assigned to each subspace.
References
[1] C. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
                        RANSAC   mpPCA   GPCA    FLoSS   LSA     MSL
Checkerboard  Average     6.52     9.89    6.09    7.70    2.57    4.46
              Median      1.75     2.49    1.03    1.23    0.27    0.00
Traffic       Average     2.55    21.41    1.41    0.14    5.43    2.23
              Median      0.21    17.61    0.00    0.00    1.48    0.00
Articulated   Average     7.25    25.13    2.88    4.69    4.10    7.23
              Median      2.64    19.44    0.00    1.30    1.22    0.00
Table 1. Motion segmentation percent error, 2 objects

                        RANSAC   mpPCA   GPCA    FLoSS   LSA     MSL
Checkerboard  Average    25.7     15.44   31.95   16.45    5.80   10.38
              Median     26.01    12.71   32.93   16.79    1.77    4.61
Traffic       Average    12.83    37.02   19.83    0.29   25.07    1.80
              Median     11.45    30.89   19.55    0.00   23.79    0.00
Articulated   Average    21.38    53.12   16.85    8.51    7.25    2.71
              Median     21.38    53.12   16.85    8.51    7.25    2.71
Table 2. Motion segmentation percent error, 3 objects
[2] O. Chum, T. Werner, and J. Matas. Two-view geometry estimation unaffected by a dominant plane. CVPR, 2005.
[3] N. da Silva and J. Costeira. Subspace segmentation with outliers: A Grassmannian approach to the maximum consensus subspace. CVPR, 1:1–6, 2008.
[4] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1–38, 1977.
[5] D. Dueck, B. Frey, N. Jojic, V. Jojic, G. Giaever, A. Emili, G. Musso, and R. Hegele. Constructing treatment portfolios using affinity propagation. Research in Computational Molecular Biology (RECOMB), 4955:360–371, 2008.
[6] M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.
[7] D. Forsyth, J. Haddon, and S. Ioffe. The joy of sampling. IJCV, 41:109–134, 2001.
[8] B. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315(5814):972–976, 2007.
[9] I. Givoni and B. Frey. A binary variable model for affinity propagation. Neural Computation, 21:1–12, 2009.
[10] R. Gorsuch. Factor Analysis. Lawrence Erlbaum, Hillsdale, NJ, 1983.
[11] J. Ho, M.-H. Yang, J. Lim, K.-C. Lee, and D. Kriegman. Clustering appearances of objects under varying illumination conditions. CVPR, 1:11–18, 2003.
[12] A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. J. Wiley, New York, 2001.
[13] I. Jolliffe. Principal Component Analysis. Springer Series in Statistics, Berlin, 1986.
[14] K. Kanatani. Motion segmentation by subspace separation and model selection. In Proc. 8th ICCV, pages 586–591, 2001.
[15] F. Kschischang, B. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2):498–519, 2001.
[16] D. Lee and H. Seung. Algorithms for non-negative matrix factorization. In NIPS, pages 556–562, 2000.
[17] M. Mahdian, Y. Ye, and J. Zhang. Improved approximation algorithms for metric facility location problems. In Proc. of the 5th Int'l Workshop on Approximation Algorithms for Combinatorial Optimization, pages 229–242, 2002.
[18] A. Ng, Y. Weiss, and M. Jordan. On spectral clustering: Analysis and an algorithm. In NIPS 14, pages 849–856. MIT Press, 2001.
[19] S. Roweis and L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.
[20] K. Schindler and D. Suter. Two-view multibody structure-and-motion with outliers. CVPR, 2005.
[21] Y. Sugaya and K. Kanatani. Geometric structure of degeneracy for multi-body motion segmentation. Workshop on Statistical Methods in Video Processing, 2004.
[22] M. Tipping and C. Bishop. Mixtures of probabilistic principal component analyzers. Neural Computation, 11(2):443–482, 1999.
[23] P. Torr. Geometric motion segmentation and model selection. Phil. Trans. Royal Society of London, 356:1321–1340, 1998.
[24] R. Tron and R. Vidal. A benchmark for the comparison of 3-D motion segmentation algorithms. CVPR, 1:1–8, 2007.
[25] M. Turk and A. Pentland. Face recognition using eigenfaces. CVPR, 1:586–591, 1991.
[26] R. Urtasun, D. Fleet, and P. Fua. Temporal motion models for monocular and multiview 3D human body tracking. Computer Vision and Image Understanding, 104(2):157–177, 2006.
[27] R. Vidal, Y. Ma, and S. Sastry. Generalized principal component analysis (GPCA). IEEE Trans. PAMI, 27(12):1945–1959, 2005.
[28] A. Yang, S. Rao, and Y. Ma. Robust statistical estimation and segmentation of multiple subspaces. In CVPR Workshop on 25 Years of RANSAC, 2006.
[29] J. Yan and M. Pollefeys. A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and non-degenerate. ECCV, 3954:94–106, 2006.
[30] J. Zhang, Y. Yan, and M. Lades. Face recognition: Eigenface, elastic matching, and neural nets. Proceedings of the IEEE, 85(9):1423–1435, 1997.
[31] Z. Zhang. Principal manifolds and nonlinear dimension reduction via local tangent space alignment. SIAM Journal of Scientific Computing, 26:313–338, 2004.