Extraction of Planar Features from Swissranger SR-3000 Range Images by a Clustering Method Using Normalized Cuts

Authors: GuruPrasad M. Hegde and Cang Ye, Senior Member, IEEE
Abstract

This paper describes a new approach to extract planar features from 3D range data captured by a range imaging sensor-the SwissRanger SR-3000. The focus of this work is to segment vertical and horizontal planes from range images of indoor environments. The method first enhances a range image by using the surface normal information. It then partitions the Normal Enhanced Range Images (NERI) into a number of segments using the Normalized-Cuts (N-Cuts) algorithm. A least-square plane is fit to each segment and the fitting error is used to determine if the segment is planar or not. From the resulting planar segments, each vertical or horizontal segment is labeled based on the normal of its least-square plane. A pair of vertical or horizontal segments is merged if they are neighbors. Through this region growing process, the vertical and horizontal planes are extracted from the range data. The proposed method has a myriad of applications in navigating mobile robots in indoor environments.
I. INTRODUCTION
Indoor environments commonly consist of regular structures such as stairways, hallways, and doorways. To operate efficiently in such conditions it is important for a mobile robot to identify these structures and deal with them. For instance, a tracked robot may use the information to traverse steps and stairways. Extracting and recognizing these structures is also useful in building a symbolic map of the environment, and a robot may use these structures as landmarks for navigation and localization. Since these indoor structures are often constituted by planar surfaces, efficient planar feature extraction becomes an essential capability for the robot. A pattern recognition algorithm built on the plane extraction method may allow the robot to group the planar surfaces into structures and identify them based on their geometric constituents. For instance, a stairway can be characterized by an occurrence of alternating horizontal (tread) and vertical (riser) planes, and a floor can be characterized by a large horizontal plane. Also,
extracting planar segments is an important problem in
range data processing, and it serves a number of purposes.
First, the information about where two planes intersect in
3D space can be used to extract prominent linear features
such as corners in a room. These features are important for
registering multiple scans of range data or to register range
data with 2D images.
Manuscript received March 1, 2009. This work was supported in part by NASA and the Arkansas Space Grant Consortium under grant UALR18800, by the NASA EPSCoR RID Award, and by a matching fund from the Arkansas Science and Technology Authority. C. Ye is with the Department of Applied Science, University of Arkansas at Little Rock, Little Rock, AR 72204, USA (phone: 501-683-7284; fax: 501-569-8020; e-mail: cxye@ualr.edu). G. Hegde is with the same department (e-mail: gmhegde@ualr.edu).

Researchers have addressed the problem of planar feature extraction from range data either in the original input domain (3D point cloud) [1,2,3,4] or by representing the range data as an image [5,6]. Venable and Uijt de Haag
[3] propose a so-called histogramming method for planar surface extraction for the SR-3000. The method first divides a range image equally into a number of sub-images. It then fits a least-square plane to the set of 3D data points belonging to each sub-image, and the plane with the smallest fitting error is chosen as a candidate planar feature. A histogram of the distances d from the remaining data points to the candidate plane is then computed. Data points closely located around d=0 and d=D in the histogram are classified as points in the candidate plane and points in a parallel plane, respectively. The advantage of the method is its real-time performance. The limitation is that it cannot be applied to a scenario where planes have multiple orientations (e.g., perpendicular planes). In addition, the set of data points in a sub-image with the minimum fitting error does not necessarily form a planar surface. Stamos and Allen’s method [4] identifies planar structures in the 3D range data of a precision laser scanner by dividing the data into k×k patches and merging the planar patches based on plane-fitting statistics. A patch is classified as locally planar if the plane-fitting error is below a threshold, and two locally planar patches are considered to be in the same planar surface if they have similar orientations and are close in 3D space. The plane-fitting based classification method is sensitive to the threshold value. In addition, it is not easy to determine an appropriate patch size, which involves a trade-off between computational cost and the granularity of data segmentation.
In this work we use the SwissRanger SR-3000 imaging sensor [7,8], as we are investigating the plane extraction problem for the possible application of navigating a small mobile robot in indoor environments. In this case the SR-3000 is advantageous over a LADAR: it has a much higher data throughput (25,344 points per frame at up to 50 frames per second) and is much smaller in size (50×48×65 mm³). The SR-3000 also works well in featureless environments, which is a big advantage over a stereovision system. However, the SR-3000’s sensing technology is nascent and its range data has relatively large measurement errors (much bigger than those of a LADAR [9]) due to random noise (e.g., thermal noise, photon shot noise) and environmental factors (e.g., surface reflectivity). Previous research efforts [10,11] have demonstrated that a proper calibration process may reduce the errors in the SR-3000’s range data to a certain extent. However, it cannot eliminate the errors induced by random noise. In [12] the authors of this paper developed a Singular Value Decomposition (SVD) filter to deal with the noise in the Normal Enhanced Range Image (NERI) of the SR-3000. The SVD filter demonstrates some success in smoothing the surface. However, there is still a certain amount of corruption in the
NERI. In such a case, a pixel-by-pixel region-growing method cannot perform segmentation well, as it is susceptible to disturbances in local features. A global criterion is therefore required to segment a NERI, one that takes into account both the dissimilarity between segments and the total similarity within each segment (i.e., among its image pixels).
In this paper we present a new range image segmentation
method based on the Normalized Cuts (NC) method [13].
The NC method was originally proposed for the
segmentation of intensity images and it uses the total
dissimilarity between groups and the total similarity within
the groups to partition an image. It may result in inappropriate grouping of pixels when an object does not have a distinctive dissimilarity from the background. This problem may be alleviated in segmenting a range image, since additional metrics, such as the surface normal of the least-square plane fit to a segment's data points and the fitting error, may be used to evaluate the correctness of the segmentation.
The remainder of this paper is organized as follows. In the following section we briefly describe the NC method. In Section III we explain our proposed method for extracting planar features from range data. In Section IV we present experimental results, followed by Section V, where we discuss a recursive method to extract planar pixels from misclassified clusters. The paper is concluded in Section VI, where we discuss some directions for our future work.
II. IMAGE SEGMENTATION USING NORMALIZED CUTS
A. Image segmentation as a graph partitioning problem
Image segmentation can be modeled as a graph partitioning problem. An image is represented as a weighted undirected graph G = (V, E), wherein each pixel is considered as a node i ∈ V and an edge (i, j) ∈ E is formed between each pair of nodes. The weight of each edge is recorded in a Pixel Similarity Matrix (PSM) and is calculated as a function of the similarity between the corresponding pair of nodes. In partitioning an image into disjoint sets of pixels, or segments, V_1, V_2, V_3, ..., V_m, the goal is to maximize the similarity of nodes within a subset V_i and to minimize the similarity across different sets. For the NC algorithm, the optimal bipartition of a graph into two sub-graphs A and B is the one that minimizes the Ncut value given by

$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}, \qquad (1)$$

where cut(A, B) = Σ_{u∈A, v∈B} w(u, v) is the dissimilarity between A and B, and w(i, j) is the weight calculated as a function of the similarity between nodes i and j. assoc(A, V) = Σ_{u∈A, t∈V} w(u, t) is the total connection from the nodes in A to all nodes in V, and assoc(B, V) is defined similarly. From (1) we can see that the minimization maintains a high similarity among the nodes in A and a low similarity across the two sets A and B. Given a partition that separates the graph's node set V into two sets A and B, let x be an N = |V| dimensional indicator vector with x_i = 1 if node i is in A and x_i = −1 otherwise, and let d_i = Σ_j w(i, j) be the total connection from node i to all other nodes. With these definitions, Ncut(A, B) in (1) can be calculated. According to [13], an approximate discrete solution minimizing Ncut(A, B) can be obtained by solving

$$\min_{x} \mathrm{Ncut}(x) = \min_{y} \frac{y^{T}(D-W)y}{y^{T}Dy}, \qquad (2)$$

where D = diag(d_1, d_2, ..., d_N), W = [w_{ij}], and y = (1 + x) − b(1 − x) with b = (Σ_{x_i>0} d_i) / (Σ_{x_i<0} d_i). If y is relaxed to take real values, then (2) can be minimized by solving the following generalized eigenvalue system:

$$(D - W)y = \lambda D y. \qquad (3)$$
B. Grouping Algorithm
The grouping of pixels in an image I consists of the following steps:
a) Consider image I as an undirected graph G = (V, E) and construct a PSM. As stated before, each element of the PSM is an edge weight w(i, j), calculated by

$$w(i,j) = \exp\left(\frac{-\|F(i)-F(j)\|_{2}^{2}}{\sigma_{I}^{2}}\right) \cdot \exp\left(\frac{-\|X(i)-X(j)\|_{2}^{2}}{\sigma_{X}^{2}}\right)$$

if ||X(i) − X(j)||_2 < r, and w(i, j) = 0 otherwise. Here X(i) is the spatial location of node i and F(i) = I(i) is the brightness value of pixel i. Note that w(i, j) = 0 for any pair of nodes i, j that are more than r pixels apart. The rationale for calculating w(i, j) in this manner is that two pixels with similar brightness values that are spatially near each other more likely belong to the same object than two pixels with different brightness values that are distant from each other.
b) Solve (3) for the eigenvectors with the smallest eigenvalues.
c) Use the eigenvector with the second smallest eigenvalue to bipartition the image by finding the splitting point such that the Ncut value is minimized.
d) Recursively re-partition the segments (go to step a).
e) Exit if the Ncut value for every segment is over some specified threshold.
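The steps above can be sketched with a minimal dense implementation. This is a sketch, not the paper's code: the parameter values are illustrative, and splitting at the median of the eigenvector stands in for the Ncut-minimizing splitting-point search of step c).

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(img, r=2.0, sigma_i=0.1, sigma_x=4.0):
    """One Normalized-Cuts bipartition of a small grayscale image.

    Builds the pixel similarity matrix W of step a), the degree matrix D,
    solves the generalized eigensystem (D - W) y = lambda * D y of Eq. (3),
    and splits pixels by thresholding the second-smallest eigenvector.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    X = np.column_stack([ys.ravel(), xs.ravel()]).astype(float)  # locations
    F = img.ravel().astype(float)                                # brightness
    dx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # spatial distances
    dF2 = (F[:, None] - F[None, :]) ** 2                         # brightness distances^2
    W = np.exp(-dF2 / sigma_i**2) * np.exp(-(dx**2) / sigma_x**2)
    W[dx >= r] = 0.0                       # zero weight beyond radius r
    d = W.sum(axis=1)
    D = np.diag(d)
    vals, vecs = eigh(D - W, D)            # generalized eigenproblem, ascending order
    y = vecs[:, 1]                         # second-smallest eigenvector
    return (y > np.median(y)).reshape(h, w)

# Toy image: left half dark, right half bright -> two segments expected.
img = np.zeros((6, 6))
img[:, 3:] = 1.0
labels = ncut_bipartition(img)
```

On real images the dense N×N similarity matrix is impractical; [13] uses sparse matrices and Lanczos iteration instead.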
III. THE PROPOSED METHOD
In this work we adopt the method in [14] for range image enhancement. The authors of this paper demonstrated in [12] that the use of surface normals in the SR-3000’s range images makes the surfaces and edges of an object more distinct. We first construct a tri-band color image where each pixel’s RGB values represent the x and y components of its surface normal and its depth, respectively. The tri-band image is then converted to a gray image, which we call a Normal-Enhanced Range Image (NERI). The proposed segmentation method consists of three steps. First, the NC algorithm is applied to the NERI and partitions it into a number of segments; a pre-specified number of segments is needed in our current implementation. Second, the least-square plane to the data points in each segment is computed, and the plane-fitting statistics are used to label the segments as planar or non-planar. Third, adjacent planar surfaces with the same orientation are merged.
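As an illustration, the NERI construction can be sketched as follows. The channel scaling and the RGB-to-gray luminance weights here are our assumptions; the paper does not specify them.

```python
import numpy as np

def make_neri(nx, ny, depth):
    """Build a tri-band image (R = normal-x, G = normal-y, B = depth) and
    collapse it to a gray Normal-Enhanced Range Image (NERI)."""
    r = (nx + 1.0) / 2.0                                  # map [-1, 1] -> [0, 1]
    g = (ny + 1.0) / 2.0
    span = max(float(depth.max() - depth.min()), 1e-9)
    b = (depth - depth.min()) / span                      # normalize depth to [0, 1]
    return 0.299 * r + 0.587 * g + 0.114 * b              # standard luminance mix

# Toy data: flat normals, a left-to-right depth ramp.
nx = np.zeros((4, 4))
ny = np.zeros((4, 4))
depth = np.tile(np.linspace(0.0, 1.0, 4), (4, 1))
neri = make_neri(nx, ny, depth)
```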
To simplify the description we consider only the extraction of vertical and horizontal planes in 3D space. As shown in Fig. 1, a vertical plane is defined as one whose normal direction is along the −Y axis (Fig. 1a), and a horizontal plane as one whose normal direction is along the Z axis (Fig. 1b).
(a) (b)
Fig.1 Diagram of vertical and horizontal planes
For the N segments resulting from the NC method, we need to identify those that best describe either vertical or horizontal planes. To do this we perform a Least-Square Plane (LSP) fit to the data points associated with each of the N segments and calculate the normal direction and the fitting error. Let the normal to the LSP be denoted by N = (n_x, n_y, n_z). The residual of the fit, also known as the Plane Fit Error (PFE), is computed by

$$\varepsilon = \frac{1}{P}\sum_{k=1}^{P} d_{k},$$

where P denotes the number of pixels in the segment and d_k is the distance between the kth data point (x_k, y_k, z_k) and the LSP. The LSP is found by minimizing ε. The minimization can be obtained by the Singular Value Decomposition (SVD) method. First, the following matrix is constructed from the data points of the segment:

$$\mathbf{M} = \begin{bmatrix} x_{1}-x_{0} & y_{1}-y_{0} & z_{1}-z_{0} \\ x_{2}-x_{0} & y_{2}-y_{0} & z_{2}-z_{0} \\ \vdots & \vdots & \vdots \\ x_{P}-x_{0} & y_{P}-y_{0} & z_{P}-z_{0} \end{bmatrix},$$

where (x_0, y_0, z_0) = ((1/P)Σ_{k=1}^{P} x_k, (1/P)Σ_{k=1}^{P} y_k, (1/P)Σ_{k=1}^{P} z_k) is the centroid of the data points. Then the eigenvalues λ_1, λ_2, ..., λ_p of M and their corresponding eigenvectors are computed. It can be proven that N equals the eigenvector corresponding to the minimum eigenvalue λ_min = min(λ_1, λ_2, ..., λ_p) and that ε equals λ_min/P. The deviations of the normal direction from the Y and Z axes are computed by θ_y = cos⁻¹(n_y) and θ_z = cos⁻¹(n_z), respectively. The value of the PFE determines whether the segment forms a planar surface, while the values of θ_y and θ_z determine whether the plane is vertical or horizontal. Specifically, the data points in a segment whose PFE is sufficiently small form a planar surface, and the planar segment is vertical (horizontal) if the value of θ_y (θ_z) is sufficiently close to 0º.
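The LSP fit can be sketched with numpy as follows. This is a sketch: here the PFE and the deviation angle are computed directly from the point-to-plane distances rather than from the eigenvalues, and the test data is synthetic.

```python
import numpy as np

def fit_lsp(points):
    """Least-square plane via SVD of the centered data matrix M.

    points: (P, 3) array of (x, y, z). Returns the unit normal N of the LSP
    and the Plane Fit Error PFE = (1/P) * sum_k d_k.
    """
    M = points - points.mean(axis=0)          # subtract the centroid
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    normal = vt[-1]                           # direction of least variance
    d = np.abs(M @ normal)                    # point-to-plane distances
    return normal, d.mean()

# A nearly horizontal patch: z ~ 0.5, so the normal should be close to Z.
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(0, 1, 200),
                       rng.uniform(0, 1, 200),
                       0.5 + rng.normal(0, 1e-3, 200)])
n, pfe = fit_lsp(pts)
theta_z = np.degrees(np.arccos(abs(n[2])))    # deviation from the Z axis
```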
Extracting an entire plane from the scene involves merging two or more planar segments. In our work, merging is performed only if two segments are neighbors. Figure 2 shows three typical configurations of two close-by segments; the common boundary between the two segments is highlighted in green. Two segments are considered neighbors if there exist at least two consecutive common points (i.e., points that belong to both segments and are continuous in space) on their boundaries. Thus the segments in Fig. 2a do not qualify as neighbors, since they have a single common point, whereas the two segments in Fig. 2b or Fig. 2c are considered neighbors.
(a) (b) (c)
Fig.2 Definition of neighboring segments: for simplicity the segments are
drawn as rectangles. (a) two non-neighboring segments, (b) and (c) two
neighboring segments.
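The neighbor test can be sketched on boolean segment masks. The encoding is our approximation of the definition above: the shared boundary is taken as the pixels of one segment that 4-touch the other, and "consecutive" as two such pixels being 8-adjacent.

```python
import numpy as np

def are_neighbors(mask_a, mask_b):
    """Return True if two segment masks share at least two consecutive
    common boundary points, per the definition illustrated in Fig. 2."""
    contact = np.zeros_like(mask_a)
    # Pixels of A that have a 4-connected neighbor in B.
    contact[1:, :]  |= mask_a[1:, :]  & mask_b[:-1, :]   # neighbor above
    contact[:-1, :] |= mask_a[:-1, :] & mask_b[1:, :]    # neighbor below
    contact[:, 1:]  |= mask_a[:, 1:]  & mask_b[:, :-1]   # neighbor left
    contact[:, :-1] |= mask_a[:, :-1] & mask_b[:, 1:]    # neighbor right
    pts = np.argwhere(contact)
    # Two contact points that are 8-adjacent count as "consecutive".
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if np.max(np.abs(pts[i] - pts[j])) == 1:
                return True
    return False

# Fig. 2b analogue: two rectangles sharing an edge.
a = np.zeros((4, 4), dtype=bool); a[:, :2] = True
b = np.zeros((4, 4), dtype=bool); b[:, 2:] = True
# Fig. 2a analogue: two squares touching only at a corner.
c = np.zeros((4, 4), dtype=bool); c[:2, :2] = True
d = np.zeros((4, 4), dtype=bool); d[2:, 2:] = True
```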
Our proposed method for extracting vertical and
horizontal planes in a range image is as follows:
1) Construct the NERI and apply the NC algorithm to partition the NERI into N segments c_i, i = 1, ..., N.
2) The segments c_i whose PFE satisfies ε_i < δ are selected to form a set of planar segments S = {s_1, s_2, s_3, ..., s_n}, i.e., a segment with a PFE smaller than δ is taken as a planar surface. Here n ≤ N and δ is a suitable threshold.
3) Each s_j ∈ S is then labeled as vertical or horizontal based on the normal direction of its LSP.
4) A pair of vertical or horizontal segments is merged if they are neighbors.
5) Terminate the process when all neighbors are merged.
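The merging stage amounts to a connected-components merge over same-label neighbors. A minimal union-find sketch follows; the segment ids, labels, and adjacency list are illustrative inputs, not the paper's data structures.

```python
def merge_segments(labels, adjacency):
    """Union-find merge of planar segments.

    labels: dict segment_id -> 'vertical' | 'horizontal' | None (non-planar)
    adjacency: iterable of (i, j) neighbor pairs.
    Returns dict segment_id -> root id of its merged plane.
    """
    parent = {s: s for s in labels}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]   # path halving
            s = parent[s]
        return s

    for i, j in adjacency:
        # Merge only neighboring segments carrying the same planar label.
        if labels[i] is not None and labels[i] == labels[j]:
            parent[find(i)] = find(j)
    return {s: find(s) for s in labels}

# Two neighboring vertical segments merge; the horizontal one stays apart.
roots = merge_segments({1: 'vertical', 2: 'vertical', 3: 'horizontal', 4: None},
                       [(1, 2), (2, 3), (3, 4)])
```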
IV. EXPERIMENTS AND RESULTS
We have validated our planar surface extraction method
through experiments in various indoor environments that contain the most commonly occurring structures. As mentioned before, our current method requires a pre-specified number of clusters N. In all our experiments we use N=100, which is larger than the actual number of planar segments each NERI contains. The reason for choosing such a large N is to ensure a correct segmentation, as demonstrated by the example in Fig. 3. The test scene is shown in Fig. 3a and its NERI is depicted in Fig. 3c. The NERI evidently has 5 segments, which are hand-labeled in Fig. 3c. We now apply the NC algorithm to Fig. 3c using the actual number of segments (N = 5). The result is shown in Fig. 3d, where we can observe a misclassification: cluster 5 contains two regions with different brightness, each representing a planar surface, and the two surfaces are perpendicular to one another in 3D space. To avoid such a misclassification we need to assign N a number much larger than the actual number of segments in a scene.
(a) (b)
(c) (d)
Fig. 3: Misclassification of the NC method using the exact number of
segments of a scene: (a) Actual Scene, (b) Raw range image of (a), (c)
NERI with labeled segments, (d) Segmentation results of the NC over the
NERI.
In all our segmentation results hereafter, we label a
vertical and a horizontal plane in blue and green,
respectively. In the first experiment we consider an indoor
scene with a stairway as depicted in Fig. 4a. The raw range
image of the SR-3000 is shown in Fig. 4b and the NERI
representation of the range data in Fig. 4c. The NC
algorithm partitions the NERI into 100 segments as shown
in Fig. 4d. Fig. 4e displays selected planar segments. A
token in blue indicates that the corresponding segment
belongs to a vertical plane and a token in green means that
the segment lies on a horizontal plane. Finally the extracted
planes are shown in Fig. 4f. We can see that the majority of the planar surfaces are correctly extracted. We can also observe in Fig. 4f that region P (marked in red) is under-extracted because its adjacent regions (circled in yellow) are misclassified. As we can see from Fig. 4e, pixels inside region A or B are classified as belonging to the same segment by the NC algorithm. However, each of these regions contains data points on different planar surfaces that are perpendicular to one another in 3D space, so the regions are considered non-planar segments due to their large PFEs. It should be noted that the misclassification does not have a big impact on recognizing the stairway and guiding the robot, because enough treads and risers are identified and the misclassification occurred at a location far away from the robot. We can also see that there are minor misclassifications at the right or left ends of some extracted planar segments; this suggests future research efforts that go beyond the scope of this paper. Fig. 4g
renders the segmentation result in a 3D point cloud where
the unclassified points are represented in black.
The second experiment shows the plane extraction of a
hallway. The result is depicted in Fig. 5. We can see that the
floor, door and the wall regions have been properly
extracted. Fig. 6 displays the result of our experiment in the
lobby in the ETAS building at the University of Arkansas at
Little Rock. The result demonstrates a satisfactory planar
feature extraction of our proposed method.
V. DISCUSSION
We have seen in the previous section that some segments
(mainly in the boundary regions) fail to qualify as planar
ones due to misclassification in the initial clustering phase.
In this section we put forward a recursive approach that may identify planar pixels within a non-planar segment and hence extract a plane in its entirety.
(a) (b)
(c) (d)
(e)
(f) (g)
Fig. 4 Segmentation of the scene with a stairway: (a) Actual scene, (b) Raw range image, (c) NERI of (b), (d) Initial clustering results from the NC, (e) Labeling of vertical and horizontal planar segments, (f) Segmentation results after merging the homogeneous segments in (e), (g) Segmented data shown in a 3D point cloud.
Consider Fig. 7a which is a magnified view of region A
in Fig.4f. Our objective is to extract the part of A that
belongs to the adjacent vertical plane P (Fig. 4f). To
achieve this we apply our proposed method recursively as
follows:
a) The NC algorithm is applied to A with N=2, breaking A into two sub-segments.
b) For each sub-segment, compute the PFE and the normal of its LSP.
c) A sub-segment is then merged with P or discarded based on the criterion set forth in Section III.
d) If none of the sub-segments is merged with P, the NC is applied to A again with N=N+1, i.e., the process is repeated with one more cluster.
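The recursion can be sketched as follows. The naive order-based splitter stands in for the NC partitioner, and the PFE threshold and toy data are illustrative, not the paper's.

```python
import numpy as np

def fit_pfe(points):
    """Mean point-to-plane distance of the SVD least-square plane (Sec. III)."""
    m = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(m, full_matrices=False)
    return float(np.abs(m @ vt[-1]).mean())

def refine_nonplanar(points, split, pfe_thresh=0.01, max_n=6):
    """Re-split a misclassified segment with N = 2, 3, ... clusters until
    at least one sub-segment passes the planarity test (steps a-d above)."""
    for n in range(2, max_n + 1):
        subs = split(points, n)
        planar = [s for s in subs if fit_pfe(s) < pfe_thresh]
        if planar:
            return n, planar          # clusters used and the mergeable parts
    return None, []

# Toy segment: a saddle (non-planar), a flat patch on z = 0, another saddle.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
saddle = np.column_stack([corners, [1.0, -1.0, -1.0, 1.0]])   # not coplanar
flat = np.column_stack([corners + 0.25, np.zeros(4)])         # lies on z = 0
pts = np.concatenate([saddle, flat, saddle + [0.0, 0.0, 0.5]])
n_used, planar = refine_nonplanar(pts, lambda p, n: np.array_split(p, n))
```

With N=2 every chunk mixes saddle and flat points, so no sub-segment qualifies; with N=3 the flat patch is isolated and passes the PFE test, mirroring the behavior described in the text.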
In this example, we did not make a merger with N=2. But
with N=3 we obtained three sub-segments (C, D, E in Fig.
7a). Through steps b) and c), a larger segment (Fig. 7b) is
formed with C joining P. From this we can see that a finer
segmentation can be achieved by further splitting the
non-planar segments that are adjacent to the extracted
segments (blue and green segments in this case).
(a) (b)
Fig. 7: (a) Enlarged view of a misclassified cluster, (b) Segmentation result
after extracting planar region from (a).
Currently, the proposed method is implemented in Matlab using the Normalized Cuts library [15]; as a consequence it does not run in real time. Efforts are being made to translate the code entirely into C/C++, which we expect will significantly reduce the runtime.
VI. CONCLUSION AND FUTURE WORK
We have presented a method that may reliably extract the
vertical and horizontal planes from the range images
captured by the SwissRanger SR-3000. We use a split and
merge approach to achieve this. In the proposed method,
surface normal information is used to convert a range image
into a NERI for image enhancement. To deal with surface
normal errors caused by noise in range data we apply the
NC algorithm over a NERI to get homogenous segments.
They are merged to form larger segments (horizontal and
vertical planes) based on the LSP-fitting statistics. Our method works efficiently without prior knowledge of the number of vertical or horizontal planes in a scene. We have also demonstrated that under-extraction of planes due to misclassification of non-planar segments can be resolved by further splitting the related segment and applying our method to the sub-segments. We have validated the method's efficacy by real experiments in various indoor
environments. Although the method is intended for the
segmentation of flat surfaces, it can be adapted to handle
non-flat cases as well. One possible approach is to assume
that a non-flat surface comprises a number of planar
segments with small changes in their orientations
(normals). The merging of neighboring segments then takes
place based on the rate of change of their normal directions.
(a) (b)
(c) (d)
(e) (f)
Fig. 5 Segmentation of the scene with a hallway: (a) Actual scene, (b) Range image from the SR-3000, (c) NERI of (b), (d) Initial clustering results from the NC, (e) Segmentation results after merging clusters, (f) Extracted points in 3D.
(a) (b)
(c) (d)
(e) (f)
Fig. 6 Segmentation results of the scene with a lobby: (a) Actual scene, (b) Range image from the SR-3000, (c) NERI of (b), (d) Initial clustering results from the NC, (e) Segmentation results after merging clusters, (f) Extracted points in 3D.
It should be noted that we use a gray image for the NERI in order to use the Normalized Cuts library. The mapping of a tri-band image pixel to a NERI pixel is many-to-one. If two non-parallel planes happen to have orientations such that their pixels have the same brightness in their NERI representations, the distinctness of the planes in the NERI will be very small. This may add difficulty to the N-Cuts method. Fortunately, in this case the intersection of the planes (a straight line segment) has a normal direction different from both planes, which will likely help the N-Cuts method locate the correct planar segments. In case the N-Cuts method fails, we will have to use the tri-band color image representation for NERIs and develop a new N-Cuts method to segment the color NERIs. We will carry out more experiments to test this point in our future work. Another direction for future research is to develop a method that may find the optimal number of clusters for the N-Cuts method, which might improve the execution time of our method in both the splitting and merging phases.
The method proposed in this paper can be employed by a
mobile robot for autonomous navigation in indoor
environments.
REFERENCES
[1] R. Unnikrishnan and M. Hebert, “Robust extraction of multiple
structures from non-uniformly sampled data,” in Proc. IEEE/RSJ
International Conference on Intelligent Robots and Systems, 2003,
pp. 1322-1329.
[2] R. Triebel, W. Burgard, and F. Dellaert, “Using hierarchical EM to
extract planes from 3d range scans,” in Proc. IEEE International
Conference on Robotics and Automation, 2005, pp. 4437-4442.
[3] D. Venable and M. Uijt de Haag, “Near real-time extraction of planar features from 3D flash-LADAR video frames,” in Proc. SPIE, vol. 6977, Optical Pattern Recognition, 2008, pp. 69770N-1–69770N-12.
[4] I. Stamos and P. K. Allen, “3-D model construction using range and
image data,” in Proc. IEEE International Conference on Computer
Vision and Pattern Recognition, 2000, pp. 531-536.
[5] A. Bab-Hadiashar and N. Gheissari, “Range image segmentation
using surface selection criterion,” IEEE Transactions on Image
Processing, vol. 15, no. 7, pp. 2006-2018, 2006.
[6] A. D. Sappa, “Automatic extraction of planar projections from
panoramic range images,” in Proc. 2nd International Symposium on
3D Data Processing, Visualization and Transmission, 2004, pp.
231-234.
[7] T. Oggier, et al., “An all-solid-state optical range camera for 3D
real-time imaging with sub-centimeter depth resolution,” in Proc.
SPIE, 2003, vol. 5249, pp. 534-545.
[8] T. Oggier, B. Büttgen, and F. Lustenberger, “SwissRanger SR3000 and
first experiences based on miniaturized 3D-TOF Cameras,” Swiss
Center for Electronics and Microtechnology (CSEM) Technical
Report, 2005.
[9] C. Ye and J. Borenstein, “Characterization of a 2-D laser scanner for
mobile robot obstacle negotiation,” in Proc. IEEE International
Conference on Robotics and Automation, 2002, pp. 2512-2518.
[10] S. A. Gudmundsson, H. Aanaes, and R. Larsen, “Environmental
effects on measurement uncertainties of time-of-flight cameras,” in
International Symposium on Signals, Circuits and Systems, 2007, pp.
1-4.
[11] Y. M. Kim, D. Chan, C. Theobalt, and S. Thrun, “Design and
calibration of a multi-view TOF sensor fusion system,” in Proc.
IEEE Computer Society Conference on Computer Vision and Pattern
Recognition Workshops, 2008, pp. 1-7.
[12] G. M. Hegde and C. Ye, “SwissRanger SR-3000 range images
enhancement by a singular value decomposition filter,” in Proc.
IEEE International Conference on Information and Automation,
2008, pp. 1626-1631.
[13] J. Shi and J. Malik, “Normalized cuts and image segmentation,”
IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 22, no. 8, pp. 888-905, 2000.
[14] K. Pulli, “Vision methods for an autonomous machine based on
range imaging,” Master’s Thesis, University of Oulu.
[15] http://www.cis.upenn.edu/~jshi/software/
... It assumes that edges are the borders between different planar surfaces that have to be segmented separately. A region-growing algorithm [13]- [17] selects some seed points in range data and grows them into regions based on the homogeneity of local features (see surface normal, point-toplane distance, etc.). Xiao et al. proposed a cached-octree region-growing (CORG) algorithm [14] for extracting planar segments from a 3-D point set. ...
... Since the method stops region-growing merely based on distance information, over-extraction may occur at the intersection of two planes. The invariants of this pointbased region-growing method are the grid-based [16] and super-pixel-(SP-)based [17] algorithms. The grid-based regiongrowing algorithm [16] divides a range image into a number of small grids whose attributes are computed and used for region growth. ...
Data
Full-text available
... They propose additionally a final refinement step that involves the alignment of corresponding surface normals leading to improved 3D scene maps computed at frame rate. The normal of the extracted planes is also used by Hedge and Ye [14] to detect badly conditioned plane detection, as horizontal planes in a staircase. Also Pathak et al. [15] have reported the use of ToF to extract planes for 3D mapping. ...
... This is applicable when considering a simple shaped robot, i.e. one that can be approximated by a cylinder, and it Obstacle avoidance in static env. 3D at high rate SR2 (depth) May et al. [11,12] 3D mapping 3D at high rate/No required Pan-Tilt SR2 (depth) May et al. [13] Pose estimation/3D mapping Registered depth-intensity SR3 (depth + intensity) Hedge and Ye [14] Planar feature 3D mapping 3D at high rate/No required Pan-Tilt SR3 Ohno et al. [16] 3D mapping 3D at high rate SR2 Stipes et al. [17] 3D mapping / Point selection Registered depth-intensity SR3 May et al. [18] 3D mapping/SLAM 3D at high rate SR3 Gemeiner et al. [19] Corner filtering Registered depth-intensity SR3 (depth + intensity) Thielemann et al. [20] Navigation in pipelines 3D allow geometric primitives search SR3 Sheh et al. [21] Navigation in hard env. 3D at high rate SR3 + inertial Swadzba et al. [22] 3D mapping in dynamic env. ...
Article
Full-text available
ToF cameras are now a mature technology that is widely being adopted to provide sensory input to robotic applications. Depending on the nature of the objects to be perceived and the viewing distance, we distinguish two groups of applications: those requiring to capture the whole scene and those centered on an object. It will be demonstrated that it is in this last group of applications, in which the robot has to locate and possibly manipulate an object, where the distinctive characteristics of ToF cameras can be better exploited. After presenting the physical sensor features and the calibration requirements of such cameras, we review some representative works highlighting for each one which of the distinctive ToF characteristics have been more essential. Even if at low resolution, the acquisition of 3D images at frame-rate is one of the most important features, as it enables quick background/foreground segmentation. A common use is in combination with classical color cameras. We present three developed applications, using a mobile robot and a robotic arm, to exemplify with real images some of the stated advantages.
... Plane extraction methods in the literature may be broadly classified into two categories: region-growing/clustering algorithms and model-fitting methods. A region-growing algorithm [10], [11], [12] selects some seed points in range data and grows them into regions based on the homogeneity of local features (e.g., surface normal, point-to-plane distance, etc.). The method may be grid based [11] or Super-Pixel (SP) based [12]. The grid-based region-growing algorithm [11] divides a range image into many small grids, and the attribute of each grid is computed and used for region growth. ...
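The seed-growing idea described in these excerpts can be illustrated with a minimal sketch (Python/NumPy). The grid layout, 4-neighbourhood, and dot-product threshold below are illustrative choices, not details taken from [10]–[12]:

```python
import numpy as np
from collections import deque

def grow_region(normals, seed, dot_thresh=0.95):
    """Grow a region from `seed` over an (H, W, 3) grid of unit surface
    normals, adding 4-neighbours whose normal agrees with the seed's."""
    h, w, _ = normals.shape
    visited = np.zeros((h, w), dtype=bool)
    region = []
    queue = deque([seed])
    visited[seed] = True
    seed_n = normals[seed]
    while queue:
        r, c = queue.popleft()
        region.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not visited[nr, nc]:
                # homogeneity test: normals of coplanar points are parallel
                if np.dot(seed_n, normals[nr, nc]) > dot_thresh:
                    visited[nr, nc] = True
                    queue.append((nr, nc))
    return region

# toy range image: left half faces +z (a floor), right half faces +x (a wall)
normals = np.zeros((4, 6, 3))
normals[:, :3, 2] = 1.0
normals[:, 3:, 0] = 1.0
region = grow_region(normals, (0, 0))
```

Growing from the top-left seed recovers exactly the 12 floor pixels; the wall pixels fail the normal-homogeneity test and are left for another seed.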
Article
Full-text available
This paper presents a new plane extraction (PE) method based on the random sample consensus (RANSAC) approach. The generic RANSAC-based PE algorithm may over-extract a plane, and it may fail in the case of a multistep scene where the RANSAC procedure results in multiple inlier patches that form a slant plane straddling the steps. The CC-RANSAC PE algorithm successfully overcomes the latter limitation if the inlier patches are separate. However, it fails if the inlier patches are connected. A typical scenario is a stairway with a stair wall, where the RANSAC plane-fitting procedure results in inlier patches in the tread, riser, and stair wall planes. They connect together and form a plane. The proposed method, called normal-coherence CC-RANSAC (NCC-RANSAC), performs a normal coherence check on all data points of the inlier patches and removes the data points whose normal directions are contradictory to that of the fitted plane. This process results in separate inlier patches, each of which is treated as a candidate plane. A recursive plane clustering process is then executed to grow each of the candidate planes until all planes are extracted in their entireties. The RANSAC plane-fitting and the recursive plane clustering processes are repeated until no more planes are found. A probabilistic model is introduced to predict the success probability of the NCC-RANSAC algorithm and validated with real data of a 3-D time-of-flight camera, the SwissRanger SR4000. Experimental results demonstrate that the proposed method extracts more accurate planes with less computational time than the existing RANSAC-based methods.
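The normal-coherence check at the heart of NCC-RANSAC can be sketched as follows. This is a simplified illustration: the plane is fitted by SVD rather than by RANSAC sampling, and the recursive clustering stage is omitted; all names are ours, not the paper's:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through `points` (N, 3): returns (centroid, unit normal)."""
    centroid = points.mean(axis=0)
    # the plane normal is the right singular vector of the smallest singular value
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]

def normal_coherence_filter(points, point_normals, cos_thresh=0.8):
    """Keep only inliers whose own surface normal agrees (up to sign)
    with the normal of the plane fitted through all of them."""
    _, plane_n = fit_plane(points)
    agree = np.abs(point_normals @ plane_n) > cos_thresh
    return points[agree], agree

# 110 points all lying on the floor z = 0; the last 10 carry contradictory
# per-point normals (e.g. mislabelled edge points near a wall)
rng = np.random.default_rng(0)
xy = rng.random((110, 2))
pts = np.column_stack([xy, np.zeros(110)])
nrm = np.vstack([np.tile([0.0, 0.0, 1.0], (100, 1)),   # true floor normals
                 np.tile([1.0, 0.0, 0.0], (10, 1))])   # contradictory normals
kept, mask = normal_coherence_filter(pts, nrm)
```

The fitted plane is the floor, so the 100 coherent points survive and the 10 normal-contradicting points are removed, mirroring the paper's separation step.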
Conference Paper
Full-text available
This paper presents a new RANSAC-based method for extracting planes from 3D range data. The generic RANSAC Plane Extraction (PE) method may over-extract a plane. It may fail in the case of a multi-step scene where the RANSAC process results in multiple inlier patches that form a slant plane straddling the steps. The CC-RANSAC algorithm overcomes the latter limitation if the inlier patches are separate. However, it fails when the inlier patches are connected. A typical scenario is a stairway with a stairwall. In this case the RANSAC plane-fitting produces inlier patches (in the tread, riser and stairwall planes) that connect together to form a plane. The proposed method, called NCC-RANSAC, performs a normal-coherence check on all data points of the inlier patches and removes those points whose normal directions are contradictory to that of the fitted plane. This procedure results in a set of separate inlier patches, each of which is then extended into a plane in its entirety by a recursive plane clustering process. The RANSAC plane-fitting and recursive plane clustering processes are repeated until no more planes are found. A probabilistic model is introduced to predict the success probability of the NCC-RANSAC method and validated with the real data of a 3D camera, the SwissRanger SR4000. Experimental results demonstrate that the proposed method extracts more accurate planes with less computational time than the existing RANSAC-based methods. The proposed method is intended to be used by a robotic navigational device for the visually impaired for object detection/recognition in indoor environments.
... Compared with color images, range images are less sensitive to changes in the environment illumination, object color or texture. Existing algorithms for range image segmentation focus mainly on segmenting planar surfaces or regular curved surfaces [12][13][14][15][16][17]. The principle of these algorithms is to divide the image into closed regions with similar surface functions. ...
Article
Full-text available
In the factory of the future, most of the operations will be done by autonomous robots that need visual feedback to move around the working space avoiding obstacles, to work collaboratively with humans, to identify and locate the working parts, to complete the information provided by other sensors to improve their positioning accuracy, etc. Different vision techniques, such as photogrammetry, stereo vision, structured light, time of flight and laser triangulation, among others, are widely used for inspection and quality control processes in the industry and now for robot guidance. Choosing which type of vision system to use is highly dependent on the parts that need to be located or measured. Thus, in this paper a comparative review of different machine vision techniques for robot guidance is presented. This work analyzes accuracy, range and weight of the sensors, safety, processing time and environmental influences. Researchers and developers can take it as background information for their future works.
Article
We present an odometry-free three-dimensional (3D) point cloud registration strategy for outdoor environments based on area attributed planar patches. The approach is split into three steps. The first step is to segment each point cloud into planar segments, utilizing a cached-octree region growing algorithm, which does not require the 2.5D image-like structure of organized point clouds. The second step is to calculate the area of each segment based on small local faces inspired by the idea of surface integrals. The third step is to find segment correspondences between overlapping point clouds using a search algorithm, and compute the transformation from determined correspondences. The transformation is searched globally so as to maximize a spherical correlation-like metric by enumerating solutions derived from potential segment correspondences. The novelty of this step is that only the area and plane parameters of each segment are employed, and no prior pose estimation from other sensors is required. Four datasets have been used to evaluate the proposed approach, three of which are publicly available and one that stems from our custom-built platform. Based on these datasets, the following evaluations have been done: segmentation speed benchmarking, segment area calculation accuracy and speed benchmarking, processing data acquired by scanners with different fields of view, comparison with the iterative closest point algorithm, robustness with respect to occlusions and partial observations, and registration accuracy compared to ground truth. Experimental results confirm that the approach offers an alternative to state-of-the-art algorithms in plane-rich environments.
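The "area from small local faces" idea in the second step can be sketched on an organized point grid. This is a simplification for illustration: the paper's cached-octree method works on unorganized clouds, and the function name is ours:

```python
import numpy as np

def patch_area(grid):
    """Approximate the surface area of an organized (H, W, 3) point grid by
    summing the areas of the two triangles in every grid quad."""
    p00 = grid[:-1, :-1]; p01 = grid[:-1, 1:]
    p10 = grid[1:, :-1];  p11 = grid[1:, 1:]
    # triangle area = half the norm of the edge cross product
    t1 = 0.5 * np.linalg.norm(np.cross(p01 - p00, p10 - p00), axis=-1)
    t2 = 0.5 * np.linalg.norm(np.cross(p01 - p11, p10 - p11), axis=-1)
    return float(t1.sum() + t2.sum())

# flat 1 m x 1 m patch sampled on an 11x11 grid: area should be ~1.0
ys, xs = np.meshgrid(np.linspace(0, 1, 11), np.linspace(0, 1, 11), indexing="ij")
grid = np.stack([xs, ys, np.zeros_like(xs)], axis=-1)
area = patch_area(grid)
```

Summing small faces in this way is the discrete analogue of the surface integral the paper appeals to, and it works for curved segments as well as planar ones.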
Conference Paper
Three-dimensional (3D) data processing has recently acquired greater importance in solving complex tasks such as object recognition, environment modeling, and robotic mapping and localization. Since using raw 3D data without preprocessing is very time-consuming, extraction of geometric features that describe the environment concisely is essential. A plane is a suitable geometric feature due to its richness and simplicity of extraction. This paper presents an online incremental plane extraction method using line segments. Since our system is based on a nodding laser scanner, we exploit the incremental nature of data acquisition, in which physical rotation and algorithm implementation are conducted in parallel. In contrast to other plane extraction methods, line segments defined by two end points become the supporting elements that comprise a plane, so we need not handle all the scan points once the line segments are extracted from each scan slice. This reduces the algorithm's complexity and the computation time. Experimental validation and comparison with a state-of-the-art method were conducted using tens of complete scan data sets acquired from a typical indoor environment.
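The segment-supported growing can be sketched roughly as follows (illustrative only: a plane is supported by the endpoints of the segments it has absorbed, so individual scan points need not be revisited; threshold and names are assumptions):

```python
import numpy as np

def add_segment(plane_pts, seg, dist_thresh=0.02):
    """Try to absorb line segment `seg` (2, 3 array of end points) into the
    growing plane supported by the endpoint list `plane_pts`.
    Returns True and extends the support set on success."""
    pts = np.asarray(plane_pts, dtype=float)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]                       # current plane normal
    # accept only if both end points lie close to the current plane
    if np.max(np.abs((seg - centroid) @ n)) < dist_thresh:
        plane_pts.extend(seg.tolist())
        return True
    return False

# plane seeded by two segments spanning the unit square in z = 0
plane_pts = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]]
ok_in = add_segment(plane_pts, np.array([[0.2, 0.3, 0.0], [0.8, 0.1, 0.0]]))
ok_out = add_segment(plane_pts, np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0]]))
```

An in-plane segment from a later scan slice is absorbed; a segment one metre above the plane is rejected, so it would go on to seed a different plane.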
Conference Paper
Full-text available
Studies have been conducted to detect object planes in 3D using 3D distance-measurement cameras such as the SwissRanger SR-4000. In those studies, objects in the gray-scale range image from the camera are first emphasized. The emphasized image is then grouped according to depth information, and plane parameters are calculated using the least-squares method. The emphasis is sufficiently sensitive, but the ambiguity of the gray-scale range image causes false detections. To address this problem, a new plane detection method using the 3D Hough transform is proposed. Experiments show that the new method is effective and yields better results than the previous methods.
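A minimal 3D Hough transform for plane detection, in the spirit of this proposal, can be sketched as below. The (theta, phi, rho) parameterization of the plane n·p = rho and the accumulator resolutions are illustrative assumptions:

```python
import numpy as np

def hough_planes(points, n_theta=18, n_phi=9, rho_res=0.05, rho_max=2.0):
    """Vote every point into a (theta, phi, rho) accumulator; the strongest
    cell approximates the dominant plane  n . p = rho  with
    n = (cos t sin p, sin t sin p, cos p)."""
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    phis = np.linspace(0, np.pi, n_phi, endpoint=False)
    n_rho = round(2 * rho_max / rho_res)
    acc = np.zeros((n_theta, n_phi, n_rho), dtype=int)
    for i, t in enumerate(thetas):
        for j, p in enumerate(phis):
            n = np.array([np.cos(t) * np.sin(p), np.sin(t) * np.sin(p), np.cos(p)])
            rho = points @ n
            k = np.clip(((rho + rho_max) / rho_res).astype(int), 0, n_rho - 1)
            np.add.at(acc, (i, j, k), 1)   # unbuffered voting, repeats allowed
    bi, bj, bk = np.unravel_index(acc.argmax(), acc.shape)
    return thetas[bi], phis[bj], -rho_max + (bk + 0.5) * rho_res

# 200 points on the horizontal plane z = 0.5
rng = np.random.default_rng(1)
pts = np.column_stack([rng.random((200, 2)), np.full(200, 0.5)])
theta, phi, rho = hough_planes(pts)
```

All 200 points vote into the same cell for the horizontal-normal direction (phi = 0), so the recovered plane is z ≈ 0.5 up to the rho bin width.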
Article
This paper focuses on three-dimensional (3D) point cloud plane segmentation. Two complementary strategies are proposed for different environments, i.e., a subwindow-based region growing (SBRG) algorithm for structured environments, and a hybrid region growing (HRG) algorithm for unstructured environments. The point cloud is decomposed into subwindows first, using the points’ neighborhood information when they are scanned by the laser range finder (LRF). Then, the subwindows are classified as planar or nonplanar based on their shape. Afterwards, only planar subwindows are employed in the former algorithm, whereas both kinds of subwindows are used in the latter. In the growing phase, planar subwindows are investigated directly (in both algorithms), while each point in nonplanar subwindows is investigated separately (only in HRG). During region growing, plane parameters are computed incrementally when a subwindow or a point is added to the growing region. This incremental methodology makes the plane segmentation fast. The algorithms have been evaluated using real-world datasets from both structured and unstructured environments. Furthermore, they have been benchmarked against a state-of-the-art point-based region growing (PBRG) algorithm with regard to segmentation speed. According to the results, SBRG is 4 and 9 times faster than PBRG when the subwindow size is set to 3×3 and 4×4 respectively; HRG is 4 times faster than PBRG when the subwindow size is set to 4×4. Open-source code for this paper is available at https://github.com/junhaoxiao/TAMS-Planar-Surface-Based-Perception.git.
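The incremental plane-parameter update that makes this kind of region growing fast can be sketched with running sums (a simplified stand-in for the paper's formulation; class and method names are ours):

```python
import numpy as np

class IncrementalPlane:
    """Plane fit whose parameters are updated in O(1) per added point by
    keeping running sums, so the fit need not be recomputed from scratch
    as a region grows."""
    def __init__(self):
        self.n = 0
        self.s = np.zeros(3)         # running sum of points
        self.ss = np.zeros((3, 3))   # running sum of outer products

    def add(self, p):
        p = np.asarray(p, dtype=float)
        self.n += 1
        self.s += p
        self.ss += np.outer(p, p)

    def params(self):
        """Return (centroid, unit normal) derived from the running sums."""
        c = self.s / self.n
        cov = self.ss / self.n - np.outer(c, c)
        w, v = np.linalg.eigh(cov)   # eigenvalues in ascending order
        return c, v[:, 0]            # smallest-variance direction = normal

plane = IncrementalPlane()
for p in [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1), (0.5, 0.5, 1)]:
    plane.add(p)
c, n = plane.params()
```

Adding a point only touches the three sums; the centroid and normal are read out on demand, which is the idea behind the paper's claimed speedups.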
Article
Full-text available
The latest development in range-imaging 3D cameras, the so-called SwissRanger3000 (SR3000), is presented. The SR3000 includes a modularly built electronics stack. This modular stack allows for easy adjustments of the camera to specific customer requirements. The core of the SR3000 is comprised of a background-suppressing 3D sensor with QCIF resolution. Furthermore, the first evaluation results of field tests using a 3D camera developed at CSEM are presented. The outcome of the evaluation phase for the automotive industry is discussed. A new development using the camera as a detecting device for virtual interaction on a large screen has successfully been implemented. Finally, the latest breakthrough in distance accuracy enables the 3D-TOF camera to enter the biometrics market.
Article
Full-text available
This paper describes a novel method used to extract planar surfaces from a stream of 3D images in near real-time. The method currently operates on 3D images acquired from a MESA SwissRanger SR-3000 infrared time-of-flight camera, which operates in a manner similar to flash-ladar sensors; the camera provides the user with range and intensity values for each pixel in the 176 by 144 image frame. After application of the camera calibration, the range measurement associated with each pixel can be converted to a Cartesian coordinate. First, the proposed method splits the focal image plane into sub-images or sub-windows. The method then operates in the 3D parameter space to find an estimate of the planar equation best describing the point cloud associated with the window pixels and to compute a metric that defines how well the sub-window points fit the planar estimate. The best-fit sub-window is then used as an initialization to one of two investigated methods: a parameter-based search technique and cluster validation using histogram thresholding to extract the entire plane from the 3D image frame. Once a plane is extracted, a feature vector describing that plane along with its describing statistics can be generated. These feature vectors can then be used to enable feature-based navigation. The paper fully describes the feature extraction method and provides results of applying it to extract features from indoor 3D video data obtained with the MESA SwissRanger SR-3000. Also provided is a brief overview of the generation of feature statistics and their importance.
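The first stage, splitting the focal plane into sub-windows and scoring each planar fit, might look like the sketch below. The window size and the RMS point-to-plane metric are our assumptions, not the paper's exact choices:

```python
import numpy as np

def best_subwindow(points, win=4):
    """Split an organized (H, W, 3) point image into win x win sub-windows,
    fit a least-squares plane to each, and return the (row, col) of the
    window with the lowest RMS point-to-plane error, plus that error."""
    h, w, _ = points.shape
    best, best_err = None, np.inf
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            pts = points[r:r + win, c:c + win].reshape(-1, 3)
            centroid = pts.mean(axis=0)
            _, _, vt = np.linalg.svd(pts - centroid)
            # residual along the fitted normal = point-to-plane distance
            err = np.sqrt(np.mean(((pts - centroid) @ vt[-1]) ** 2))
            if err < best_err:
                best, best_err = (r, c), err
    return best, best_err

# synthetic 8x8 point image: left half perfectly planar, right half rough
rng = np.random.default_rng(2)
ys, xs = np.meshgrid(np.arange(8.0), np.arange(8.0), indexing="ij")
z = np.zeros((8, 8))
z[:, 4:] = rng.random((8, 4))
img = np.stack([xs, ys, z], axis=-1)
(win_r, win_c), err = best_subwindow(img)
```

The best-fit window lands in the planar half, and its near-zero residual makes it a good initialization for extending the plane across the frame, as the paper describes.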
Article
Full-text available
A new miniaturized camera system that is capable of 3-dimensional imaging in real-time is presented. The compact imaging device is able to entirely capture its environment in all three spatial dimensions. It reliably and simultaneously delivers intensity data as well as range information on the objects and persons in the scene. The depth measurement is based on the time-of-flight (TOF) principle. A custom solid-state image sensor allows the parallel measurement of the phase, offset and amplitude of a radio frequency (RF) modulated light field that is emitted by the system and reflected back by the camera surroundings without requiring any mechanical scanning parts. In this paper, the theoretical background of the implemented TOF principle is presented, together with the technological requirements and detailed practical implementation issues of such a distance measuring system. Furthermore, the schematic overview of the complete 3D-camera system is provided. The experimental test results are presented and discussed. The present camera system can achieve sub-centimeter depth resolution for a wide range of operating conditions. A miniaturized version of such a 3D solid-state camera, the SwissRanger 2, is presented as an example, illustrating the possibility of manufacturing compact, robust and cost-effective ranging camera products for 3D imaging in real-time.
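The phase/offset/amplitude measurement behind this TOF principle can be worked through numerically with the standard four-bucket demodulation. The 20 MHz modulation frequency and the sample-phase convention below are assumptions for illustration, not values taken from the paper:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_demodulate(a0, a1, a2, a3, f_mod=20e6):
    """Recover phase, amplitude, offset and range from four samples of the
    RF-modulated light taken 90 degrees apart (classic 4-bucket TOF
    demodulation; sample k is offset + A*cos(phase + k*pi/2))."""
    phase = math.atan2(a3 - a1, a0 - a2) % (2 * math.pi)
    amplitude = math.hypot(a3 - a1, a0 - a2) / 2
    offset = (a0 + a1 + a2 + a3) / 4
    distance = C * phase / (4 * math.pi * f_mod)   # half the round trip
    return phase, amplitude, offset, distance

# synthesize the four samples for a target at 3 m (offset 10, amplitude 5)
f = 20e6
true_phase = 4 * math.pi * f * 3.0 / C
samples = [10 + 5 * math.cos(true_phase + k * math.pi / 2) for k in range(4)]
phase, amp, off, dist = tof_demodulate(*samples)
```

The demodulation recovers the 3 m range along with the amplitude and offset, and the unambiguous range at 20 MHz is c/(2f) ≈ 7.5 m, beyond which the phase wraps.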
Conference Paper
Full-text available
Recently, the acquisition of three-dimensional maps has become more and more popular. This is motivated by the fact that robots act in the three-dimensional world and several tasks, such as path planning or localizing objects, can be carried out more reliably using three-dimensional representations. In this paper we consider the problem of extracting planes from three-dimensional range data. In contrast to previous approaches, our algorithm uses a hierarchical variant of the popular Expectation Maximization (EM) algorithm [1] to simultaneously learn the main directions of the planar structures. These main directions are then used to correct the position and orientation of planes. In practical experiments carried out with real data and in simulations we demonstrate that our algorithm can accurately extract planes and their orientation from range data.
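A much-simplified, non-hierarchical EM for planes conveys the basic idea: the E-step soft-assigns points to planes by point-to-plane distance, the M-step refits each plane by weighted least squares. The paper's hierarchical main-direction learning is not reproduced, and all parameters here are illustrative:

```python
import numpy as np

def em_planes(points, n_planes=2, iters=20, sigma=0.05, seed=0):
    """Toy EM plane extraction on an (N, 3) point set; returns per-plane
    (centroids, unit normals) after `iters` EM iterations."""
    rng = np.random.default_rng(seed)
    cs = points[rng.choice(len(points), n_planes, replace=False)]
    ns = rng.normal(size=(n_planes, 3))
    ns /= np.linalg.norm(ns, axis=1, keepdims=True)
    for _ in range(iters):
        # E-step: responsibilities from point-to-plane distances
        d = np.stack([np.abs((points - c) @ n) for c, n in zip(cs, ns)])
        resp = np.exp(-0.5 * (d / sigma) ** 2) + 1e-12
        resp /= resp.sum(axis=0)
        # M-step: weighted least-squares refit of each plane
        for k in range(n_planes):
            w = resp[k]
            c = (w[:, None] * points).sum(0) / w.sum()
            cov = ((w[:, None] * (points - c)).T @ (points - c)) / w.sum()
            _, evec = np.linalg.eigh(cov)
            cs[k], ns[k] = c, evec[:, 0]   # smallest-variance direction
    return cs, ns

# two parallel floors at z = 0 and z = 1
rng = np.random.default_rng(3)
a = np.column_stack([rng.random((100, 2)), np.zeros(100)])
b = np.column_stack([rng.random((100, 2)), np.ones(100)])
cs, ns = em_planes(np.vstack([a, b]))
```

Like any EM variant, this toy version is sensitive to initialization, which is exactly the weakness the paper's hierarchical formulation and shared main directions are designed to mitigate.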
Conference Paper
Full-text available
This paper presents a preliminary study on the enhancement of the SwissRanger SR-3000's range images by a singular value decomposition filtering method. The image enhancement is performed by converting a conventional range image into an enhanced range image where each pixel's intensity embodies the surface normal and depth information of the corresponding pixel in the original image. This representation of the range image makes an object's edges distinctive. But it corrupts the object's surfaces due to the noise in range data. We propose a filtering method based on the Singular Value Decomposition to alleviate the surface corruption and preserve the edges. The efficacy of the proposed method is validated by numerous experiments in various environments.
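The low-rank filtering idea can be sketched with a plain truncated SVD on an image matrix (illustrative only; the paper's filter operates on normal-enhanced range images and chooses the retained rank more carefully):

```python
import numpy as np

def svd_filter(img, rank):
    """Reconstruct `img` from its `rank` largest singular components,
    suppressing small-singular-value noise while keeping dominant structure."""
    u, s, vt = np.linalg.svd(img, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

# rank-1 'range image' plus noise; a rank-1 reconstruction removes most noise
rng = np.random.default_rng(4)
clean = np.outer(np.linspace(1, 2, 32), np.linspace(1, 2, 32))
noisy = clean + 0.01 * rng.standard_normal((32, 32))
filtered = svd_filter(noisy, rank=1)
```

Because noise spreads its energy across all singular components while smooth surfaces concentrate theirs in a few, truncation denoises the surface; choosing the rank too low, however, would also blur genuine edges, which is the trade-off the paper studies.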
Conference Paper
Full-text available
This paper presents a characterization study of the Sick LMS 200 laser scanner. A number of parameters, such as operation time, data transfer rate, target surface properties, as well as the incidence angle, which may potentially affect the sensing performance, are investigated. A probabilistic range measurement model is built based on the experimental results. The paper also analyzes the mixed pixels problem of the scanner.
Conference Paper
Full-text available
In this paper the effect the environment has on the SwissRanger SR3000 time-of-flight camera is investigated. The accuracy of this camera is highly affected by the scene it is pointed at, such as its reflective properties, color, and gloss. The complexity of the scene also has considerable effects on the accuracy, for example the angle of the objects to the emitted light and the scattering effects of near objects. A general overview of such known inaccuracy factors is given, followed by experiments illustrating additional uncertainty factors. Specifically, we give a better description of how surface color intensity influences the depth measurement, and illustrate how multiple reflections influence the resulting depth measurement.
Article
This paper describes the design and calibration of a system that enables simultaneous recording of dynamic scenes with multiple high-resolution video and low-resolution Swissranger time-of-flight (TOF) depth cameras. The system shall serve as a testbed for the development of new algorithms for high-quality multi-view dynamic scene reconstruction and 3D video. The paper also provides a detailed analysis of random and systematic depth camera noise, which is important for reliable fusion of video and depth data. Finally, the paper describes how to compensate systematic depth errors and calibrate all dynamic depth and video data into a common frame.
Article
We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We have applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
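The generalized eigenvalue computation behind the normalized cut can be sketched for a two-way partition. Following the standard equivalence, (D - W)y = λDy is solved via the symmetric normalized Laplacian; the graph weights below are a toy example, not from the paper:

```python
import numpy as np

def ncut_bipartition(W):
    """Two-way normalized cut: threshold the second-smallest generalized
    eigenvector of (D - W) y = lambda * D y, computed through the
    symmetric normalized Laplacian D^{-1/2} (D - W) D^{-1/2}."""
    d = W.sum(axis=1)
    d_isqrt = 1.0 / np.sqrt(d)
    l_sym = np.eye(len(W)) - (d_isqrt[:, None] * W) * d_isqrt[None, :]
    vals, vecs = np.linalg.eigh(l_sym)      # eigenvalues ascending
    y = d_isqrt * vecs[:, 1]                # map back: y = D^{-1/2} z
    return y > 0                            # sign gives the bipartition

# two 3-node cliques joined by a single weak edge (weight 0.1)
W = np.array([
    [0.0, 1.0, 1.0, 0.1, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]])
labels = ncut_bipartition(W)
```

The second eigenvector changes sign exactly across the weak edge, splitting the graph into the two cliques, which is the behavior the normalized-cut criterion rewards: a small cut normalized by large within-group association.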