
Extraction of Planar Features from Swissranger SR-3000 Range Images by a Clustering Method Using Normalized Cuts

November 2009


DOI:10.1109/IROS.2009.5353952

Source
IEEE Xplore

Conference: Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on

Authors:

Cang Ye

Virginia Commonwealth University

This paper describes a new approach to extract planar features from 3D range data captured by a range imaging sensor, the SwissRanger SR-3000. The focus of this work is to segment vertical and horizontal planes from range images of indoor environments. The method first enhances a range image by using the surface normal information. It then partitions the Normal Enhanced Range Image (NERI) into a number of segments using the Normalized-Cuts (N-Cuts) algorithm. A least-square plane is fit to each segment and the fitting error is used to determine whether the segment is planar. From the resulting planar segments, each vertical or horizontal segment is labeled based on the normal of its least-square plane. A pair of vertical or horizontal segments is merged if they are neighbors. Through this region growing process, the vertical and horizontal planes are extracted from the range data. The proposed method has many applications in navigating mobile robots in indoor environments.





I. INTRODUCTION

Indoor environments commonly consist of regular structures such as stairways, hallways, and doorways. To operate efficiently in such conditions, it is important for a mobile robot to identify these structures and deal with them. For instance, a tracked robot may use the information to traverse steps and stairways. Extracting and recognizing these structures is also useful in building a symbolic map of the environment. A robot may use these structures as landmarks to navigate and localize itself. Since these indoor structures are often composed of planar surfaces, efficient planar feature extraction becomes an essential capability for the robot. A pattern recognition algorithm built on the plane extraction method may allow the robot to group the planar surfaces into structures and identify them based on their geometric constituents. For instance, a stairway can be characterized by alternating horizontal (tread) and vertical (riser) planes, and a floor can be characterized by a large horizontal plane. Extracting planar segments is also an important problem in range data processing in its own right, and it serves a number of purposes. For example, the information about where two planes intersect in 3D space can be used to extract prominent linear features such as corners in a room. These features are important for registering multiple scans of range data or for registering range data with 2D images.

Researchers have addressed the problem of planar feature extraction from range data either in the original input domain (3D point cloud) [1,2,3,4] or by representing the range data as an image [5,6]. Venable and Uijt de Haag [3] propose a so-called histogramming method for planar surface extraction for the SR-3000. The method first divides a range image equally into a number of sub-images. It then fits a least-square plane to the set of 3D data points belonging to each sub-image, and the plane with the smallest fitting error is chosen as a candidate planar feature. A histogram of the distances d from the rest of the data points to the candidate plane is computed. Data points clustered around d=0 and d=D in the histogram are classified as points in the candidate plane and points in a parallel plane, respectively. The advantage of the method is its real-time performance. The limitation is that it cannot be applied to a scenario where planes have multiple orientations (e.g., perpendicular planes). In addition, the set of data points in a sub-image with the minimum fitting error does not necessarily form a planar surface. Stamos and Allen’s method [4] identifies planar structures from 3D range data of a precision laser scanner by dividing the data into k×k patches and merging the planar patches based on plane-fitting statistics. A patch is classified as a locally planar patch if the plane-fitting error is below a threshold. Two locally planar patches are considered to be in the same planar surface if they have similar orientations and are close in 3D space. The plane-fitting based classification method is sensitive to the threshold value. In addition, it is not easy to determine an appropriate patch size, which is a trade-off between computational cost and the granularity of data segmentation.

Manuscript received March 1, 2009. This work was supported in part by NASA and the Arkansas Space Grant Consortium under grant UALR18800, by the NASA EPSCoR RID Award, and by a matching fund from the Arkansas Science and Technology Authority.
C. Ye is with the Department of Applied Science, University of Arkansas at Little Rock, Little Rock, AR 72204, USA (phone: 501-683-7284; fax: 501-569-8020; e-mail: cxye@ualr.edu).
G. Hegde is with the same department (e-mail: gmhegde@ualr.edu).

GuruPrasad M. Hegde and Cang Ye, Senior Member, IEEE

In this work we use the SwissRanger SR-3000 imaging sensor [7,8] because we are investigating the plane extraction problem for possible application to navigating a small mobile robot in indoor environments. In this case the SR-3000 has advantages over a LADAR: it has a much higher data throughput (25344 points per frame and up to 50 frames per second) and is much smaller in size (50×48×65 mm³). The SR-3000 also works well in featureless environments, which is a big advantage over a stereovision system. However, the SR-3000’s sensing technology is nascent and its range data has relatively large measurement errors (much bigger than those of a LADAR [9]) due to random noise (e.g., thermal noise, photon shot noise) and environmental factors (e.g., surface reflectivity). Previous research efforts [10,11] have demonstrated that a proper calibration process may reduce the errors in the SR-3000’s range data to a certain extent. However, it cannot eliminate the errors induced by random noise. In [12] the authors of this paper developed a Singular Value Decomposition (SVD) filter to deal with the noise in the Normal Enhanced Range Image (NERI) of the SR-3000. The SVD filter demonstrates some success in smoothing the surface. However, there is still a certain amount of corruption in the NERI. In such a case, a pixel-by-pixel region growing method cannot perform segmentation very well as it is susceptible to disturbance in local features. A global criterion is required to segment a NERI. The criterion must take into account both the dissimilarity between the segments and the total similarity within the segments (i.e., among image pixels).

In this paper we present a new range image segmentation method based on the Normalized Cuts (NC) method [13]. The NC method was originally proposed for the segmentation of intensity images, and it uses the total dissimilarity between groups and the total similarity within the groups to partition an image. It may group pixels inappropriately when an object is not distinctly dissimilar from the background. This problem may be alleviated in segmenting a range image, since additional metrics, such as the surface normal of the least-square plane fit to the data points of a segment and the fitting error, may be used to evaluate the correctness of the segmentation.

The remainder of this paper is organized as follows. In Section II we briefly describe the NC method. In Section III we explain our proposed method for extracting planar features from range data. In Section IV we present experimental results, followed by Section V where we discuss a recursive method to extract planar pixels from misclassified clusters. The paper is concluded in Section VI, where we discuss some directions for our future work.

II. IMAGE SEGMENTATION USING NORMALIZED CUTS

A. Image segmentation as a graph partitioning problem

Image segmentation can be modeled as a graph partitioning problem. An image is represented as a weighted undirected graph $G = (V, E)$ wherein each pixel is considered as a node $v_i \in V$ and an edge $e_{ij} \in E$ is formed between each pair of nodes. The weight for each edge is recorded in a Pixel Similarity Matrix (PSM) calculated as a function of similarity between each pair of nodes. In partitioning an image into various disjoint sets of pixels or segments $V_1, V_2, V_3, \ldots, V_m$, the goal is to maximize the similarity of nodes in a subset $V_i$ and minimize the similarity across different sets $V_j$. For the NC algorithm the optimal bipartition of a graph into two sub-graphs A and B is the one that minimizes the Ncut value given by:

$$Ncut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}, \tag{1}$$

where $cut(A, B) = \sum_{u \in A, v \in B} w(u, v)$ is the dissimilarity between A and B, and $w(i, j)$ is the weight calculated as a function of the similarity between nodes i and j. $assoc(A, V)$ is the total connection from nodes in A to all nodes in V. $assoc(B, V)$ is defined similarly. From (1) we can see that a high similarity among nodes in A and a low similarity across different sets A and B can be maintained by the minimization process. Given a partition of nodes that separates a graph V into two sets A and B, let $x$ be an $N = |V|$ dimensional indicator vector, $x_i = 1$ if node i is in A and $-1$ otherwise. Let $d_i = \sum_j w(i, j)$ be the total connection from node i to all other nodes. With the above definition, $Ncut(A, B)$ in (1) can be calculated. According to [13] an approximate discrete solution to minimize $Ncut(A, B)$ can be obtained by solving the following equation:

$$\min_x Ncut(x) = \min_y \frac{y^T (D - W) y}{y^T D y}, \tag{2}$$

where $D = diag(d_1, d_2, \ldots, d_N)$, $d_i = \sum_j w(i, j)$, $W = [w_{ij}]$, and $y = (1 + x) - \frac{\sum_{x_i > 0} d_i}{\sum_{x_i < 0} d_i}(1 - x)$. If the elements of $y$ are allowed to take real values ($y_i \in \mathbb{R}$, where $\mathbb{R}$ is the set of real numbers), then (2) can be minimized by solving the following generalized eigenvalue system:

$$(D - W) y = \lambda D y. \tag{3}$$
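As an illustration of equations (1) to (3), the following is a minimal NumPy sketch, not the implementation used in this work. It assumes a small dense, symmetric weight matrix `W` with positive row sums; the function names are hypothetical. The generalized system (3) is solved through the equivalent symmetric problem $D^{-1/2}(D - W)D^{-1/2} z = \lambda z$ with $y = D^{-1/2} z$, and the splitting point is found by trying each entry of the eigenvector as a threshold.

```python
import numpy as np

def ncut_value(W, in_a):
    """Ncut(A, B) of equation (1); in_a is a boolean mask marking the nodes of A."""
    in_b = ~in_a
    cut = W[np.ix_(in_a, in_b)].sum()   # cut(A, B)
    assoc_a = W[in_a, :].sum()          # assoc(A, V)
    assoc_b = W[in_b, :].sum()          # assoc(B, V)
    return cut / assoc_a + cut / assoc_b

def relaxed_ncut_vector(W):
    """Eigenvector of (D - W) y = lambda D y with the second smallest eigenvalue."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form of (3): D^{-1/2} (D - W) D^{-1/2} z = lambda z, y = D^{-1/2} z
    sym = (np.diag(d) - W) * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(sym)    # eigenvalues are returned in ascending order
    return d_inv_sqrt * eigvecs[:, 1]

def bipartition(W):
    """Try every entry of the relaxed solution as a splitting point; keep the best."""
    y = relaxed_ncut_vector(W)
    best_mask, best_val = None, np.inf
    for t in np.sort(y)[:-1]:
        mask = y > t
        if mask.all() or not mask.any():
            continue                    # skip degenerate splits
        val = ncut_value(W, mask)
        if val < best_val:
            best_mask, best_val = mask, val
    return best_mask, best_val
```

For an image of realistic size the matrix is large and sparse, so a sparse eigensolver would be needed in place of the dense `np.linalg.eigh` call used here.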

B. Grouping Algorithm

The grouping of pixels in an image I consists of the following steps:
a) Consider image I as an undirected graph $G = (V, E)$ and construct a PSM. As stated before, each element of the PSM is the weight of edge $w(i, j)$ and is calculated by

$$w(i, j) = \exp\left(-\frac{\|F(i) - F(j)\|_2^2}{\sigma_I^2}\right) \cdot \exp\left(-\frac{\|X(i) - X(j)\|_2^2}{\sigma_X^2}\right)$$

if $\|X(i) - X(j)\|_2 < r$, and $w(i, j) = 0$ otherwise. Here, $X(i)$ is the spatial location of node i, $F(i) = I(i)$ is the brightness value of pixel i, and $\sigma_I$ and $\sigma_X$ are the scaling parameters of [13]. It is noted that $w(i, j) = 0$ for any pair of nodes i, j that are more than r pixels apart. The rationale for calculating $w(i, j)$ in this manner is that two pixels that have similar brightness values and are spatially close are more likely to belong to the same object than two pixels that have different brightness values and are distant from each other.
b) Solve (3) for the eigenvectors with the smallest eigenvalues.
c) Use the eigenvector with the second smallest eigenvalue to bipartition the image by finding the splitting point such that its Ncut value is minimized.
d) Recursively re-partition the segments (go to step a).
e) Exit if the Ncut value for every segment is over some specified threshold.
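Step a) can be sketched as follows. This is an illustrative NumPy version that builds a dense PSM for a small gray image; the function name and the default values of `r`, `sigma_i`, and `sigma_x` are assumptions chosen for the example, not values taken from this paper or from the library in [15].

```python
import numpy as np

def pixel_similarity_matrix(image, r=2.0, sigma_i=0.1, sigma_x=4.0):
    """Dense PSM for a small 2-D gray image (parameter defaults are illustrative)."""
    h, w = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    # Spatial location X(i) of every pixel, in the same row-major order as image.ravel()
    X = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    F = image.ravel().astype(float)     # brightness F(i) = I(i)
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    feat2 = (F[:, None] - F[None, :]) ** 2
    W = np.exp(-feat2 / sigma_i ** 2) * np.exp(-dist2 / sigma_x ** 2)
    W[dist2 >= r ** 2] = 0.0            # w(i, j) = 0 unless ||X(i) - X(j)|| < r
    return W
```

A dense matrix is only practical for tiny images. A full SR-3000 frame has 25344 pixels, so the PSM would have to be stored in sparse form, which the cutoff radius r makes possible.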

III. THE PROPOSED METHOD

In this work we adopt the method in [14] for range image enhancement. The authors of this paper demonstrated in [12] that the use of surface normals in the SR-3000’s range images makes the surfaces and edges of an object more distinct. We first construct a tri-band color image where each pixel’s RGB values represent the x and y components of its surface normal and its depth information, respectively. The tri-band image is then converted to a gray image, which we call a Normal-Enhanced Range Image (NERI). The proposed segmentation method is divided into three steps. First, the NC algorithm is applied to the NERI and partitions it into a number of segments. The number of segments must be pre-specified in our current implementation. Second, the least-square plane to the data points in each segment is computed and the plane-fitting statistics are used to label the segments as planar or non-planar. Third, adjacent planar surfaces with the same orientation are merged.
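The construction of a NERI might be sketched as below. The text does not state how the three bands are scaled or how the color image is converted to gray, so this sketch makes explicit assumptions: the normal components are mapped linearly from [-1, 1] to [0, 1], the depth is min-max normalized, and the gray value uses the common luma weights (0.299, 0.587, 0.114). Per-pixel unit normals are taken as given input, and the function name is hypothetical.

```python
import numpy as np

def normal_enhanced_range_image(normals, depth):
    """normals: (H, W, 3) unit surface normals; depth: (H, W) range values."""
    nx = (normals[..., 0] + 1.0) / 2.0      # R band: x component, mapped to [0, 1]
    ny = (normals[..., 1] + 1.0) / 2.0      # G band: y component, mapped to [0, 1]
    span = depth.max() - depth.min()
    if span > 0:
        z = (depth - depth.min()) / span    # B band: normalized depth
    else:
        z = np.zeros_like(depth, dtype=float)
    tri_band = np.stack([nx, ny, z], axis=-1)
    # Assumed gray conversion; the weights actually used in this work are not stated
    gray = tri_band @ np.array([0.299, 0.587, 0.114])
    return tri_band, gray
```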

To simplify the description we only consider the extraction of vertical and horizontal planes in 3D space. As shown in Fig. 1, a vertical plane is defined as one whose normal direction is along the –Y axis (Fig. 1a) and a horizontal plane is one whose normal direction is along the Z axis (Fig. 1b).

Fig. 1. Diagram of vertical and horizontal planes: (a) vertical plane, (b) horizontal plane.

For the N segments resulting from the NC method, we need to identify those that best describe either vertical or horizontal planes. To do this we perform a Least-Square Plane (LSP) fit to the data points associated with each of the N segments and calculate the normal direction and the fitting error. Let the normal to the LSP be denoted by $\vec{N} = (n_x, n_y, n_z)$. The residual of the fit, also known as the Plane Fit Error (PFE), is computed by $\Delta = \frac{\sqrt{\sum_{k=1}^{P} d_k^2}}{P}$, where P denotes the number of pixels in the segment and $d_k$ is the distance between the kth data point $(x_k, y_k, z_k)$ and the LSP. The LSP is found by minimizing $\Delta$. The minimization can be obtained by the Singular Value Decomposition (SVD) method. First the following matrix is constructed using the data points of the segment:

$$M = \begin{bmatrix} x_1 - x_0 & y_1 - y_0 & z_1 - z_0 \\ x_2 - x_0 & y_2 - y_0 & z_2 - z_0 \\ \vdots & \vdots & \vdots \\ x_P - x_0 & y_P - y_0 & z_P - z_0 \end{bmatrix},$$

where $(x_0, y_0, z_0) = \left(\frac{1}{P}\sum_{k=1}^{P} x_k, \frac{1}{P}\sum_{k=1}^{P} y_k, \frac{1}{P}\sum_{k=1}^{P} z_k\right)$ is the centroid of the data points. Then the singular values $\lambda_1, \lambda_2, \lambda_3$ of M (the square roots of the eigenvalues of $M^T M$) and their corresponding right singular vectors are computed. It can be proven that $\vec{N}$ equals the singular vector corresponding to the minimum singular value $\lambda_{min} = \min(\lambda_1, \lambda_2, \lambda_3)$ and that $\Delta$ equals $\lambda_{min} / P$. The deviations of the normal direction from the Y and Z axes are computed by $\theta_y = \cos^{-1}(n_y)$ and $\theta_z = \cos^{-1}(n_z)$, respectively. The value of the PFE determines whether the segment forms a planar surface, while the values of $\theta_y$ and $\theta_z$ determine whether the plane is vertical or horizontal. Specifically, the data points in a segment whose PFE is sufficiently small form a planar surface, and the planar segment is vertical (horizontal) if the value of $\theta_y$ ($\theta_z$) is sufficiently close to 0°.
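A least-square plane fit of this kind can be sketched with NumPy's SVD routine. The sketch assumes that the PFE is the square root of the summed squared point-to-plane distances divided by the number of points P, which equals the smallest singular value of the centered data matrix divided by P. Because the sign of a singular vector is arbitrary, the angles are computed from the absolute values of the normal components; this is a choice made for the example, and the function name is hypothetical.

```python
import numpy as np

def fit_least_square_plane(points):
    """points: (P, 3) array. Returns (unit normal, PFE, theta_y, theta_z), angles in degrees."""
    num_points = points.shape[0]
    centroid = points.mean(axis=0)          # (x0, y0, z0)
    M = points - centroid                   # P x 3 matrix of centered points
    # Singular values come back in descending order; rows of vt are right singular vectors
    _, s, vt = np.linalg.svd(M, full_matrices=False)
    normal = vt[-1]                         # direction of least variance
    pfe = s[-1] / num_points                # sqrt(sum of d_k^2) / P under the stated assumption
    # abs() removes the sign ambiguity of the normal (an assumption of this sketch)
    theta_y = np.degrees(np.arccos(np.clip(abs(normal[1]), 0.0, 1.0)))
    theta_z = np.degrees(np.arccos(np.clip(abs(normal[2]), 0.0, 1.0)))
    return normal, pfe, theta_y, theta_z
```

For points sampled from a plane of constant z, the sketch returns a PFE near zero, theta_z near 0° and theta_y near 90°.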

Extracting an entire plane from the scene involves merging two or more planar segments. In our work, merging is performed only if two segments are neighbors. Figure 2 shows three typical configurations of two close-by segments. The common boundary between the two segments is highlighted in green. Two segments are considered neighbors if there exist at least two consecutive common points (i.e., points that belong to both segments and are continuous in space) on their boundaries. Thus the segments in Fig. 2a do not qualify as neighbors since they have a single common point, whereas the two segments in Fig. 2b or Fig. 2c are considered neighbors.

Fig. 2. Definition of neighboring segments; for simplicity the segments are drawn as rectangles: (a) two non-neighboring segments, (b) and (c) two neighboring segments.

Our proposed method for extracting vertical and horizontal planes in a range image is as follows:
1) Construct the NERI and apply the NC algorithm to partition the NERI into N segments $c_i$ for $i = 1, \ldots, N$.
2) The segments $c_i$ for $i = 1, \ldots, N$ that satisfy $\Delta < \delta$ are selected to form a set of planar segments $S = \{s_1, s_2, s_3, \ldots, s_n\}$, i.e., a segment with a PFE smaller than $\delta$ is taken as a planar surface. Here $n \le N$ and $\delta$ is a suitable threshold.
3) Each $s_j \in S$ is then labeled as vertical or horizontal based on the normal direction of its LSP.
4) A pair of vertical or horizontal segments is merged if they are neighbors.
5) Terminate the process when all the neighbors are merged.
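Steps 2) to 5) can be sketched as a labeling pass followed by a union-find merge. This is a simplified illustration under stated assumptions: segments are given as an integer label image, two segments count as neighbors when at least `min_shared` 4-connected pixel pairs straddle their boundary (the consecutiveness of the common points is not checked), and the thresholds `pfe_max` and `angle_max` as well as all function names are hypothetical.

```python
import numpy as np

def label_segment(pfe, theta_y, theta_z, pfe_max, angle_max):
    """Steps 2) and 3): classify one segment from its plane-fit statistics."""
    if pfe >= pfe_max:
        return "non-planar"
    if theta_y <= angle_max:
        return "vertical"
    if theta_z <= angle_max:
        return "horizontal"
    return "other"                      # planar, but neither vertical nor horizontal

def shared_boundary_counts(labels):
    """Count 4-connected pixel pairs that straddle each pair of segment ids."""
    counts = {}
    pairs = ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :]))
    for a, b in pairs:
        diff = a != b
        for p, q in zip(a[diff], b[diff]):
            key = (int(min(p, q)), int(max(p, q)))
            counts[key] = counts.get(key, 0) + 1
    return counts

def merge_segments(labels, kinds, min_shared=2):
    """Steps 4) and 5): merge neighboring segments of the same kind (union-find)."""
    parent = {s: s for s in kinds}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]   # path halving
            s = parent[s]
        return s

    for (p, q), n in shared_boundary_counts(labels).items():
        same_kind = kinds[p] == kinds[q] and kinds[p] in ("vertical", "horizontal")
        if n >= min_shared and same_kind:
            parent[find(p)] = find(q)
    return {s: find(s) for s in kinds}      # segment id -> id of its merged plane
```

Using union-find means the merge finishes in a single pass over the neighbor pairs, which corresponds to terminating once all neighbors are merged.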

IV. EXPERIMENTS AND RESULTS

We have validated our planar surface extraction method through experiments in various indoor environments that contain the most commonly occurring structures. As mentioned before, our current method requires a pre-specified number of clusters N. In all our experiments we use N=100, which is bigger than the actual number of planar segments each NERI contains. The reason for choosing such a big N is to ensure a correct segmentation. This can be demonstrated by the example in Fig. 3. The test scene is shown in Fig. 3a and its NERI is depicted in Fig. 3c. The NERI has 5 segments, which are hand-labeled in Fig. 3c. We now apply the NC algorithm to Fig. 3c using the actual number of segments (N = 5). The result is shown in Fig. 3d. We can observe in Fig. 3d that there is a misclassification in cluster 5, which contains two regions with different brightness. The two regions represent planar surfaces that are perpendicular to each other in 3D space. To avoid such a misclassification we need to assign N a number that is much bigger than the actual number of segments in a scene.

Fig. 3. Misclassification of the NC method using the exact number of segments of a scene: (a) Actual Scene, (b) Raw range image of (a), (c) NERI with labeled segments, (d) Segmentation results of the NC over the NERI.

In all our segmentation results hereafter, we label vertical and horizontal planes in blue and green, respectively. In the first experiment we consider an indoor scene with a stairway as depicted in Fig. 4a. The raw range image of the SR-3000 is shown in Fig. 4b and the NERI representation of the range data in Fig. 4c. The NC algorithm partitions the NERI into 100 segments as shown in Fig. 4d. Fig. 4e displays the selected planar segments. A token in blue indicates that the corresponding segment belongs to a vertical plane and a token in green means that the segment lies on a horizontal plane. Finally the extracted planes are shown in Fig. 4f. We can see that the majority of the planar surfaces are correctly extracted. We can also observe in Fig. 4f that region P (marked in red) is under-extracted because its adjacent regions (circled in yellow) are misclassified. As we can see from Fig. 4e, the pixels inside region A or B are classified as belonging to the same segment by the NC algorithm. However, each of these regions contains data points on different planar surfaces that are perpendicular to one another in 3D space. They are considered non-planar segments due to their large PFEs. It should be noted that the misclassification will not have a big impact on recognizing the stairway and guiding the robot. This is because a sufficient number of treads and risers are identified and the misclassification occurs at a location far away from the robot. We can also see that there are minor misclassifications at the right or left end of some extracted planar segments. Addressing them calls for future research that goes beyond the scope of this paper. Fig. 4g renders the segmentation result in a 3D point cloud where the unclassified points are represented in black.

The second experiment shows the plane extraction of a hallway. The result is depicted in Fig. 5. We can see that the floor, door, and wall regions have been properly extracted. Fig. 6 displays the result of our experiment in the lobby of the ETAS building at the University of Arkansas at Little Rock. The result demonstrates satisfactory planar feature extraction by our proposed method.

V. DISCUSSION

We have seen in the previous section that some segments (mainly in the boundary regions) fail to qualify as planar ones due to misclassification in the initial clustering phase. In this section we put forward a recursive approach that may identify planar pixels in a non-planar segment and hence can extract a plane in its entirety.

Fig. 4. Segmentation of the scene with a stairway: (a) Actual Scene, (b) Raw range image, (c) NERI of (b), (d) Initial clustering results from the NC, (e) Labeling of vertical and horizontal planar segments, (f) Segmentation results after merging the homogeneous segments in (e), (g) Segmented data shown in a 3D point cloud.

Consider Fig. 7a, which is a magnified view of region A in Fig. 4f. Our objective is to extract the part of A that belongs to the adjacent vertical plane P (Fig. 4f). To achieve this we apply our proposed method recursively as follows:
a) The NC algorithm is applied to A with N=2. This breaks A into two sub-segments.
b) For each sub-segment, compute the PFE and the normal of its LSP.
c) A sub-segment is then merged with P or discarded based on the criteria we set forth in Section III.
d) If none of the sub-segments is merged with P, the NC algorithm is applied to A again with N=N+1, i.e., the process is repeated with N=N+1.
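The control flow of steps a) to d) might look like the following sketch. The two callables are hypothetical placeholders: `partition(segment, n)` stands for an NC run that returns n sub-segments, and `can_merge(sub_segment)` stands for the PFE and normal test against the adjacent plane P. The upper bound `n_max` is an assumption added so that the loop always terminates; it is not part of the procedure described above.

```python
def recover_planar_part(segment, partition, can_merge, n_max=6):
    """Split a non-planar segment into more and more parts until some part can be merged.

    Returns (list of sub-segments that passed the merge test, number of parts used).
    """
    n = 2
    while n <= n_max:
        sub_segments = partition(segment, n)                 # step a)
        merged = [s for s in sub_segments if can_merge(s)]   # steps b) and c)
        if merged:
            return merged, n
        n += 1                                               # step d)
    return [], n_max
```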

In this example, no merger was made with N=2. With N=3 we obtained three sub-segments (C, D, E in Fig. 7a). Through steps b) and c), a larger segment (Fig. 7b) is formed with C joining P. From this we can see that a finer segmentation can be achieved by further splitting the non-planar segments that are adjacent to the extracted segments (blue and green segments in this case).

Fig. 7. (a) Enlarged view of a misclassified cluster, (b) Segmentation result after extracting the planar region from (a).

Currently, the proposed method is implemented in Matlab using the Normalized Cuts library [15]. As a consequence it does not run in real time. Efforts are being made to translate the code entirely into C/C++. We expect that this will significantly reduce the runtime.

VI. CONCLUSION AND FUTURE WORK

We have presented a method that may reliably extract the vertical and horizontal planes from the range images captured by the SwissRanger SR-3000. We use a split-and-merge approach to achieve this. In the proposed method, surface normal information is used to convert a range image into a NERI for image enhancement. To deal with surface normal errors caused by noise in range data we apply the NC algorithm over a NERI to get homogeneous segments. They are merged to form larger segments (horizontal and vertical planes) based on the LSP fitting statistics. Our method works efficiently without prior knowledge of the number of vertical or horizontal planes in a scene. We have also demonstrated that under-extraction of planes due to misclassification of non-planar segments can be resolved by further splitting the related segment and applying our method to the sub-segments. We have validated the method’s efficacy by real experiments in various indoor environments. Although the method is intended for the segmentation of flat surfaces, it can be adapted to handle non-flat cases as well. One possible approach is to assume that a non-flat surface comprises a number of planar segments with small changes in their orientations (normals). The merging of neighboring segments then takes place based on the rate of change of their normal directions.

Fig. 5. Segmentation of the scene with a hallway: (a) Actual Scene, (b) Range image from SR-3000, (c) NERI of (b), (d) Initial clustering results from NCCT, (e) Segmentation results after merging clusters, (f) Extracted points in 3D.

Fig. 6. Segmentation results of the scene with a lobby: (a) Actual Scene, (b) Range image from SR-3000, (c) NERI of (b), (d) Initial clustering results from NCCT, (e) Segmentation results after merging clusters, (f) Extracted points in 3D.

It should be noted that we use a gray image for the NERI in order to use the Normalized Cuts library. The mapping of a tri-band image pixel to a NERI pixel is many-to-one. If two non-parallel planes happen to have such orientations that their pixels have the same brightness in their NERI representations, the distinctness of the planes in the NERI will be very small. This may potentially add difficulty to the N-Cuts method. Fortunately, in this case the intersection of the planes (a straight line segment) has a different normal direction from both planes. This will likely help the N-Cuts method locate correct planar segments. In case the N-Cuts method fails, we will have to use the tri-band color image representation for NERIs and develop a new N-Cuts method to segment the color NERIs. We will carry out more experiments to test this point in our future work. Another direction for future research is to develop a method that may find the optimal number of clusters for the N-Cuts method. This might improve the execution time of our method in both the splitting and merging phases.

The method proposed in this paper can be employed by a mobile robot for autonomous navigation in indoor environments.

REFERENCES

[1] R. Unnikrishnan and M. Hebert, “Robust extraction of multiple structures from non-uniformly sampled data,” in Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003, pp. 1322-1329.
[2] R. Triebel, W. Burgard, and F. Dellaert, “Using hierarchical EM to extract planes from 3d range scans,” in Proc. IEEE International Conference on Robotics and Automation, 2005, pp. 4437-4442.
[3] D. Venable and M. Uijt de Haag, “Near real-time extraction of planar features from 3d flash-ladar video frames,” in Proc. SPIE, vol. 6977, Optical Pattern Recognition, 2008, pp. 69770N-69770N-12.
[4] I. Stamos and P. E. Allen, “3-d model construction using range and image data,” in Proc. IEEE International Conference on Computer Vision and Pattern Recognition, 2000, pp. 531-536.
[5] A. Bab-Hadiashar and N. Gheissari, “Range image segmentation using surface selection criterion,” IEEE Transactions on Image Processing, vol. 15, no. 7, pp. 2006-2018, 2006.
[6] A. D. Sappa, “Automatic extraction of planar projections from panoramic range images,” in Proc. 2nd International Symposium on 3D Data Processing, Visualization and Transmission, 2004, pp. 231-234.
[7] T. Oggier, et al., “An all-solid-state optical range camera for 3D real-time imaging with sub-centimeter depth resolution,” in Proc. SPIE, vol. 5249, 2003, pp. 534-545.
[8] T. Oggier, B. Büttgen, and F. Lustenberger, “SwissRanger SR3000 and first experiences based on miniaturized 3D-TOF Cameras,” Swiss Center for Electronics and Microtechnology (CSEM) Technical Report, 2005.
[9] C. Ye and J. Borenstein, “Characterization of a 2-D laser scanner for mobile robot obstacle negotiation,” in Proc. IEEE International Conference on Robotics and Automation, 2002, pp. 2512-2518.
[10] S. A. Guðmundsson, H. Aanæs, and R. Larsen, “Environmental effects on measurement uncertainties of time-of-flight cameras,” in International Symposium on Signals, Circuits and Systems, 2007, pp. 1-4.
[11] Y. M. Kim, D. Chan, C. Theobalt, and S. Thrun, “Design and calibration of a multi-view TOF sensor fusion system,” in Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008, pp. 1-7.
[12] G. M. Hegde and C. Ye, “Swissranger sr-3000 range images enhancement by a singular value decomposition filter,” in Proc. IEEE International Conference on Information and Automation, 2008, pp. 1626-1631.
[13] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, 2000.
[14] K. Pulli, “Vision methods for an autonomous machine based on range imaging,” Master’s Thesis, University of Oulu.
[15] http://www.cis.upenn.edu/~jshi/software/

SMC14 Color

Data

Full-text available

Sep 2014

ToF cameras for active vision in robotics

Article

Full-text available

Oct 2014
SENSOR ACTUAT A-PHYS

ToF cameras are now a mature technology that is widely being adopted to provide sensory input to robotic applications. Depending on the nature of the objects to be perceived and the viewing distance, we distinguish two groups of applications: those requiring to capture the whole scene and those centered on an object. It will be demonstrated that it is in this last group of applications, in which the robot has to locate and possibly manipulate an object, where the distinctive characteristics of ToF cameras can be better exploited. After presenting the physical sensor features and the calibration requirements of such cameras, we review some representative works highlighting for each one which of the distinctive ToF characteristics have been more essential. Even if at low resolution, the acquisition of 3D images at frame-rate is one of the most important features, as it enables quick background/foreground segmentation. A common use is in combination with classical color cameras. We present three developed applications, using a mobile robot and a robotic arm, to exemplify with real images some of the stated advantages.

NCC-RANSAC: A fast plane extraction method for 3-D range data segmentation

Article

Full-text available

Apr 2014

This paper presents a new plane extraction (PE) method based on the random sample consensus (RANSAC) approach. The generic RANSAC-based PE algorithm may over-extract a plane, and it may fail in case of a multistep scene where the RANSAC procedure results in multiple inlier patches that form a slant plane straddling the steps. The CC-RANSAC PE algorithm successfully overcomes the latter limitation if the inlier patches are separate. However, it fails if the inlier patches are connected. A typical scenario is a stairway with a stair wall where the RANSAC plane-fitting procedure results in inliers patches in the tread, riser, and stair wall planes. They connect together and form a plane. The proposed method, called normal-coherence CC-RANSAC (NCC-RANSAC), performs a normal coherence check to all data points of the inlier patches and removes the data points whose normal directions are contradictory to that of the fitted plane. This process results in separate inlier patches, each of which is treated as a candidate plane. A recursive plane clustering process is then executed to grow each of the candidate planes until all planes are extracted in their entireties. The RANSAC plane-fitting and the recursive plane clustering processes are repeated until no more planes are found. A probabilistic model is introduced to predict the success probability of the NCC-RANSAC algorithm and validated with real data of a 3-D time-of-flight camera-SwissRanger SR4000. Experimental results demonstrate that the proposed method extracts more accurate planes with less computational time than the existing RANSAC-based methods.

NCC-RANSAC: A fast plane extraction method for navigating a smart cane for the visually impaired

Conference Paper

Full-text available

Aug 2013

This paper presents a new RANSAC-based method for extracting planes from 3D range data. The generic RANSAC Plane Extraction (PE) method may over-extract a plane. It may fail in the case of a multi-step scene where the RANSAC process results in multiple inlier patches that form a slant plane straddling the steps. The CC-RANSAC algorithm overcomes the latter limitation if the inlier patches are separate. However, it fails when the inlier patches are connected. A typical scenario is a stairway with a stairwall. In this case the RANSAC plane-fitting produces inlier patches (in the tread, riser and stairwall planes) that connect together to form a plane. The proposed method, called NCC-RANSAC, performs a normal-coherence check on all data points of the inlier patches and removes those points whose normal directions are contradictory to that of the fitted plane. This procedure results in a set of separate inlier patches, each of which is then extended into a plane in its entirety by a recursive plane clustering process. The RANSAC plane-fitting and recursive plane clustering processes are repeated until no more planes are found. A probabilistic model is introduced to predict the success probability of the NCC-RANSAC method and validated with real data from a 3D camera, the SwissRanger SR4000. Experimental results demonstrate that the proposed method extracts more accurate planes with less computational time than the existing RANSAC-based methods. The proposed method is intended to be used by a robotic navigational device for the visually impaired for object detection/recognition in indoor environments.

Object segmentation and classification using 3-D range camera

Article

Jan 2014
J VIS COMMUN IMAGE R

Robot Guidance Using Machine Vision Techniques in Industrial Environments: A Comparative Review

Article

Full-text available

Mar 2016
SENSORS-BASEL

In the factory of the future, most of the operations will be done by autonomous robots that need visual feedback to move around the working space avoiding obstacles, to work collaboratively with humans, to identify and locate the working parts, to complete the information provided by other sensors to improve their positioning accuracy, etc. Different vision techniques, such as photogrammetry, stereo vision, structured light, time of flight and laser triangulation, among others, are widely used for inspection and quality control processes in the industry and now for robot guidance. Choosing which type of vision system to use is highly dependent on the parts that need to be located or measured. Thus, in this paper a comparative review of different machine vision techniques for robot guidance is presented. This work analyzes accuracy, range and weight of the sensors, safety, processing time and environmental influences. Researchers and developers can use it as background information for their future work.

Planar Segment Based Three-dimensional Point Cloud Registration in Outdoor Environments

Article

Jul 2013
J FIELD ROBOT

We present an odometry-free three-dimensional (3D) point cloud registration strategy for outdoor environments based on area attributed planar patches. The approach is split into three steps. The first step is to segment each point cloud into planar segments, utilizing a cached-octree region growing algorithm, which does not require the 2.5D image-like structure of organized point clouds. The second step is to calculate the area of each segment based on small local faces inspired by the idea of surface integrals. The third step is to find segment correspondences between overlapping point clouds using a search algorithm, and compute the transformation from determined correspondences. The transformation is searched globally so as to maximize a spherical correlation-like metric by enumerating solutions derived from potential segment correspondences. The novelty of this step is that only the area and plane parameters of each segment are employed, and no prior pose estimation from other sensors is required. Four datasets have been used to evaluate the proposed approach, three of which are publicly available and one that stems from our custom-built platform. Based on these datasets, the following evaluations have been done: segmentation speed benchmarking, segment area calculation accuracy and speed benchmarking, processing data acquired by scanners with different fields of view, comparison with the iterative closest point algorithm, robustness with respect to occlusions and partial observations, and registration accuracy compared to ground truth. Experimental results confirm that the approach offers an alternative to state-of-the-art algorithms in plane-rich environments.

Fast incremental 3D plane extraction from a collection of 2D line segments for 3D mapping

Conference Paper

Oct 2012
Rep U S

Three-dimensional (3D) data processing has recently acquired greater importance in solving complex tasks such as object recognition, environment modeling, and robotic mapping and localization. Since using raw 3D data without preprocessing is very time-consuming, extraction of geometric features that describe the environment concisely is essential. A plane is a suitable geometric feature due to its richness and simplicity of extraction. This paper presents an online incremental plane extraction method using line segments. Since our system is based on a nodding laser scanner, we exploit the incremental nature of data acquisition in which physical rotation and algorithm implementation are conducted in parallel. In contrast to other plane extraction methods, line segments defined by two end points become supporting elements that comprise a plane, so we need not handle all the scan points once the line segments are extracted from each scan slice. This reduces the algorithm's complexity and the computation time. Experimental validation and comparison with a state-of-the-art method were conducted using tens of complete scan data sets acquired from a typical indoor environment.

A study on plane extraction from distance images using 3D Hough transform

Conference Paper

Full-text available

Nov 2012

Studies have been conducted on detecting object planes in 3D using a 3D distance measurement camera such as the SwissRanger SR-4000. In those studies, objects in the gray-scale range image from the camera are first emphasized. The emphasized image is then grouped according to the depth information, and the plane parameters are calculated using the least-squares method. The emphasis step is sufficiently sensitive, but the ambiguity of the gray-scale range image causes erroneous detections. To address this problem, a new plane detection method using the 3D Hough transform is proposed. Experiments show that the proposed method is effective and gives better results than the previous methods.
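As a rough illustration of how a 3D Hough transform can vote for planes, the sketch below parameterizes a plane by two angles and a signed distance and accumulates one vote per point per orientation. The parameterization, bin sizes and the name `hough_planes` are assumptions made for this example; the abstract does not specify the paper's actual discretization.

```python
import numpy as np

def hough_planes(points, n_theta=36, n_phi=18, rho_step=0.05, rho_max=5.0):
    """Return (normal, rho, votes) of the strongest plane n . p = rho.

    Orientation is sampled on the upper hemisphere (rho is signed), and all
    bin counts and ranges are illustrative defaults.
    """
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    phis = np.linspace(0.0, np.pi / 2.0, n_phi)
    th, ph = np.meshgrid(thetas, phis, indexing="ij")
    normals = np.stack(
        [np.cos(th) * np.sin(ph), np.sin(th) * np.sin(ph), np.cos(ph)], axis=-1
    )  # shape (n_theta, n_phi, 3)

    # Signed distance of every point along every candidate normal.
    rho = normals @ points.T  # shape (n_theta, n_phi, N)
    n_rho = int(round(2.0 * rho_max / rho_step)) + 1
    bins = np.round((rho + rho_max) / rho_step).astype(int)
    valid = (bins >= 0) & (bins < n_rho)

    # One vote per (orientation, distance bin) for each point.
    acc = np.zeros((n_theta, n_phi, n_rho), dtype=int)
    ti, pj, nk = np.nonzero(valid)
    np.add.at(acc, (ti, pj, bins[ti, pj, nk]), 1)

    best = np.unravel_index(np.argmax(acc), acc.shape)
    return normals[best[0], best[1]], best[2] * rho_step - rho_max, int(acc[best])
```

Extracting several planes would typically mean removing the points of the winning cell and voting again; that loop is omitted here to keep the sketch short.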

Three-dimensional point cloud plane segmentation in both structured and unstructured environments

Article

Dec 2013
ROBOT AUTON SYST

This paper focuses on three-dimensional (3D) point cloud plane segmentation. Two complementary strategies are proposed for different environments, i.e., a subwindow-based region growing (SBRG) algorithm for structured environments, and a hybrid region growing (HRG) algorithm for unstructured environments. The point cloud is decomposed into subwindows first, using the points’ neighborhood information when they are scanned by the laser range finder (LRF). Then, the subwindows are classified as planar or nonplanar based on their shape. Afterwards, only planar subwindows are employed in the former algorithm, whereas both kinds of subwindows are used in the latter. In the growing phase, planar subwindows are investigated directly (in both algorithms), while each point in nonplanar subwindows is investigated separately (only in HRG). During region growing, plane parameters are computed incrementally when a subwindow or a point is added to the growing region. This incremental methodology makes the plane segmentation fast. The algorithms have been evaluated using real-world datasets from both structured and unstructured environments. Furthermore, they have been benchmarked against a state-of-the-art point-based region growing (PBRG) algorithm with regard to segmentation speed. According to the results, SBRG is 4 and 9 times faster than PBRG when the subwindow size is set to 3×3 and 4×4 respectively; HRG is 4 times faster than PBRG when the subwindow size is set to 4×4. Open-source code for this paper is available at https://github.com/junhaoxiao/TAMS-Planar-Surface-Based-Perception.git.
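One common way to compute plane parameters incrementally, as this abstract describes, is to keep running sums from which the covariance can be rebuilt in constant time. The sketch below shows that idea; the class name `IncrementalPlane` and its interface are hypothetical and are not taken from the authors' open-source code.

```python
import numpy as np

class IncrementalPlane:
    """Running least-squares plane that can absorb single points or sub-windows."""

    def __init__(self):
        self.n = 0
        self.s1 = np.zeros(3)        # sum of points
        self.s2 = np.zeros((3, 3))   # sum of outer products p p^T

    def add(self, points):
        """Add one point (shape (3,)) or a batch of points (shape (k, 3))."""
        pts = np.atleast_2d(np.asarray(points, dtype=float))
        self.n += len(pts)
        self.s1 += pts.sum(axis=0)
        self.s2 += pts.T @ pts

    def plane(self):
        """Return (unit normal, d, mse) for the plane normal . p = d."""
        mean = self.s1 / self.n
        cov = self.s2 / self.n - np.outer(mean, mean)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        normal = eigvecs[:, 0]                  # direction of least variance
        # Clamp tiny negative values caused by floating-point cancellation.
        return normal, float(normal @ mean), float(max(eigvals[0], 0.0))
```

Each update costs the same regardless of how many points the region already holds, which is the property that makes region growing fast. Accumulating raw sums can lose precision for coordinates far from the origin, so a practical version might subtract a fixed reference point first.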

SwissRanger SR3000 and First Experiences based on Miniaturized 3D-TOF Cameras

Article

Full-text available

Jan 2005

The latest development in range imaging 3D-cameras is presented, the so-called SwissRanger3000 (SR3000). The SR3000 includes a modularly built electronics stack. This modular stack allows for easy adjustments to the camera for specific customer requirements. The core of the SR3000 is comprised of a background suppressing 3D-sensor with QCIF resolution. Furthermore, the first evaluation results of field-tests using a 3D-camera developed at CSEM are presented. The outcome of the evaluation phase for the automotive industry is discussed. A new development using the camera as detecting device for a virtual interaction on a large screen has successfully been implemented. Finally, the latest breakthrough in distance accuracy enables the 3D-TOF camera to enter the biometrics markets.

Near real-time extraction of planar features from 3D flash-ladar video frames

Article

Full-text available

Mar 2008
Proceedings of SPIE

This paper describes a novel method used to extract planar surfaces from a stream of 3D images in near real-time. The method currently operates on 3D images acquired from a MESA SwissRanger SR-3000 infrared time of flight camera, which operates in a manner similar to flash-ladar sensors; the camera provides the user with range and intensity value for each pixel in the 176 by 144 image frame. After application of the camera calibration the range measurement associated with each pixel can be converted to a Cartesian coordinate. First, the proposed method splits the focal image plane into sub-images or sub-windows. The method then operates in the 3D parameter space to find an estimate of the planar equation best describing the point cloud associated with the window pixels and to compute a metric that defines how well the sub-window points fit to the planar estimate. The best fit sub-window is then used as an initialization to one of two investigated methods: a parameter based search technique and cluster validation using histogram thresholding to extract the entire plane from the 3D image frame. Once a plane is extracted, a feature vector describing that plane along with its descriptive statistics can be generated. These feature vectors can then be used to enable feature-based navigation. The paper will fully describe the feature extraction method and will provide application results of this method to extract features from indoor 3D video data obtained with the MESA SwissRanger SR-3000. Also provided is a brief overview of the generation of feature statistics and their importance.
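The sub-window step can be sketched as follows: split the organized point image into fixed-size windows, fit a least-squares plane to each, and keep the window with the smallest residual. The window size, the RMS residual as the fit metric, and the name `best_fit_subwindow` are assumptions for illustration; the abstract does not state which metric the paper uses.

```python
import numpy as np

def best_fit_subwindow(xyz, win=16):
    """Find the sub-window whose points are best described by a plane.

    xyz : (H, W, 3) array of Cartesian points, one per pixel
    win : side length of the square sub-windows (illustrative default)
    Returns (row, col, normal, d, rms) for the plane normal . p = d.
    """
    height, width, _ = xyz.shape
    best = None
    for r in range(0, height - win + 1, win):
        for c in range(0, width - win + 1, win):
            pts = xyz[r:r + win, c:c + win].reshape(-1, 3)
            centroid = pts.mean(axis=0)
            # The smallest singular value measures out-of-plane scatter.
            _, s, vt = np.linalg.svd(pts - centroid, full_matrices=False)
            rms = s[-1] / np.sqrt(len(pts))
            if best is None or rms < best[4]:
                normal = vt[-1]
                best = (r, c, normal, float(normal @ centroid), float(rms))
    return best
```

The returned window would then seed whichever plane-growing strategy is used to recover the full plane from the frame.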

An all-solid-state optical range camera for 3D real-time imaging with sub-centimeter depth resolution (SwissRanger TM)

Article

Full-text available

Feb 2004
Proceedings of SPIE

A new miniaturized camera system that is capable of 3-dimensional imaging in real-time is presented. The compact imaging device is able to entirely capture its environment in all three spatial dimensions. It reliably and simultaneously delivers intensity data as well as range information on the objects and persons in the scene. The depth measurement is based on the time-of-flight (TOF) principle. A custom solid-state image sensor allows the parallel measurement of the phase, offset and amplitude of a radio frequency (RF) modulated light field that is emitted by the system and reflected back by the camera surroundings without requiring any mechanical scanning parts. In this paper, the theoretical background of the implemented TOF principle is presented, together with the technological requirements and detailed practical implementation issues of such a distance measuring system. Furthermore, the schematic overview of the complete 3D-camera system is provided. The experimental test results are presented and discussed. The present camera system can achieve sub-centimeter depth resolution for a wide range of operating conditions. A miniaturized version of such a 3D-solid-state camera, the SwissRanger TM 2, is presented as an example, illustrating the possibility of manufacturing compact, robust and cost effective ranging camera products for 3D imaging in real-time.
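The relation between the measured phase of the modulated light and distance can be made concrete with a short sketch. It assumes the widely used four-sample demodulation scheme with the signal model written in the docstring; real sensors may use a different sampling order or sign convention, and the 20 MHz modulation frequency mentioned below is an assumed example value.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def demodulate(c0, c1, c2, c3):
    """Recover phase, amplitude and offset from four samples per period.

    Assumed signal model: c_k = offset + amplitude * cos(phase - k * pi / 2).
    """
    phase = np.arctan2(c1 - c3, c0 - c2) % (2.0 * np.pi)
    amplitude = 0.5 * np.hypot(c1 - c3, c0 - c2)
    offset = 0.25 * (c0 + c1 + c2 + c3)
    return phase, amplitude, offset

def phase_to_distance(phase, f_mod):
    """Light travels to the target and back, hence the factor 4*pi, not 2*pi."""
    return C * phase / (4.0 * np.pi * f_mod)

def unambiguous_range(f_mod):
    """Distance at which the phase wraps around by a full 2*pi."""
    return C / (2.0 * f_mod)
```

For an assumed modulation frequency of 20 MHz, `unambiguous_range(20e6)` gives roughly 7.5 m, beyond which distances alias back into the measurable interval.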

Using Hierarchical EM to Extract Planes from 3D Range Scans

Conference Paper

Full-text available

May 2005

Recently, the acquisition of three-dimensional maps has become more and more popular. This is motivated by the fact that robots act in the three-dimensional world and several tasks such as path planning or localizing objects can be carried out more reliably using three-dimensional representations. In this paper we consider the problem of extracting planes from three-dimensional range data. In contrast to previous approaches our algorithm uses a hierarchical variant of the popular Expectation Maximization (EM) algorithm [1] to simultaneously learn the main directions of the planar structures. These main directions are then used to correct the position and orientation of planes. In practical experiments carried out with real data and in simulations we demonstrate that our algorithm can accurately extract planes and their orientation from range data.

SwissRanger SR-3000 Range Images Enhancement by a Singular Value Decomposition Filter

Conference Paper

Full-text available

Jul 2008

This paper presents a preliminary study on the enhancement of the SwissRanger SR-3000's range images by a singular value decomposition filtering method. The image enhancement is performed by converting a conventional range image into an enhanced range image where each pixel's intensity embodies the surface normal and depth information of the corresponding pixel in the original image. This representation of range image makes an object's edges distinctive. But it corrupts the object's surfaces due to the noise in range data. We propose a filtering method based on the Singular Value Decomposition to alleviate the surface corruption and preserve the edges. The efficacy of the proposed method is validated by numerous experiments in various environments.

Characterization of a 2-D Laser Scanner for Mobile Robot Obstacle Negotiation.

Conference Paper

Full-text available

Jan 2002

This paper presents a characterization study of the Sick LMS 200 laser scanner. A number of parameters, such as operation time, data transfer rate, target surface properties, as well as the incidence angle, which may potentially affect the sensing performance, are investigated. A probabilistic range measurement model is built based on the experimental results. The paper also analyzes the mixed pixels problem of the scanner.

Environmental Effects on Measurement Uncertainties of Time-of-Flight Cameras

Conference Paper

Full-text available

Aug 2007

In this paper the effect the environment has on the SwissRanger SR3000 time-of-flight camera is investigated. The accuracy of this camera is highly affected by the scene it is pointed at, for example by the reflective properties, color and gloss of its surfaces. The complexity of the scene also has considerable effects on the accuracy, for instance through the angle of the objects to the emitted light and the scattering effects of near objects. In this paper a general overview of such known inaccuracy factors is given, followed by experiments illustrating the additional uncertainty factors. Specifically, we give a better description of how a surface's color intensity influences the depth measurement, and illustrate how multiple reflections influence the resulting depth measurement.

An All-solid-state Optical Range Camera for 3D Real-time Imaging with Sub-centimeter Depth Resolutio

Article

Design and Calibration of a Multi-view TOF Sensor Fusion System

Article

Jun 2008

This paper describes the design and calibration of a system that enables simultaneous recording of dynamic scenes with multiple high-resolution video and low-resolution Swissranger time-of-flight (TOF) depth cameras. The system shall serve as a testbed for the development of new algorithms for high-quality multi-view dynamic scene reconstruction and 3D video. The paper also provides a detailed analysis of random and systematic depth camera noise which is important for reliable fusion of video and depth data. Finally, the paper describes how to compensate systematic depth errors and calibrate all dynamic depth and video data into a common frame.

Normalized Cuts and Image Segmentation

Article

Jan 1997

We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We have applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
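The generalized eigenvalue formulation mentioned in this abstract, (D - W) y = λ D y, can be sketched for a two-way cut as below. The Gaussian intensity affinity, the fully connected graph, and the split at zero are simplifying assumptions for illustration; the helper names `intensity_affinity` and `ncut_bipartition` are hypothetical.

```python
import numpy as np

def intensity_affinity(values, sigma=0.3):
    """Fully connected affinity from intensity differences (illustrative choice)."""
    diff = values[:, None] - values[None, :]
    return np.exp(-(diff ** 2) / sigma ** 2)

def ncut_bipartition(W):
    """Two-way normalized cut from a symmetric, non-negative affinity matrix W.

    Solves (D - W) y = lambda * D y through the equivalent symmetric problem
    D^{-1/2} (D - W) D^{-1/2} z = lambda * z with y = D^{-1/2} z.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]       # second-smallest eigenvector
    # Splitting at zero is one simple choice; the sign of y is arbitrary.
    return y > 0
```

For images of realistic size the affinity matrix is usually made sparse by connecting only nearby pixels, and a sparse eigensolver replaces the dense `eigh` call used here.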
