Stereo correspondence with slanted surfaces: critical implications of horizontal
slant
Abhijit S. Ogale and Yiannis Aloimonos
Center for Automation Research, University of Maryland, College Park, MD 20742
{ogale, yiannis}@cfar.umd.edu
Abstract
We examine the stereo correspondence problem in the
presence of slanted scene surfaces. In particular, we high-
light a previously overlooked geometric fact: a horizontally
slanted surface (i.e. having depth variation in the direction
of the separation of the two cameras) will appear horizon-
tally stretched in one image as compared to the other image.
Thus, while corresponding two images, N pixels on a scan-
line in one image may correspond to a different number of
pixels M in the other image. This leads to three important
modifications to existing stereo algorithms: (a) due to un-
equal sampling, intensity matching metrics such as the pop-
ular Birchfield-Tomasi procedure must be modified, (b) un-
equal numbers of pixels in the two images must be allowed
to correspond to each other, and (c) the uniqueness con-
straint, which is often used for detecting occlusions, must
be changed to a 3D uniqueness constraint. This paper dis-
cusses these new constraints and provides a simple scanline
based matching algorithm for illustration. We experimen-
tally demonstrate test cases where existing algorithms fail,
and how the incorporation of these new constraints provides
correct results. Experimental comparisons of the scanline
based algorithm with standard data sets are also provided.
1. Introduction
The dense stereo correspondence problem consists of
finding a mapping between the points in two images of a
scene. If the images have been rectified, then a point S in one image may correspond to a point S′ in the other image, where S and S′ lie on the same horizontal scanline. The difference in the horizontal positions of S and S′ is termed the horizontal disparity. In this paper, we assume that we are
dealing with a rectified pair of images.
1.1. Previous work
There exists a considerable body of work on the dense
stereo correspondence problem. Scharstein and Szeliski
[19] have provided an exhaustive comparison of dense
stereo correspondence algorithms. Most algorithms gen-
erally utilize local measurements such as image intensity
(or color) and phase, and aggregate information from mul-
tiple pixels using smoothness constraints. The simplest
method of aggregation is to minimize the matching error
within rectangular windows of fixed size [16]. Better ap-
proaches utilize multiple windows [8, 7], adaptive win-
dows [10] which change their size in order to minimize the
error, shiftable windows [4, 21], or predicted windows [14],
all of which give performance improvements at discontinu-
ities.
Global approaches to solving the stereo correspondence
problem rely on the extremization of a global cost func-
tion or energy. The energy functions which are used in-
clude terms for local property matching (‘data term’), ad-
ditional smoothness terms, and in some cases, penalties for
occlusions. Depending on the form of the energy function,
the most efficient energy minimization scheme can be cho-
sen. These include dynamic programming [15], simulated
annealing [9, 1], relaxation labeling [20], non-linear diffu-
sion [18], maximum flow [17] and graph cuts [5, 11]. Max-
imum flow and graph cut methods provide better computa-
tional efficiency than simulated annealing for energy func-
tions which possess a certain set of properties. Some of
these algorithms treat the images symmetrically and explic-
itly deal with occlusions (e.g. [11]). The uniqueness con-
straint [13] is often used to find regions of occlusion. Egnal
and Wildes [6] provide comparisons of various approaches
for finding occlusions.
Recently, some algorithms [3] have explicitly incorpo-
rated the estimation of slant while performing the estima-
tion of horizontal disparity. Lin and Tomasi [12] explicitly
model the scene using smooth surface patches and also find
occlusions; they initialize their disparity map with integer
disparities obtained using graph cuts, after which surface
fitting and segmentation are performed repeatedly.
1.2. Our approach
We explicitly examine the stereo correspondence prob-
lem in the presence of horizontally slanted scene surfaces.
In particular, we lay emphasis on the following geometric
effect: a horizontally slanted surface (i.e. having depth vari-
ation in the direction of the separation of the two cameras)
will appear horizontally stretched in one image as compared
to the other image. Thus, when we correspond two images,
N pixels on a scanline in one image must be allowed to cor-
respond with a different number of pixels M in the other
image. Furthermore, it is evident that the intensity function
on the true horizontally slanted scene surface is sampled
differently by the two cameras, which is another low-level effect that needs to be dealt with. Also, the uniqueness
constraint, which is often used to find occlusions by forc-
ing a one-to-one correspondence between pixels, is not true
for horizontally slanted surfaces, since an N-to-M correspon-
dence is possible. Hence, the uniqueness constraint must
be reformulated in terms of scene visibility in the presence
of horizontally slanted surfaces. In Section 2, we exam-
ine the above ideas and underscore the need for the treat-
ment of horizontal slant in the first stage of any stereo al-
gorithm during disparity estimation itself, rather than as a
post-processing or a feedback step. For the sake of illustra-
tion, we present a simple scanline based algorithm in Sec-
tion 3 which makes use of these constraints, and provide
experimental comparisons with existing algorithms using
standard data sets in Section 4.
Figure 1. (Left) Unequal projection lengths of a horizontally slanted line. (Right) Equal projection lengths of a fronto-parallel line.
Figure 2. Sampling problem for a horizontally slanted line.
2. Effects of Horizontal Slant
2.1. Unequal projection lengths
Using a 1D camera, Figure 1 shows on the left how a horizontally slanted line AB in the scene projects onto the line segment a1b1 in camera C1, and a2b2 in camera C2. Clearly, the lengths of a1b1 and a2b2 are not equal. Assume that the cameras have focal length equal to 1. Let the point A have coordinates (X_A, Z_A) in space with respect to camera 1, and point B have coordinates (X_B, Z_B), where the X-axis is along the scanline and the Z-axis is normal to the scanline. Then, if the cameras are separated by a translation t, we can immediately find the lengths L_1 and L_2 of the projected line segments in the two cameras:

L_1 = X_B / Z_B − X_A / Z_A
L_2 = (X_B − t) / Z_B − (X_A − t) / Z_A        (1)

Clearly, in general, L_1 and L_2 are not equal. For the fronto-parallel line shown in Figure 1 on the right, Z_A = Z_B = Z, hence

L_1 = L_2 = (X_B − X_A) / Z        (2)

Thus, except for the fronto-parallel case, horizontally slanted line segments in space will always project onto segments of different lengths in the two cameras. Hence, N pixels on a scanline in one image can correspond to a different number of pixels M on a scanline in the other image.
We must therefore make a provision in our stereo algorithms
to permit unequal correspondences of this nature.
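As a quick numeric illustration of Eqs. (1) and (2) (our own sketch, not part of the paper), the following Python snippet computes the two projected lengths for a slanted and for a fronto-parallel segment; the specific coordinates and baseline are arbitrary example values.

```python
# Numeric check of Eqs. (1) and (2); the coordinates below are arbitrary examples.
def projection_lengths(X_A, Z_A, X_B, Z_B, t):
    """Projected lengths of segment AB in camera 1 and in camera 2, where
    camera 2 is translated by t along the X-axis and the focal length is 1."""
    L1 = X_B / Z_B - X_A / Z_A
    L2 = (X_B - t) / Z_B - (X_A - t) / Z_A
    return L1, L2

print(projection_lengths(0.0, 2.0, 1.0, 4.0, t=0.5))  # slanted: (0.25, 0.375) differ
print(projection_lengths(0.0, 2.0, 1.0, 2.0, t=0.5))  # fronto-parallel: (0.5, 0.5) equal
```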
2.2. Sampling
Since a horizontally slanted line segment in space has
different projection lengths in the two cameras, its intensity function is also sampled differently by the two cameras
as shown in Figure 2. Birchfield and Tomasi [2] have pro-
vided a very useful method for matching pixel intensities,
which is insensitive to image sampling. However, due to
unequal sampling in the presence of horizontal slant, we
must first resample each scanline correctly, and then apply
the Birchfield-Tomasi matching procedure, which only uses
nearest neighbor pixels for interpolation. In other words, we
first stretch (resample) one of the scanlines, by an amount
related to the slant we are considering, and then match this
stretched scanline with the other unstretched scanline us-
ing the Birchfield-Tomasi matching process as usual. For
example, if we are considering the linear correspondence
function x_2 = m · x_1 + d between points of cameras 1 and 2, then we must stretch the image of camera 1 by a factor m before performing the intensity-based matching.
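A minimal sketch of this resample-then-match idea is given below. It is our own illustration, not the authors' code: it assumes grayscale scanlines stored as 1-D NumPy arrays, and the helper names such as stretch_scanline are ours. The dissimilarity function follows the standard symmetric Birchfield-Tomasi measure with linearly interpolated half-sample neighbours.

```python
import numpy as np

def stretch_scanline(scanline, m):
    """Resample a scanline by a slant factor m using linear interpolation,
    so it can be compared sample-by-sample with the unstretched scanline."""
    n = len(scanline)
    new_n = int(round(m * (n - 1))) + 1
    positions = np.arange(new_n) / m          # locations in the original scanline
    return np.interp(positions, np.arange(n), scanline)

def bt_dissimilarity(a, i, b, j):
    """Symmetric Birchfield-Tomasi dissimilarity between sample i of scanline a
    and sample j of scanline b, using interpolated half-sample neighbours."""
    def one_sided(p, u, q, v):
        lo = 0.5 * (q[v] + q[max(v - 1, 0)])
        hi = 0.5 * (q[v] + q[min(v + 1, len(q) - 1)])
        q_min, q_max = min(lo, hi, q[v]), max(lo, hi, q[v])
        return max(0.0, p[u] - q_max, q_min - p[u])
    return min(one_sided(a, i, b, j), one_sided(b, j, a, i))

# Example: stretch the left scanline by m = 2, then compare it at one position
# against the right scanline (here with zero additional offset).
left = np.array([10., 20., 30., 40.])
right = np.array([10., 15., 20., 25., 30., 35., 40., 40.])
stretched = stretch_scanline(left, 2.0)
print(bt_dissimilarity(stretched, 2, right, 2))   # 0.0: the samples agree
```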
2.3. Occlusions and the uniqueness constraint
The uniqueness constraint [13] is often used to find oc-
clusions. In its present form, the uniqueness constraint
forces a one-to-one correspondence between pixels in the
two images. In the end, the unpaired pixels are the occlu-
sions. However, since horizontal slant allows N pixels in
one image to match with a different number of pixels M in
the other image, we can no longer impose a one-to-one cor-
respondence for finding occlusions. We must modify the
uniqueness constraint so that we enforce a one-to-one map-
ping between continuous intervals (line segments) in the
two scanlines, instead of pixels. An interval in one scan-
line may correspond to an interval of a different length in
the other scanline, as long as the correspondence is unique.
This is equivalent to enforcing uniqueness in the scene
space instead of the image space, hence we may also refer
to this constraint as the 3D uniqueness constraint.
Figure 3 shows how the modified uniqueness constraint
is used. Part (a) shows an existing one-to-one correspon-
dence between intervals on the left and right scanlines. This
denotes an intermediate state in the progress of a stereo
matching and segmentation algorithm. Notice that the in-
tervals may correspond in any order (i.e. the ordering con-
straint is not needed). Now, in part (b), we wish to insert a
new pair of corresponding intervals, shown by dashed lines.
(This new pair of matching intervals improves upon the
existing matches according to some energy metric which
depends on the stereo algorithm being used). In part (c),
we see that the insertion of this pair of intervals conflicts
with existing intervals (shown in gray). In order to enforce
uniqueness, the gray pair of intervals on the right must be
removed, while the gray pair of intervals on the left must be
resized. In part (d), we see the new correspondences. The
interval pair which was resized is shown in gray, and the
inserted interval is shown as dashed.
3. Scanline stereo algorithm
We now describe a simple algorithm to illustrate how the
above ideas may be implemented. For simplicity, the al-
gorithm processes a pair of scanlines I_L(x) and I_R(x) at a time without using any vertical consistency constraints (the results are post-processed by a simple median filter). Horizontal disparities Δ_L(x) are assigned to the left scanline within a given range [Δ_1, Δ_2], and Δ_R(x) to the right scanline in the range [−Δ_2, −Δ_1]. Notice that the disparities are not assigned to pixels, but continuously over the whole scanline. The disparities are not directly estimated; instead, we search for functions m_L(x) and d_L(x) for the left scanline, and m_R(x) and d_R(x) for the right scanline, such that given a point x_L on the left scanline, its corresponding point x_R in the right scanline would be

x_R = m_L(x_L) · x_L + d_L(x_L)

and reciprocally:

x_L = m_R(x_R) · x_R + d_R(x_R)

Clearly,

m_R(x_R) = 1 / m_L(x_L)
d_R(x_R) = −d_L(x_L) / m_L(x_L)

The disparities are then computed as:

Δ_L(x_L) = x_R − x_L = (m_L(x_L) − 1) · x_L + d_L(x_L)
Δ_R(x_R) = x_L − x_R = (m_R(x_R) − 1) · x_R + d_R(x_R)
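As a small sanity check (ours, not from the paper), the snippet below verifies the reciprocal relations and the two disparity expressions for one point and one hypothetical (m_L, d_L) pair.

```python
# One-point check of the reciprocal slant/offset relations and the disparities.
m_L, d_L = 1.5, -4.0           # hypothetical slant and offset for the left scanline
x_L = 10.0
x_R = m_L * x_L + d_L           # corresponding right position: 11.0
m_R, d_R = 1.0 / m_L, -d_L / m_L
assert abs(m_R * x_R + d_R - x_L) < 1e-9      # maps back to the same left point
delta_L = (m_L - 1.0) * x_L + d_L             # x_R - x_L =  1.0
delta_R = (m_R - 1.0) * x_R + d_R             # x_L - x_R = -1.0
print(delta_L, delta_R)
```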
The functions m_L and m_R are horizontal slants, which allow line segments of different lengths in the two scanlines to correspond. The scanlines are represented continuously by linearly interpolating intensities between pixel locations. Thus, if m_L = 2, then the left scanline is stretched (resampled) by a factor of 2, and then matched with the unstretched right scanline using the Birchfield-Tomasi method. By stretching one scanline before performing the intensity-based matching, we automatically modify the traditional Birchfield-Tomasi method to properly deal with horizontal slant. For each possible m_L and d_L, absolute intensity differences between corresponding points are computed and thresholded by a threshold t. The best value of m_L and d_L for a point is chosen such that it maximizes the size of the matching line segment containing that point. This is the simple global optimization which we perform to choose among the possible disparities.
Figure 3. The modified uniqueness constraint operates by preserving a one-to-one correspondence between intervals on the left and right scanlines, instead of pixels. (a) Initial correspondence; (b) insert new pair of matching intervals; (c) enforce uniqueness constraint; (d) final correspondence.
The values of the horizontal slant which are to be examined are provided as inputs, i.e. m_L, m_R ∈ M, where M = {m_1, m_2, ..., m_n}. Thus, given the possible slants M and the disparity search range [Δ_1, Δ_2], the possible values of d_L and d_R for each position can be restricted.
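Putting the pieces above together, a rough sketch of the per-point search is given below. This is our own simplified illustration, not the authors' implementation: it uses plain thresholded absolute differences in place of the Birchfield-Tomasi measure, assumes scanlines given as 1-D float arrays, and the helper names are ours.

```python
import numpy as np

def match_runs(left, right, m, d, thresh=10.0):
    """For one hypothesis x_R = m * x_L + d, mark which left positions have an
    intensity match in the right scanline, and return the length of the
    contiguous matching run containing each position."""
    n_l, n_r = len(left), len(right)
    x_l = np.arange(n_l, dtype=float)
    x_r = m * x_l + d
    inside = (x_r >= 0) & (x_r <= n_r - 1)
    r_vals = np.interp(np.clip(x_r, 0, n_r - 1), np.arange(n_r), right)
    matches = inside & (np.abs(left - r_vals) <= thresh)
    runs = np.zeros(n_l, dtype=int)
    start = None
    for i in range(n_l + 1):                 # accumulate run lengths
        if i < n_l and matches[i]:
            if start is None:
                start = i
        elif start is not None:
            runs[start:i] = i - start
            start = None
    return runs

def best_disparity(left, right, slants, d_min, d_max, thresh=10.0):
    """For every left position, keep the (m, d) hypothesis whose matching run
    through that position is longest, and return the resulting disparity."""
    n_l = len(left)
    x_l = np.arange(n_l, dtype=float)
    best_len = np.zeros(n_l, dtype=int)
    disparity = np.full(n_l, np.nan)
    for m in slants:
        for d in range(d_min, d_max + 1):
            runs = match_runs(left, right, m, float(d), thresh)
            better = runs > best_len
            best_len[better] = runs[better]
            disparity[better] = (m - 1.0) * x_l[better] + d
    return disparity
```

Positions for which no hypothesis produces a match (left as NaN here) are natural occlusion candidates in this simplified picture.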
In order to find the occlusions, we enforce the uniqueness constraint in its modified form as shown in Figure 3. We maintain a one-to-one correspondence between intervals in the two scanlines. Hence, at any stage of the process, we have a set S_L of non-overlapping intervals in the left scanline and a set S_R of non-overlapping intervals in the right scanline. An interval i is of the form [x_1, x_2). The uniqueness constraint enforces a one-to-one mapping U between the elements of S_L and the elements of S_R. When a new corresponding pair of intervals i_L and i_R is found, the previous correspondences of segments in S_L which overlap with i_L are removed, and the same is done for i_R and S_R. Then, i_L is added to S_L, and i_R to S_R, and the one-to-one mapping U is updated. Thus, we always ensure that a line segment in the left scanline uniquely maps to a line segment in the right scanline. In the end, line segments which remain unmapped are the occlusions.
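A compact sketch of this interval bookkeeping is shown below. It is our own simplification rather than the paper's data structure: conflicting pairs are removed outright instead of being resized as in Figure 3(d), and the class and method names are hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a = [a0, a1) and b = [b0, b1) intersect."""
    return a[0] < b[1] and b[0] < a[1]

class IntervalCorrespondence:
    """Maintains the one-to-one mapping between matched intervals of the
    left (S_L) and right (S_R) scanlines."""
    def __init__(self):
        self.pairs = []                      # list of (interval_L, interval_R)

    def insert(self, i_L, i_R):
        # 3D uniqueness: drop previously matched pairs that conflict with the
        # new pair in either scanline (a simplification of the resizing step).
        self.pairs = [(l, r) for (l, r) in self.pairs
                      if not overlaps(l, i_L) and not overlaps(r, i_R)]
        self.pairs.append((i_L, i_R))

    def occlusions(self, n_left, n_right):
        """Integer positions of each scanline not covered by any matched
        interval; these remain unmapped and are reported as occlusions."""
        cov_L, cov_R = [False] * n_left, [False] * n_right
        for (l0, l1), (r0, r1) in self.pairs:
            for x in range(l0, l1):
                cov_L[x] = True
            for x in range(r0, r1):
                cov_R[x] = True
        return ([x for x, c in enumerate(cov_L) if not c],
                [x for x, c in enumerate(cov_R) if not c])
```

Removing whole conflicting pairs keeps the sketch short; the procedure described above instead resizes partially overlapping intervals, which preserves more of the existing correspondence.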
4. Experiments
Scharstein and Szeliski [19] have set up a test suite
(at www.middlebury.edu/stereo) of stereo image pairs along
with ground truth disparities for comparing the results of
dense stereo algorithms. The disparity map d_out generated by an algorithm is compared to the true disparity d_true, and the pixels which deviate by more than 1 unit from their true disparity are termed 'bad' pixels. The percentages of
bad pixels in the entire image, in the untextured regions and
near depth discontinuities are used to compare the results of
various algorithms. The percentages of bad pixels are re-
ported in Table 1, which was generated by submitting our disparity maps (Figure 4), obtained using the scanline algorithm, to the Middlebury website created by Scharstein and Szeliski (mentioned earlier). The simple scanline algorithm presented earlier (denoted 'slanted scanline' in the table) ranks ninth overall, while the ranks in each column are shown in brackets,
below the error percentages. This performance evaluation is
presented only for the sake of completeness, since the pri-
mary purpose of this paper is not to provide an algorithm,
but rather to understand the effects of horizontal slant, and
propose methods for correctly dealing with them. We ex-
pect that the constraints presented above will improve the
results of many existing stereo algorithms.
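For reference, the 'bad pixel' criterion used above can be written in a few lines. This is our own helper (the name bad_pixel_percentage is hypothetical), not code from the evaluation site.

```python
import numpy as np

def bad_pixel_percentage(d_out, d_true, threshold=1.0, mask=None):
    """Percentage of pixels whose disparity deviates from ground truth by more
    than `threshold` (the Middlebury criterion described above). `mask`
    optionally restricts evaluation to a region such as untextured areas or
    neighbourhoods of depth discontinuities."""
    bad = np.abs(np.asarray(d_out, float) - np.asarray(d_true, float)) > threshold
    if mask is not None:
        bad = bad[np.asarray(mask, bool)]
    return 100.0 * bad.mean()
```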
The correctness of our approach immediately becomes
evident when dealing with the stereo pair shown in Figure 5.
This pair of test images shows a black object which is hor-
Table 1. Performance comparison from the Middlebury Stereo Vision Page (overall rank is 9th among 29 algorithms). The table shows only the top ten algorithms (and, for reference, the lowest-ranked one). Error percentages and the rank in each column (in brackets) are shown.

Rank | Algorithm | Tsukuba all | Tsukuba untex | Tsukuba disc. | Sawtooth all | Sawtooth untex | Sawtooth disc. | Venus all | Venus untex | Venus disc. | Map all | Map disc.
1 | Segm.-based GC | 1.23 (3) | 0.29 (2) | 6.94 (4) | 0.30 (1) | 0.00 (1) | 3.24 (1) | 0.08 (1) | 0.01 (1) | 1.39 (1) | 1.49 (21) | 15.46 (26)
2 | Layered | 1.58 (5) | 1.06 (7) | 8.82 (6) | 0.34 (2) | 0.00 (1) | 3.35 (2) | 1.52 (9) | 2.96 (18) | 2.62 (3) | 0.37 (11) | 5.24 (11)
3 | Belief prop. | 1.15 (1) | 0.42 (3) | 6.31 (1) | 0.98 (9) | 0.30 (14) | 4.83 (6) | 1.00 (5) | 0.76 (5) | 9.13 (13) | 0.84 (18) | 5.27 (12)
4 | MultCam GC | 1.85 (9) | 1.94 (14) | 6.99 (5) | 0.62 (6) | 0.00 (1) | 6.86 (11) | 1.21 (7) | 1.96 (9) | 5.71 (7) | 0.31 (8) | 4.34 (10)
5 | GC+occl. 2b | 1.19 (2) | 0.23 (1) | 6.71 (2) | 0.73 (8) | 0.11 (8) | 5.71 (8) | 1.64 (12) | 2.75 (16) | 5.41 (6) | 0.61 (14) | 6.05 (13)
6 | Impr. Coop. | 1.67 (6) | 0.77 (5) | 9.67 (10) | 1.21 (13) | 0.17 (11) | 6.90 (12) | 1.04 (6) | 1.07 (6) | 13.68 (18) | 0.29 (6) | 3.65 (7)
7 | GC+occl. 2a | 1.27 (4) | 0.43 (4) | 6.90 (3) | 0.36 (3) | 0.00 (1) | 3.65 (3) | 2.79 (20) | 5.39 (21) | 2.54 (2) | 1.79 (22) | 10.08 (20)
8 | Disc. pres. | 1.78 (7) | 1.22 (10) | 9.71 (11) | 1.17 (11) | 0.08 (7) | 5.55 (7) | 1.61 (11) | 2.25 (12) | 9.06 (12) | 0.32 (9) | 3.33 (6)
9 | Slanted scanline | 1.82 (8) | 1.09 (8) | 9.47 (8) | 0.72 (7) | 0.24 (13) | 6.00 (9) | 3.25 (21) | 5.73 (22) | 8.51 (11) | 0.22 (2) | 3.10 (4)
10 | Graph cuts | 1.94 (11) | 1.09 (9) | 9.49 (9) | 1.30 (15) | 0.06 (6) | 6.34 (10) | 1.79 (15) | 2.61 (15) | 6.91 (8) | 0.31 (7) | 3.88 (8)
29 | Max. surf. | 11.10 (29) | 10.70 (27) | 41.99 (29) | 5.51 (29) | 5.56 (29) | 27.39 (28) | 4.36 (24) | 4.78 (20) | 41.13 (28) | 4.17 (28) | 27.88 (28)
Figure 4. Top row (Left frames), Middle row (ground truth), Bottom row (our results). Occlusions were
filled in before performing the evaluation.
izontally slanted (depth decreases from left to right). The
second row of the figure shows on the left the output of the
graph cuts algorithm of Kolmogorov and Zabih [11]. The graph
cuts result was obtained using software kindly provided by
the authors (www.cs.cornell.edu/People/vnk/software.html).
Our results are shown in the second row on the right hand
side. The graph cuts algorithm finds a constant disparity
value in the interior of the slanted object, which is clearly
incorrect. Our algorithm correctly shows the disparity of the
slanted object linearly decreasing from left to right (from
white to dark gray). The detected occlusions are shown in
black.
Figure 5. Horizontally slanted object. Top
row: left image, right image. Bottom row:
(left) results using graph cuts [11], (right) our
results. Occlusions are shown in black.
5. Conclusions
We have discussed the effects of horizontal slant on the
stereo correspondence problem. We have shown that hor-
izontal slant leads to unequal projections in the two cam-
eras, which requires us to modify stereo algorithms to allow M-to-N pixel correspondences. Furthermore, we
have shown that horizontal slant leads to uneven sampling
of a surface by the two cameras, and hence local inten-
sity matching metrics must be suitably modified. Finally,
the uniqueness constraint for finding occlusions, which im-
poses a one-to-one correspondence between image pixels,
must be modified to enforce a one-to-one correspondence
between scanline intervals instead of pixels. We have also
presented a simple scanline based algorithm which imple-
ments these constraints, and provided experimental compar-
isons with existing methods.
References
[1] S. T. Barnard. Stochastic stereo matching over scale. IJCV,
3(1):17–32, 1989.
[2] S. Birchfield and C. Tomasi. A pixel dissimilarity measure
that is insensitive to image sampling. IEEE Trans. PAMI,
20(4):401–406, 1998.
[3] S. Birchfield and C. Tomasi. Multiway cut for stereo and
motion with slanted surfaces. ICCV, 1:489–495, 1999.
[4] A. F. Bobick and S. S. Intille. Large occlusion stereo. IJCV,
33(3):181–200, Sept 1999.
[5] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate
energy minimization via graph cuts. IEEE Trans. PAMI,
23(11):1222–1239, Nov 2001.
[6] G. Egnal and R. Wildes. Detecting binocular half-
occlusions: empirical comparisons of five approaches. IEEE
Trans. PAMI, 24(8):1127–1133, Aug 2002.
[7] A. Fusiello, V. Roberto, and E. Trucco. Efficient stereo with
multiple windowing. CVPR, pages 858–863, June 1997.
[8] D. Geiger, B. Ladendorf, and A. Yuille. Occlusions and
binocular stereo. ECCV, pages 425–433, 1992.
[9] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. PAMI, 6(6):721–741, Nov 1984.
[10] T. Kanade and M. Okutomi. A stereo matching algorithm
with an adaptive window: theory and experiment. IEEE
Trans. PAMI, 16(9):920–932, 1994.
[11] V. Kolmogorov and R. Zabih. Computing visual correspon-
dence with occlusions using graph cuts. ICCV, pages 508–
515, July 2001.
[12] M. Lin and C. Tomasi. Surfaces with occlusions from lay-
ered stereo. CVPR, 1:I–710–I–717, June 2003.
[13] D. Marr and T. Poggio. A computational theory of human
stereo vision. Proc. Royal Soc. London B, 204:301–328,
1979.
[14] J. Mulligan and K. Daniilidis. Predicting disparity windows
for real-time stereo. Lecture Notes in Computer Science,
1842:220–235, 2000.
[15] Y. Ohta and T. Kanade. Stereo by intra- and inter-scanline
search using dynamic programming. IEEE Trans. PAMI,
7(2):139–154, March 1985.
[16] M. Okutomi and T. Kanade. A multiple baseline stereo.
IEEE Trans. PAMI, 15(4):353–363, April 1993.
[17] S. Roy and I. Cox. A maximum-flow formulation of the n-
camera stereo correspondence problem. ICCV, pages 492–
499, 1998.
[18] D. Scharstein and R. Szeliski. Stereo matching with nonlin-
ear diffusion. IJCV, 28(2):155–174, 1998.
[19] D. Scharstein and R. Szeliski. A taxonomy and evaluation
of dense two-frame stereo correspondence algorithms. IJCV,
47(1):7–42, April 2002.
[20] R. Szeliski. Bayesian modeling of uncertainty in low-level
vision. IJCV, 5(3):271–302, Dec 1990.
[21] H. Tao, H. Sawhney, and R. Kumar. A global matching
framework for stereo computation. ICCV, 1:532–539, July
2001.