Segment-Based Stereo Matching Using Belief Propagation and a Self-Adapting Dissimilarity Measure¹

Andreas Klaus, Mario Sormann and Konrad Karner

VRVis Research Center, Graz, Austria

{klaus,sormann,karner}@vrvis.at

¹This work has been done in the VRVis research center, Graz, Austria (http://www.vrvis.at/2d3d), which is partly funded by the Austrian government research program Kplus.

Abstract

A novel stereo matching algorithm is proposed that utilizes color segmentation on the reference image and a self-adapting matching score that maximizes the number of reliable correspondences. The scene structure is modeled by a set of planar surface patches which are estimated using a new technique that is more robust to outliers. Instead of assigning a disparity value to each pixel, a disparity plane is assigned to each segment. The optimal disparity plane labeling is approximated by applying belief propagation. Experimental results using the Middlebury stereo test bed demonstrate the superior performance of the proposed method.

1. Introduction

Stereo matching continues to be an active research area, as is proven by the large number of recent publications dedicated to this topic [1, 2, 4, 6, 9]. The goal is to determine disparities that indicate the difference in the location of corresponding pixels. The recovery of an accurate disparity map remains challenging, mainly for the following reasons:

(i) Pixels of half-occluded regions have no correspondences in the other image, leading to incorrect matches if not taken into account.

(ii) Images are disturbed by sensor noise. This is especially problematic in poorly textured regions due to the low signal-to-noise ratio (SNR).

(iii) The constant brightness or color constraint is only satisfied under ideal conditions that can only roughly be met in practice.

A comprehensive overview of stereo matching can be found in [8]. In general, matching algorithms can be classified into local and global methods. Local approaches utilize the color or intensity values within a finite window to determine the disparity for each pixel. Global approaches incorporate explicit smoothness assumptions and determine all disparities simultaneously by applying energy minimization techniques such as graph cuts [2, 4, 6, 7], belief propagation [5, 9], dynamic programming, scan-line optimization or simulated annealing.

Recently, segment-based methods [1, 2, 4, 6, 10] have attracted attention due to their good performance. They are based on the assumption that the scene structure can be approximated by a set of non-overlapping planes in the disparity space and that each plane is coincident with at least one homogeneous color segment in the reference image. Segment-based methods generally perform four consecutive steps that are illustrated in Figure 1. First, regions of homogeneous color are located by applying a color segmentation method. Second, a local window-based matching method is used to determine the disparities of reliable points. Third, a plane fitting technique is applied to obtain disparity planes that are considered as a label set. Fourth, an optimal disparity plane assignment (optimal labeling) is approximated using greedy [1, 10] or graph cut [2, 4, 6] optimization. Although our method shares the same consecutive steps, there are three main distinguishing features:

First, a self-adapting dissimilarity measure is used to increase the number of reliable correspondences, as explained in Section 3. Second, a novel outlier-insensitive approach is applied to extract the disparity planes (Section 4). Third, the labeling problem is solved using belief propagation (Section 5). These features represent the main contribution of our approach, which results in superior matching quality (demonstrated in Section 6).
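To make the work-flow concrete, the short sketch below shows the end product of these four steps: a dense disparity map obtained by evaluating, for every pixel of a segment, the disparity plane that the labeling assigned to that segment. The function name and the data layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def disparity_from_labeling(segments, planes, labeling):
    """Turn a segment-to-plane labeling into a dense disparity map.

    segments: integer label image, one id per color segment
    planes:   list of (c1, c2, c3) disparity planes
    labeling: dict seg_id -> index into `planes` (result of the fourth step)
    """
    h, w = segments.shape
    ys, xs = np.mgrid[0:h, 0:w]
    disparity = np.zeros((h, w), dtype=np.float64)
    for seg_id, plane_idx in labeling.items():
        c1, c2, c3 = planes[plane_idx]
        mask = segments == seg_id
        # d = c1 * x + c2 * y + c3 for every pixel of the segment
        disparity[mask] = c1 * xs[mask] + c2 * ys[mask] + c3
    return disparity
```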

2. Color segmentation

Figure 1. Block diagram of segment-based stereo matching algorithms, augmented with input data, intermediate and final results of the proposed method.

The first step in the work-flow is to decompose the reference image into regions of homogeneous color or gray-scale. The algorithm assumes that disparity values vary smoothly within those regions and that depth discontinuities only occur on region boundaries. Over-segmentation is preferred, since it helps to meet this assumption in practice. Therefore mean-shift color segmentation, recently applied successfully to image segmentation by Comaniciu and Meer [3], is used. The mean-shift analysis approach is essentially defined as a gradient ascent search for maxima in a density function defined over a high-dimensional feature space. The feature space includes a combination of the spatial coordinates and all associated attributes that are considered during the analysis. The main advantage of the mean-shift approach is that edge information is incorporated as well.
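As a rough illustration (not the authors' implementation), over-segmentation in the spirit of mean-shift could be approximated with OpenCV's mean-shift filtering followed by grouping of similarly colored connected pixels. The parameters sp, sr and the color quantization step below are assumed values.

```python
import cv2
import numpy as np

def oversegment(reference_bgr, sp=7, sr=6):
    """Approximate mean-shift over-segmentation of an 8-bit BGR image.

    sp: spatial window radius, sr: color window radius (assumed values).
    Returns an integer label map with one id per segment.
    """
    # Mean-shift filtering flattens the color inside homogeneous regions
    # while preserving edges (Comaniciu and Meer [3]).
    filtered = cv2.pyrMeanShiftFiltering(reference_bgr, sp, sr)

    # Group pixels of nearly identical filtered color into segments by
    # running connected components once per quantized color.
    quantized = (filtered // 8).astype(np.int32)   # coarse color bins
    flat = (quantized[:, :, 0] * 1_000_000
            + quantized[:, :, 1] * 1_000
            + quantized[:, :, 2])
    labels = np.zeros(flat.shape, dtype=np.int32)
    next_label = 1
    for color in np.unique(flat):
        mask = (flat == color).astype(np.uint8)
        n, comp = cv2.connectedComponents(mask)
        for c in range(1, n):
            labels[comp == c] = next_label
            next_label += 1
    return labels
```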

3. Local matching in pixel domain

In the proposed method the scene structure is modeled by a set of planar disparity planes. A disparity plane is specified by the three parameters c_1, c_2, c_3 that determine a disparity d for each reference image pixel (x, y):

d = c_1 x + c_2 y + c_3.

Since the number of possible disparity planes is huge, it is reduced to a set of disparity planes that is sufficient to represent the scene structure. This is done by applying local matching in the pixel domain followed by a disparity plane estimation step.

Local matching requires the definition of a matching score and an aggregation window [8]. The most common dissimilarity measures are squared intensity differences (SD) and absolute intensity differences (AD), which strictly assume the constant color constraint. Other matching scores, such as gradient-based and non-parametric measures, are more robust to changes in camera gain and bias or to non-Lambertian surfaces, at the cost of lower discriminative power. In our approach we use a self-adapting dissimilarity measure that combines the sum of absolute intensity differences (SAD) and a gradient-based measure, defined as follows:

C_SAD(x, y, d) = \sum_{(i,j) \in N(x,y)} |I_1(i, j) - I_2(i + d, j)|

and

C_GRAD(x, y, d) = \sum_{(i,j) \in N_x(x,y)} |\nabla_x I_1(i, j) - \nabla_x I_2(i + d, j)| + \sum_{(i,j) \in N_y(x,y)} |\nabla_y I_1(i, j) - \nabla_y I_2(i + d, j)|,

where N(x, y) is a 3×3 surrounding window at position (x, y), N_x(x, y) the surrounding window without the rightmost column, N_y(x, y) the surrounding window without the lowest row, \nabla_x the forward gradient to the right and \nabla_y the forward gradient to the bottom. Color images are taken into account by summing up the dissimilarity measures over all channels.
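A minimal sketch of these two cost terms, assuming grayscale float images, a purely horizontal disparity shift as in the formulas above, and no border handling (x, y and x + d must stay at least one pixel inside the image); this is an illustration, not the authors' code:

```python
import numpy as np

def c_sad(I1, I2, x, y, d):
    """C_SAD: absolute differences summed over the 3x3 window N(x, y)."""
    w1 = I1[y - 1:y + 2, x - 1:x + 2]
    w2 = I2[y - 1:y + 2, x + d - 1:x + d + 2]
    return np.abs(w1 - w2).sum()

def c_grad(I1, I2, x, y, d):
    """C_GRAD: forward gradients compared over the windows N_x and N_y."""
    # Forward gradient to the right on N_x (the rightmost column of N is
    # dropped so that the forward difference stays inside the window).
    gx1 = I1[y - 1:y + 2, x:x + 2] - I1[y - 1:y + 2, x - 1:x + 1]
    gx2 = I2[y - 1:y + 2, x + d:x + d + 2] - I2[y - 1:y + 2, x + d - 1:x + d + 1]
    # Forward gradient to the bottom on N_y (the lowest row of N is dropped).
    gy1 = I1[y:y + 2, x - 1:x + 2] - I1[y - 1:y + 1, x - 1:x + 2]
    gy2 = I2[y:y + 2, x + d - 1:x + d + 2] - I2[y - 1:y + 1, x + d - 1:x + d + 2]
    return np.abs(gx1 - gx2).sum() + np.abs(gy1 - gy2).sum()
```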

An optimal weighting ω between C_SAD and C_GRAD is determined by maximizing the number of reliable correspondences that are filtered out by applying a cross-checking test (comparing left-to-right and right-to-left disparity maps) in conjunction with a winner-take-all optimization (choosing the disparity with the lowest matching cost). The resulting dissimilarity measure is given by:

C(x, y, d) = (1 - ω) · C_SAD(x, y, d) + ω · C_GRAD(x, y, d).

Furthermore, we utilize the reliable correspondences to predict the SNR, which is used to normalize our dissimilarity measure. Because of the normalization, a fixed truncation threshold can be set right above the noise level to obtain a robust matching score.
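The winner-take-all and cross-checking steps used to count reliable correspondences for a candidate ω could look roughly like this. It is a sketch that assumes precomputed dense cost volumes (e.g. built from the c_sad/c_grad sketch above) and a particular sign convention; it is not the authors' implementation.

```python
import numpy as np

def winner_take_all(cost_volume):
    """cost_volume[d, y, x] -> per-pixel disparity with the lowest cost."""
    return np.argmin(cost_volume, axis=0)

def cross_check(disp_lr, disp_rl, tol=1):
    """Keep pixels whose left-to-right and right-to-left disparities agree.

    Follows the convention of the formulas above, where pixel (x, y) of the
    reference image matches (x + d, y) in the other image; the right-to-left
    map is assumed to be built with the matching search direction.
    """
    h, w = disp_lr.shape
    ys, xs = np.mgrid[0:h, 0:w]
    target_x = np.clip(xs + disp_lr, 0, w - 1)
    return np.abs(disp_lr - disp_rl[ys, target_x]) <= tol

def count_reliable(sad_l, grad_l, sad_r, grad_r, omega):
    """Number of cross-check survivors for a candidate weighting omega.

    sad_l, grad_l: C_SAD and C_GRAD cost volumes [d, y, x] for the left view;
    sad_r, grad_r: the same with the roles of the two images swapped.
    """
    disp_lr = winner_take_all((1.0 - omega) * sad_l + omega * grad_l)
    disp_rl = winner_take_all((1.0 - omega) * sad_r + omega * grad_r)
    return int(cross_check(disp_lr, disp_rl).sum())

# omega could then be chosen by scanning candidate values, for example
# numpy.linspace(0, 1, 21), and keeping the one maximizing count_reliable.
```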

4. Disparity plane estimation

The reliable correspondences are used to derive a set of disparity planes that is adequate to represent the scene structure. This is achieved by applying a novel robust plane fitting method followed by a refinement step.

Robust plane fitting. Although only the reliable disparities of each segment are used to derive the corresponding disparity plane, the estimated plane may still be disturbed by remaining outliers. A straightforward way to determine the disparity plane parameters is to solve a least squares system. As is generally known, least squares solutions are very sensitive to outliers, whereas linear or median solutions are much more robust.

Our method determines a robust solution by applying a decomposition method that solves each parameter separately.

First, the horizontal slant is estimated using the set of all combinations of reliable disparities that lie on the same horizontal line within the segment. The derivatives δd/δx are inserted into a list, and a robust estimate of the horizontal slant is determined by sorting the list and applying a convolution with a Gaussian kernel.

Second, the vertical slant is estimated in a similar manner by considering all combinations lying on the same vertical line.

Third, the determined slant is used to obtain a robust estimate of the disparity value at the center of the segment. To this end, the corresponding center disparity of each reliable point, calculated using the estimated slant, is inserted into a list and a robust estimate is obtained as explained before.
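The decomposition could be sketched as follows. The Gaussian-smoothed selection of a robust value is one interpretation of the description above (a kernel-density style mode estimate); the bin count, kernel width and helper names are assumptions, not taken from the paper.

```python
import numpy as np
from itertools import combinations

def robust_value(values, bins=64, sigma_bins=2.0):
    """Robust representative of a 1-D sample via a smoothed histogram mode."""
    v = np.asarray(values, dtype=np.float64)
    if v.size == 1:
        return float(v[0])
    hist, edges = np.histogram(v, bins=bins)
    radius = int(3 * sigma_bins)
    kernel = np.exp(-0.5 * (np.arange(-radius, radius + 1) / sigma_bins) ** 2)
    smoothed = np.convolve(hist.astype(np.float64), kernel, mode="same")
    peak = np.argmax(smoothed)
    return float(0.5 * (edges[peak] + edges[peak + 1]))   # bin center

def fit_plane_for_segment(points):
    """points: list of (x, y, d) reliable correspondences of one segment.
    Returns (c1, c2, c3) with d = c1*x + c2*y + c3."""
    pts = np.asarray(points, dtype=np.float64)

    # 1. Horizontal slant from point pairs on the same horizontal line.
    dx = [(d2 - d1) / (x2 - x1)
          for (x1, y1, d1), (x2, y2, d2) in combinations(pts, 2)
          if y1 == y2 and x1 != x2]
    c1 = robust_value(dx) if dx else 0.0

    # 2. Vertical slant from point pairs on the same vertical line.
    dy = [(d2 - d1) / (y2 - y1)
          for (x1, y1, d1), (x2, y2, d2) in combinations(pts, 2)
          if x1 == x2 and y1 != y2]
    c2 = robust_value(dy) if dy else 0.0

    # 3. Robust disparity at the segment center, using the estimated slant.
    cx, cy = pts[:, 0].mean(), pts[:, 1].mean()
    center_d = [d - c1 * (x - cx) - c2 * (y - cy) for x, y, d in pts]
    c3_center = robust_value(center_d)

    # Convert the center-referenced offset into the plane offset c3.
    c3 = c3_center - c1 * cx - c2 * cy
    return c1, c2, c3
```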

Disparity plane refinement. The purpose of this step is to increase the accuracy of the disparity plane set by repeating the plane fitting for grouped regions that are assigned to the same disparity plane. Similar to [6], the following steps are performed:

First, a matching cost is calculated for each segment-to-plane assignment. It is computed by summing up the matching cost of each pixel inside the segment S:

C_SEG(S, P) = \sum_{(x,y) \in S} C(x, y, d),

where P is the disparity plane that defines the disparity d. Second, the disparity plane with the minimum matching cost is assigned to each segment. Third, segments that are assigned to the same disparity plane are grouped. Finally, the plane estimation is repeated for all grouped segments.
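A rough sketch of this refinement loop, reusing hypothetical helpers from the earlier sketches (the pixel cost C and the plane fitting routine); the segment and plane bookkeeping is simplified and not taken from the paper:

```python
def segment_cost(pixels, plane, cost_fn):
    """C_SEG(S, P): sum of pixel costs C(x, y, d) over segment S.

    pixels: iterable of (x, y); plane: (c1, c2, c3);
    cost_fn(x, y, d): the combined dissimilarity measure of Section 3.
    """
    c1, c2, c3 = plane
    return sum(cost_fn(x, y, c1 * x + c2 * y + c3) for x, y in pixels)

def refine_planes(segments, planes, cost_fn, fit_plane, points_of):
    """One refinement pass: assign, group, refit (illustrative sketch).

    segments:  dict seg_id -> list of (x, y) pixels
    planes:    list of (c1, c2, c3)
    fit_plane: plane-fitting routine (e.g. the sketch above)
    points_of: callable mapping a group of seg_ids to their reliable (x, y, d)
    """
    # 1./2. Assign the cheapest plane to every segment.
    assignment = {
        sid: min(range(len(planes)),
                 key=lambda p: segment_cost(pix, planes[p], cost_fn))
        for sid, pix in segments.items()
    }
    # 3. Group segments that share the same plane.
    groups = {}
    for sid, p in assignment.items():
        groups.setdefault(p, []).append(sid)
    # 4. Refit one plane per group from the pooled reliable points.
    return [fit_plane(points_of(sids)) for sids in groups.values()]
```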

5. Disparity plane assignment

In the final step an optimal solution for the segment-to-disparity-plane assignment is searched. To this end, stereo matching is formulated as an energy minimization problem for the labeling f that assigns to each segment s ∈ R a corresponding plane f(s) ∈ D. The energy of a labeling f is given by:

E(f) = E_data(f) + E_smooth(f),

where

E_data(f) = \sum_{s \in R} C_SEG(s, f(s))

and

E_smooth(f) = \sum_{(s_i, s_j) \in S_N,\; f(s_i) \neq f(s_j)} \lambda_disc(s_i, s_j).

S_N represents the set of all adjacent segments, and \lambda_disc(s_i, s_j) is a discontinuity penalty that incorporates the common border length and the mean color similarity, as proposed in [2].
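Evaluating this energy for a candidate labeling is straightforward; the sketch below assumes precomputed data costs and discontinuity penalties, and the names are placeholders rather than the paper's notation:

```python
def labeling_energy(labeling, data_cost, neighbors, lambda_disc):
    """E(f) = E_data(f) + E_smooth(f) for a segment-to-plane labeling.

    labeling:    dict seg_id -> plane index f(s)
    data_cost:   dict (seg_id, plane index) -> C_SEG(s, f(s))
    neighbors:   set of frozenset({s_i, s_j}) adjacent segment pairs (S_N)
    lambda_disc: dict frozenset({s_i, s_j}) -> discontinuity penalty
    """
    e_data = sum(data_cost[(s, p)] for s, p in labeling.items())
    e_smooth = sum(lambda_disc[pair]
                   for pair in neighbors
                   if len({labeling[s] for s in pair}) > 1)  # f(s_i) != f(s_j)
    return e_data + e_smooth
```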

An optimal labeling with minimum energy is approximated using loopy belief propagation [5], where the message passing takes place between adjacent segments.
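A compact min-sum loopy BP sketch over the segment adjacency graph is given below. It assumes the data costs C_SEG and the penalties λ_disc are precomputed; damping, message scheduling and convergence checks are simplified compared to what an efficient implementation such as [5] would use.

```python
import numpy as np

def bp_assign_planes(data_cost, edges, lambda_disc, n_iters=30):
    """Min-sum loopy BP over the segment adjacency graph (illustrative).

    data_cost:   array [n_segments, n_planes] with C_SEG(s, p)
    edges:       list of (i, j) index pairs of adjacent segments (S_N)
    lambda_disc: dict mapping an (i, j) pair to its discontinuity penalty
    Returns an array holding the plane index assigned to each segment.
    """
    data_cost = np.asarray(data_cost, dtype=np.float64)
    n_seg, n_planes = data_cost.shape
    directed = [(i, j) for i, j in edges] + [(j, i) for i, j in edges]
    messages = {e: np.zeros(n_planes) for e in directed}
    incoming = {s: [e for e in directed if e[1] == s] for s in range(n_seg)}

    for _ in range(n_iters):
        updated = {}
        for i, j in directed:
            lam = lambda_disc.get((i, j), lambda_disc.get((j, i), 0.0))
            # Data cost of i plus all incoming messages except the one from j.
            h = data_cost[i] + sum(messages[e] for e in incoming[i] if e[0] != j)
            # Potts-like pairwise term: keeping the same plane costs nothing,
            # switching to any other plane costs lambda_disc(i, j).
            msg = np.minimum(h, h.min() + lam)
            updated[(i, j)] = msg - msg.min()     # normalize for stability
        messages = updated

    beliefs = data_cost.copy()
    for s in range(n_seg):
        for e in incoming[s]:
            beliefs[s] += messages[e]
    return beliefs.argmin(axis=1)
```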

6. Experimental results

The proposed method was evaluated using the Middlebury stereo test bed provided by the authors of [8]. Qualitative results are shown in Figure 2. Quantitative results of the best performing methods are given in Table 1, where the percentage of pixels with an absolute disparity error greater than one pixel is shown for different regions: non-occluded pixels (nonoccl.), the whole image (all) and pixels near discontinuities (disc.). Our method processed all four stereo pairs with a fixed parameter set and was ranked in first place. The calculation on a 2.21 GHz Athlon 64 computer takes between 14 and 25 seconds, of which the mean-shift segmentation is the most time consuming step.

7. Conclusions

A new segment-based stereo matcher has been introduced. The combination of color segmentation, a self-adapting matching score, a robust plane fitting technique and BP optimization yields excellent results, as demonstrated on the Middlebury stereo evaluation test bed.

References

[1] M. Bleyer and M. Gelautz. A layered stereo matching algorithm using image segmentation and global visibility constraints.

[2] M. Bleyer and M. Gelautz. Graph-based surface reconstruction from stereo pairs using image segmentation. In SPIE, vol. 5665, pages 288-299, January 2005.

[3] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE PAMI, 24(5):603-619, May 2002.

[4] Y. Deng, Q. Yang, X. Lin, and X. Tang. A symmetric patch-based correspondence model for occlusion handling. In ICCV, pages II: 1316-1322, 2005.

[5] P. F. Felzenszwalb and D. P. Huttenlocher. Efficient belief propagation for early vision. In CVPR, pages I: 261-268, 2004.

[6] L. Hong and G. Chen. Segment-based stereo matching using graph cuts. In CVPR, pages I: 74-81, 2004.

[7] V. Kolmogorov and R. Zabih. Computing visual correspondence with occlusions via graph cuts. In ICCV, pages II: 508-515, 2001.

[8] D. Scharstein and R. Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. IJCV, 47(1-3):7-42, 2002.

[9] J. Sun, Y. Li, S. Kang, and H. Shum. Symmetric stereo matching for occlusion handling. In CVPR, pages II: 399-406, 2005.

[10] H. Tao, H. S. Sawhney, and R. Kumar. A global matching framework for stereo computation. In ICCV, pages I: 532-539, 2001.

Algorithm          Avg.    Tsukuba                         Venus                           Teddy                           Cones
                   rank    nonoccl.  all       disc.       nonoccl.  all       disc.       nonoccl.  all       disc.       nonoccl.  all       disc.
Proposed Method     1.7    1.11(3)   1.37(2)   5.79(3)     0.10(1)   0.21(1)   1.44(1)     4.22(2)   7.06(2)   11.8(2)     2.48(1)   7.92(1)   7.32(1)
Double-BP           2.3    0.88(1)   1.29(1)   4.76(1)     0.14(2)   0.60(5)   2.00(3)     3.55(1)   8.71(3)   9.70(1)     2.90(3)   9.24(4)   7.80(2)
Segm+visib [1]      5.1    1.30(6)   1.57(3)   6.92(8)     0.79(8)   1.06(6)   6.76(9)     5.00(3)   6.54(1)   12.3(3)     3.72(5)   8.62(3)   10.2(6)
SymBP+occ [9]       5.1    0.97(2)   1.75(5)   5.09(2)     0.16(3)   0.33(2)   2.19(4)     6.47(6)   10.7(4)   17.0(7)     4.79(11)  10.7(8)   10.9(7)
C-SemiGlob          6.2    2.61(16)  3.29(11)  9.89(14)    0.25(5)   0.57(3)   3.24(5)     5.14(4)   11.8(5)   13.0(4)     2.77(2)   8.35(2)   8.20(3)
RegionTreeDP        7.0    1.39(9)   1.64(4)   6.85(6)     0.22(4)   0.57(3)   1.93(2)     7.42(9)   11.9(6)   16.8(6)     6.31(14)  11.9(12)  11.8(9)
AdaptWeight         7.3    1.38(8)   1.85(6)   6.90(7)     0.71(6)   1.19(7)   6.13(7)     7.88(10)  13.3(9)   18.6(11)    3.97(7)   9.79(6)   8.26(4)
SemiGlob            9.3    3.26(17)  3.96(14)  12.8(19)    1.00(9)   1.57(8)   11.3(14)    6.02(5)   12.2(7)   16.3(5)     3.06(4)   9.75(5)   8.90(5)
RealtimeBP         10.4    1.49(10)  3.40(13)  7.87(10)    0.77(7)   1.90(11)  9.00(13)    8.72(13)  13.2(8)   17.2(8)     4.61(9)   11.6(10)  12.4(13)
Layered            11.4    1.57(11)  1.87(7)   8.28(11)    1.34(11)  1.85(9)   6.85(10)    8.64(12)  14.3(10)  18.5(10)    6.59(16)  14.7(15)  14.4(15)
GC+occ             11.5    1.19(4)   2.01(9)   6.24(4)     1.64(14)  2.19(13)  6.75(8)     11.2(16)  17.4(16)  19.8(14)    5.36(13)  12.4(13)  13.0(14)
MultiCamGC         12.0    1.27(5)   1.99(8)   6.48(5)     2.79(19)  3.13(17)  3.60(6)     17.0(17)  17.6(17)  22.0(16)    4.89(12)  11.8(11)  12.1(11)

Table 1. Middlebury stereo evaluation of different algorithms, ordered according to their overall performance. The numbers in parentheses indicate the rank of each method in each column.

Figure 2. Results for the Middlebury data sets Tsukuba, Venus, Teddy and Cones: reference images, ground truths, our results and 'bad pixel' maps. Pixels with a disparity error greater than one pixel are displayed in the 'bad pixel' maps, where mismatches in non-occluded areas are indicated in black and mismatches in occluded areas in gray.
