Full text: Papers accepted on the basis of peer-reviewed full manuscripts (Part A)

In: Paparoditis N., Pierrot-Deseilligny M., Mallet C., Tournaire O. (Eds), IAPRS, Vol. XXXVIII, Part 3A - Saint-Mandé, France, September 1-3, 2010
2.2 Structure-from-Motion

Structure-from-Motion (Lhuillier, 2008a) is applied to estimate the geometry (camera poses and a sparse cloud of 3D points) using the calibration initialization of Section 2.1: estimate the 2-view and 3-view geometries of consecutive images from matched Harris points (step 1), and then estimate the whole sequence geometry using bundle adjustment (BA) applied in a hierarchical framework (step 2). Another BA is applied to refine simultaneously the radial distortion parameters and the 3D geometry, assuming that the radial distortion is constant over the whole sequence (step 3).
Although it is important to get a maximum number of reconstructed features for 3D scene modeling, we noticed that there are many more 3D points than needed to initialize the geometry in our wide view field context. Indeed, it is not uncommon to have more than 2000 features per outdoor image involved in BA, and this implies a waste of computation time. So the number n_f of features per image is limited to 500 in all BAs of steps 1-2-3: 3D points are randomly selected and removed until n_f is at most 500 in all images. In practice, this simple scheme maintains a good point distribution in the view field. The 4th step is the following: step 3 is applied a second time without the n_f limit, to get a maximum number of reconstructed features consistent with the poses and calibration.
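The random pruning scheme above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the data layout (one set of observing-image indices per 3D point) is an assumption.

```python
import random

def prune_features(points, cap=500, seed=0):
    """Randomly remove 3D points until every image observes at most `cap` features.

    `points` is a list of sets: points[p] = indices of the images observing
    point p (hypothetical layout; the paper does not specify its structures).
    Returns the indices of the points that are kept.
    """
    rng = random.Random(seed)
    kept = set(range(len(points)))
    # per-image observation counts n_f
    count = {}
    for p in kept:
        for i in points[p]:
            count[i] = count.get(i, 0) + 1
    order = list(kept)
    rng.shuffle(order)  # random selection, as in the paper
    for p in order:
        if all(c <= cap for c in count.values()):
            break  # n_f is at most `cap` in all images
        if any(count[i] > cap for i in points[p]):
            kept.discard(p)
            for i in points[p]:
                count[i] -= 1
    return kept
```

Because points are dropped uniformly at random rather than by spatial criteria, the surviving features keep roughly the original distribution in the view field, which is the property claimed above.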
Our BA is the sparse Levenberg-Marquardt method assuming that there are more structure parameters than camera ones: it includes a profile Cholesky decomposition (Triggs et al., 2000) of the reduced camera system.
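The reduced camera system mentioned above is obtained by eliminating the (block-diagonal) structure parameters from the normal equations via a Schur complement. The dense toy sketch below illustrates this elimination; a real BA (as in the paper) exploits sparsity and a profile Cholesky factorization, and all names here are illustrative.

```python
import numpy as np

def reduced_camera_step(C, S_blocks, W_blocks, g_cam, g_str, lam=1e-3):
    """One damped Gauss-Newton step via the Schur complement.

    C: camera block of J^T J; S_blocks: 3x3 per-point structure blocks;
    W_blocks: camera-structure coupling blocks (n_cam x 3 each);
    g_cam, g_str: gradient parts. Returns camera and per-point updates.
    """
    n = C.shape[0]
    A = C + lam * np.eye(n)                 # damped camera block
    b = g_cam.copy()
    inv_blocks = []
    for Sp, Wp, gp in zip(S_blocks, W_blocks, g_str):
        Sinv = np.linalg.inv(Sp + lam * np.eye(3))
        inv_blocks.append(Sinv)
        A -= Wp @ Sinv @ Wp.T               # reduced system: A = C - W S^-1 W^T
        b -= Wp @ (Sinv @ gp)
    L = np.linalg.cholesky(A)               # Cholesky factor of the reduced camera system
    dc = np.linalg.solve(L.T, np.linalg.solve(L, b))
    # back-substitution recovers each point update independently
    ds = [Si @ (gp - Wp.T @ dc) for Si, Wp, gp in zip(inv_blocks, W_blocks, g_str)]
    return dc, ds
```

The point of the reduction is that the expensive factorization involves only the camera unknowns, which is the favorable case when structure parameters outnumber camera ones.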
2.3 Drift Removal 
Drift, or error accumulation, is unavoidable in the geometry estimation of a long sequence. Methods (Havlena et al., 2009, Anan and Hartley, 2005, Cornelis et al., 2004) detect the drift between two reconstructed images i and j if these images are taken at similar locations. These methods also provide a list L_{i,j} of point matches between i and j, which is used to remove the drift. Without drift removal, the scene part visible in both i and j is reconstructed twice.
An adequate BA and its initialization are applied to remove reconstruction duplicates while maintaining low re-projection errors in the whole sequence of images {0 ... n-1}. Once the 2D feature match list L_{i,j} is given for pair {i, j}, we remove the drift between i and j as follows. First, we choose an integer k such that the neighborhood N(i) of i is the list {i-k ... i ... i+k}. Second, N(i) and its data (3D geometry and image features) are duplicated in images N(n+k) = {n ... n+2k} such that images n+k and i are the same. Third, we use RANSAC to fit the similarity transformation s of the 3D points matched by L_{i,j} and apply s to obtain the N(n+k) geometry in the same basis as the {0 ... n-1} geometry. Fourth, the {0 ... n+2k} geometry is refined by BA taking into account L_{n+k,j} (L_{n+k,j} is a copy of L_{i,j} with image index changes). Now the N(n+k) geometry is the drift correction of the N(i) geometry. Fifth, a constrained bundle adjustment (CBA) is applied to minimize the global re-projection error subject to the constraint c(x) = 0, where x concatenates the 3D parameters of {0 ... n+2k} and c(x) concatenates the drifts between the poses of N(i) and N(n+k) (more details in Appendix B). At this point, the drift between i and j is removed, but N(n+k) is redundant. Last, we remove the data involving N(n+k) and apply BA to the {0 ... n-1} geometry by taking into account L_{i,j}.
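The third step (robust fit of the similarity transformation s) can be sketched with a RANSAC loop around the closed-form Umeyama alignment. This is a generic illustration under assumed thresholds; the paper does not specify its RANSAC settings.

```python
import numpy as np

def fit_similarity(X, Y):
    """Least-squares similarity with Y ~ s * R @ X + t (Umeyama closed form).
    X, Y: (N, 3) arrays of matched 3D points."""
    mx, my = X.mean(0), Y.mean(0)
    Xc, Yc = X - mx, Y - my
    U, D, Vt = np.linalg.svd(Yc.T @ Xc / len(X))
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                       # avoid reflections
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / Xc.var(0).sum()
    t = my - s * R @ mx
    return s, R, t

def ransac_similarity(X, Y, iters=200, thresh=0.05, seed=0):
    """Fit on minimal 3-point samples, keep the largest inlier set,
    refit on all inliers. Threshold and iteration count are illustrative."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(iters):
        idx = rng.choice(len(X), 3, replace=False)
        s, R, t = fit_similarity(X[idx], Y[idx])
        err = np.linalg.norm((s * (R @ X.T).T + t) - Y, axis=1)
        inl = err < thresh
        if best is None or inl.sum() > best.sum():
            best = inl
    return fit_similarity(X[best], Y[best])
```

Here X would hold the 3D points of the duplicated N(n+k) geometry and Y their counterparts matched by L_{i,j}; applying the recovered (s, R, t) expresses the duplicate in the basis of the {0 ... n-1} geometry.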
This scheme is applied using a limit of n_f = 500 features per image to avoid a waste of time as in Section 2.2, with the only difference that L_{i,j} and L_{n+k,j} are not counted by this limit.
2.4 Over-Segmentation Mesh 
This mesh has the following purposes. It provides an image-based simplification of the scene such that the view field is uniformly sampled. This is useful for the time and space complexities of further processing, and more adequate than storing depth maps for all images. Furthermore, it segments the image into polygons such that depth discontinuities are constrained to be at polygon borders. These borders are selected among image contours. If contours are lacking, borders are preferred on concentric circles or radial segments of the donut image. This roughly corresponds to horizontal and vertical depth discontinuities for the standard orientation of the catadioptric camera (if its symmetry axis is vertical).
In short, the image mesh is built in four steps: checkerboard initialization (rows are concentric rings, columns have radial directions, cells are two Delaunay triangles), gradient edge integration (perturb vertices to approximate the most prominent image contours), optimization (perturb all vertices to minimize the sum, over all triangles, of color variances, plus the sum, over all vertices, of squared moduli of umbrella operators), and polygon segmentation (triangles are regrouped into small and convex polygons). In practice, many polygons are quadrilaterals similar to those of the initialization checkerboard.
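The smoothness term of the optimization step can be illustrated with the discrete umbrella operator, U(v) = mean(neighbors of v) - v; summing its squared moduli penalizes irregular vertex placement. A minimal sketch, assuming 2D vertex coordinates and an explicit adjacency list (the color-variance term would be added analogously):

```python
import numpy as np

def umbrella_energy(vertices, neighbors):
    """Sum over vertices of the squared modulus of the umbrella operator.

    vertices: (N, 2) image coordinates; neighbors[i]: indices of the
    vertices adjacent to vertex i (illustrative data layout).
    """
    e = 0.0
    for i, nbrs in enumerate(neighbors):
        u = vertices[nbrs].mean(axis=0) - vertices[i]  # umbrella operator U(v_i)
        e += float(u @ u)
    return e
```

The energy is zero for a vertex sitting at the centroid of its neighbors, so minimizing it pulls the mesh toward the regular checkerboard layout except where the edge-integration and color-variance terms hold vertices on contours.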
2.5 View-Centered 3D Model 
The view-centered 3D model is built from the image mesh (Section 2.4) assuming that the geometry is known (Sections 2.1, 2.2 and 2.3).
Depth Map in the Reference Image  We reproject the catadioptric image onto the 6 faces of a virtual cube and apply match propagation (Lhuillier and Quan, 2002) to two parallel faces of two cubes. The depth map in the i-th image is obtained by chaining matches between consecutive images of N(i). In the next steps, the over-segmentation mesh in the i-th image is back-projected to approximate the depth map.
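The chaining of matches across consecutive images amounts to composing pairwise correspondence maps and keeping only pixels tracked through every image. A minimal sketch, representing each pairwise match set as a dict (an assumption; the propagated matches in the paper are quasi-dense pixel correspondences):

```python
def chain_matches(match_maps):
    """Compose pairwise match maps across consecutive images of N(i).

    match_maps[k] maps a pixel (any hashable key) in image k to its match
    in image k+1. Returns the map from the first image to the last,
    keeping only pixels matched through the whole chain.
    """
    chained = dict(match_maps[0])
    for m in match_maps[1:]:
        # drop pixels whose track is broken in the next image
        chained = {p: m[q] for p, q in chained.items() if q in m}
    return chained
```

With known poses, each surviving multi-view track is then triangulated to yield the depth of its pixel in the i-th image.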
Mesh Initialization  For all polygons in image i, a plane in 3D (or nil on failure) is estimated by a RANSAC procedure applied to the depths available inside the polygon. A depth is an inlier of a tentative plane if the corresponding 3D point is in this plane up to thresholding (Appendix A). The best plane π defines 3D points which are the intersections between π and the observation rays of the polygon vertices in the i-th image. These 3D points are called "3D vertices of the polygon" although the polygon is 2D.

For all edges e in image i, we define a boolean b_e which will determine the connection of the triangles on both sides of the edge. Since depth discontinuity is prohibited inside polygons, we initialize b_e = 1 if both triangles are in the same polygon (other cases: b_e = 0).
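The per-polygon RANSAC plane fit can be sketched as below. The threshold, iteration count, and minimum inlier count are illustrative stand-ins for the paper's thresholding of Appendix A.

```python
import numpy as np

def fit_polygon_plane(pts, iters=100, thresh=0.01, min_inliers=6, seed=0):
    """RANSAC plane fit on the 3D points reconstructed inside one polygon.

    pts: (N, 3) array. Returns (n, d) with n . x + d = 0 for the best plane,
    or None on failure (the 'nil' outcome in the text).
    """
    rng = np.random.default_rng(seed)
    best, best_cnt = None, 0
    for _ in range(iters):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue                        # degenerate (near-collinear) sample
        n /= norm
        d = -n @ p0
        # a depth is an inlier if its 3D point is in the plane up to thresholding
        cnt = int((np.abs(pts @ n + d) < thresh).sum())
        if cnt > best_cnt:
            best, best_cnt = (n, d), cnt
    return best if best_cnt >= min_inliers else None
```

Intersecting the returned plane with the observation rays of the polygon's 2D vertices then yields the "3D vertices of the polygon" defined above.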
Connection  Connections between polygons are needed to obtain a more realistic 3D model. Thus edge booleans are forced to 1 if neighboring polygons satisfy a coplanarity constraint. For all polygons p with a plane in 3D, we collect in a list L_p the polygons q in the p-neighborhood (including p) such that all 3D vertices of q are in the plane of p up to thresholding (Appendix A). If the sum of the solid angles of the polygons in L_p is greater than a threshold, we have confidence in the coplanarity of all polygons in L_p, and we set b_e = 1 for all edges e between two polygons of L_p.
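The collection of L_p can be sketched as a plane-membership test over the neighborhood of p. Data layouts (plane as a normal/offset pair, adjacency and vertex lists as dicts) are assumptions for illustration; the solid-angle confidence test would follow on the returned list.

```python
import numpy as np

def coplanar_neighbors(p, planes, vertices3d, adjacency, thresh=0.01):
    """Collect L_p: p and its neighbors q whose 3D vertices all lie
    in p's plane up to `thresh`.

    planes[q]: (n, d) with n . x + d = 0, or None if no plane was fitted;
    vertices3d[q]: list of 3D vertices of polygon q;
    adjacency[p]: indices of the polygons neighboring p.
    """
    n, d = planes[p]
    Lp = []
    for q in [p] + adjacency[p]:
        if planes.get(q) is None:
            continue                       # q has no plane in 3D
        if all(abs(n @ v + d) < thresh for v in vertices3d[q]):
            Lp.append(q)
    return Lp
```

Only when the accumulated solid angle of L_p is large enough are the edge booleans between its polygons forced to 1, so small accidental alignments do not create spurious connections.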
Hole Filling  We fill a hole H if its neighborhood N is coplanar. Both H and N are polygon lists. The former is a connected component of polygons without a plane in 3D, and the latter contains polygons with a plane in 3D. Neighborhood N is coplanar if there is a plane π (generated by random samples of vertices of N) such that all 3D vertices in N are in π up to thresholding (Appendix A). If N is coplanar, all polygons of H get the plane π and we set b_e = 1 for all edges between two triangles of H ∪ N.