A detailed description of the developed algorithms is presented
in Section 2 of this paper. Results obtained on synthetic and real
datasets are shown in Section 3.
2. ITERATIVE POSE ESTIMATION ALGORITHM
BASED ON CONTOUR TEMPLATES
2.1 Distance between image and projected contours
Assuming that an approximate initial pose estimate is available, the
model is projected onto the image plane, and a contour
representation of the projection is obtained. This contour, which
is one pixel thick, is called a contour template. The pose
estimation problem consists in finding the pose parameters (camera
position specified by three rotation angles and three
coordinates) which minimize the difference between the
contour template and the image. In this work we present an
iterative pose estimation algorithm based on the distances between
the model's projected contours and image edges.
The first step is to obtain the projected contour of the model.
Since the 3D model of the object is known, this task is
performed using z-buffering and rasterization algorithms. Next,
at each point belonging to the projected contour we compute
the distance from that point to the nearest edge in the image.
The search for the nearest edge is performed along the line
parallel to the normal vector at the current contour point. Only
image edges which are parallel to the projected contour are
considered. To achieve this, the image is convolved with a bank
of filters which accentuate edges of a particular direction. In this
work we quantize the search directions into four bins
corresponding to 0°, 45°, 90° and 135°. The video frame is thus
convolved with four 7×7 filters whose kernels are equal to the
Gaussian derivative along the x-coordinate rotated to the
direction $\theta$ (Geusebroek, J.-M., 2003). Different parts of the ISS are
accentuated by different directional filters.
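The filter bank described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function names, the Gaussian scale (sigma = 1.0) and the use of scipy's generic convolution are our own assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def oriented_gaussian_derivative_kernel(theta_deg, size=7, sigma=1.0):
    """7x7 kernel: first Gaussian derivative along x, rotated to theta_deg."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(theta_deg)
    # Rotate coordinates so the derivative is taken across direction theta.
    u = xs * np.cos(theta) + ys * np.sin(theta)
    v = -xs * np.sin(theta) + ys * np.cos(theta)
    g = np.exp(-(u**2 + v**2) / (2 * sigma**2))
    return -u / sigma**2 * g  # derivative of the Gaussian along u

def directional_edge_maps(image):
    """Convolve the frame with the four oriented kernels (0, 45, 90, 135 deg)."""
    return {a: convolve(image.astype(float),
                        oriented_gaussian_derivative_kernel(a))
            for a in (0, 45, 90, 135)}
```

A vertical intensity step (gradient along x) responds most strongly to the 0° filter and is nearly invisible to the 90° one, which is what allows the later search step to consider only edges parallel to the projected contour.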
The contour-based pose estimation algorithm originates from
the works of Lowe (Lowe, D. G., 1991) and (Vacchetti, L., et
al., 2004). For each point $(x_c, y_c)$ located on the contour
template we inspect a set of image points
$X_j(x_c, y_c; \theta) = (x_c + j\cos\theta,\; y_c - j\sin\theta)$, $j \in [-R, R]$. The
points are located on both sides of the contour, on the straight
line $L$ which is perpendicular to the contour at the current
point. The radius of the search neighborhood along the line is
denoted by $R$. At every point $X_j(x_c, y_c; \theta)$ the absolute value
of the previously computed directional derivative along the
direction $\theta$ is analyzed. Points where these values exceed a
threshold are stored in a list. For every point $(x_e, y_e)$ from
the list, we compute the signed distance from $(x_e, y_e)$ to the
line specified by the normal vector with coordinates
$(\cos\theta, -\sin\theta)$ passing through the point $(x_c, y_c)$ (this line is
tangential to the projected contour). We use the screen
coordinate system in which the y-axis points downwards and (0,
0) corresponds to the upper left corner of the image. The search
for the nearest edge is illustrated in Figure 2; the projected
contour is drawn with a dashed line. The signed distance is
computed as

$$d_j = (x_e - x_c)\cos\theta - (y_e - y_c)\sin\theta. \quad (1)$$

The distance is positive if the points $(x_e, y_e)$ and (0, 0) are
located in the same half-plane with respect to this straight line;
otherwise it is negative.
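The search along the normal and the signed distance of Eq. (1) can be sketched as below. This is an illustrative helper, not the paper's implementation; the function name, the edge threshold and the choice to keep the candidate with the smallest absolute distance are our own assumptions.

```python
import numpy as np

def nearest_edge_distance(edge_mag, xc, yc, theta, R=10, thresh=0.1):
    """Signed distance from contour point (xc, yc) to the nearest strong
    edge, searched along the line perpendicular to the contour.
    Screen coordinates: y axis points down, origin at the top-left corner."""
    best = None
    for j in range(-R, R + 1):
        # Candidate point X_j = (xc + j cos(theta), yc - j sin(theta)).
        xe = int(round(xc + j * np.cos(theta)))
        ye = int(round(yc - j * np.sin(theta)))
        if not (0 <= ye < edge_mag.shape[0] and 0 <= xe < edge_mag.shape[1]):
            continue
        if abs(edge_mag[ye, xe]) > thresh:
            # Signed distance to the tangent line through (xc, yc), Eq. (1).
            d = (xe - xc) * np.cos(theta) - (ye - yc) * np.sin(theta)
            if best is None or abs(d) < abs(best):
                best = d
    return best
```

For example, with a strong vertical edge two pixels to the right of a contour point and theta = 0, the returned signed distance is 2.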
The misfit function is the sum of all squared reprojection errors,
i.e. the distances between all projected contour points and their
closest corresponding image edges:

$$F(\mathbf{m}) = \|\mathbf{d}(\mathbf{m})\|^2. \quad (2)$$

The vector $\mathbf{m} = (\omega, \varphi, \kappa, T_x, T_y, T_z)^T$ contains the pose parameters:
the three rotation angles and the three camera coordinates that are to be
found.
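Evaluating the misfit of Eq. (2) is then a single sum of squares over the per-point signed distances (a trivial sketch; the function name is ours):

```python
import numpy as np

def misfit(distances):
    """Misfit F(m) of Eq. (2): sum of squared signed distances d_i(m),
    one per projected contour point."""
    d = np.asarray(distances, dtype=float)
    return float(np.dot(d, d))
```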
Figure 2. Search for the closest edge
2.2 Expressions for contour-based distance and its
derivatives
Before starting the process of iterative pose estimation using the
projected contour template, it is necessary to find the three-
dimensional coordinates of the points whose projections belong to
the contour. The “inverse projections” of the points lying on the
contour are found by tracing imaginary rays from
the camera through the image plane and finding their
intersections with the 3D model, assuming that the pose parameters
are known. In this way, every contour point is represented by a
set of its 2D pixel coordinates $(x_c, y_c)$ and 3D coordinates
$(X_c, Y_c, Z_c)$. Let us assume that the camera (or observer)
coordinates are given by $\mathbf{T} = (T_x, T_y, T_z)^T$. The angles of
rotation around the coordinate axes are $\omega$, $\varphi$ and $\kappa$. The
pixel coordinates $(x, y)$ of a point with homogeneous
coordinates $\tilde{\mathbf{x}} = (X, Y, Z, 1)^T$ can be found according to

$$x = u/t, \qquad y = v/t, \quad (3)$$

$$[u, v, w, t]^T = P V \tilde{\mathbf{x}}, \quad (4)$$

$$V = \begin{pmatrix} R^T & -R^T \mathbf{T} \\ 0\;\;0\;\;0 & 1 \end{pmatrix}. \quad (5)$$
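The projection chain of Eqs. (3)–(5) can be sketched as follows. The paper does not state the composition order of the three rotations, so the order Rz(κ)·Ry(φ)·Rx(ω) and the shape of the projection matrix P (taken as 4×4 so that Eq. (4) holds as written) are assumptions of this sketch.

```python
import numpy as np

def rotation_matrix(omega, phi, kappa):
    """R = Rz(kappa) @ Ry(phi) @ Rx(omega); the composition order is assumed."""
    co, so = np.cos(omega), np.sin(omega)
    cp, sp = np.cos(phi), np.sin(phi)
    ck, sk = np.cos(kappa), np.sin(kappa)
    Rx = np.array([[1, 0, 0], [0, co, -so], [0, so, co]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[ck, -sk, 0], [sk, ck, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project_point(P, angles, T, X):
    """Pixel coordinates (x, y) of a 3D point X via Eqs. (3)-(5)."""
    R = rotation_matrix(*angles)
    V = np.eye(4)
    V[:3, :3] = R.T
    V[:3, 3] = -R.T @ np.asarray(T, dtype=float)   # Eq. (5)
    x_h = np.append(np.asarray(X, dtype=float), 1.0)
    u, v, w, t = P @ V @ x_h                        # Eq. (4)
    return u / t, v / t                             # Eq. (3)
```

With identity rotation, the camera at the origin, and a simple pinhole-style P whose last two rows both pick out Z, the point (1, 2, 10) with focal length 100 projects to (10, 20), as expected from x = fX/Z.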