International Archives of the Photogrammetry, Remote Sensing
and Spatial Information Sciences, Vol XXXV, Part B3. Istanbul 2004
is usually quite good; contemporary cameras often deliver 16-bit
data containing twelve bits of information.
There are also disadvantages for such cameras: Usually they are
much more expensive than standard digital CCD video cameras.
If we take diffraction at the aperture as the limit for the angular
resolution, a lens for a thermal camera may have to be ten times
bigger than the equivalent lens for a visual camera. Often the
detector also has to be cooled down to very low temperatures.
Therefore thermal cameras are usually bigger and need more
energy than visual cameras. They also do not give any spectral
measurements like a colour CCD camera does. Some modern
thermal cameras have a focal plane array sensor but some
systems still have only a small number of sensors. These
cameras compose the image using moving mirror systems,
which introduces particular distortions into the image geometry.
Because of the lack of colour and because of the frequent
appearance of unstructured homogeneous regions with no
temperature differences, thermal videos pose a more difficult
challenge to geometric estimation procedures. Therefore all our
examples are picked from this domain. The algorithms also
work for aerial videos of the visual spectral domain, where colour
is an important feature for correspondence assessment.
2. ESTIMATING POSE FROM HOMOGRAPHIES
2.1 Interest Point Locations
It is not possible to localize a correspondence between different
frames if the object is homogeneous at that location. If an edge
or line structure is present at a location in the 2-d image array,
there may still be an aperture problem. Secure point
correspondences can only be obtained at locations where a
corner, crossing or spot is present. It is proposed to use the
averaged tensor product of the gradient of the grey-values
(Förstner, 1994). Interest locations are given where both
eigenvalues of this matrix are non-zero.
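The following is a minimal sketch of such an interest operator, not the exact operator of the cited work. It assumes central-difference gradients and a simple box average over a square window; the names `box_filter`, `structure_tensor_eigenvalues` and the window radius are illustrative choices. On real data "non-zero" would have to be replaced by a threshold on the smaller eigenvalue.

```python
import numpy as np

def box_filter(a, radius):
    """Average over a (2*radius+1)^2 window using an integral image."""
    k = 2 * radius + 1
    padded = np.pad(a, radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    window_sum = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
    return window_sum / (k * k)

def structure_tensor_eigenvalues(img, radius=2):
    """Return (smaller, larger) eigenvalue images of the averaged gradient tensor."""
    gy, gx = np.gradient(img.astype(float))  # derivatives along rows, columns
    jxx = box_filter(gx * gx, radius)
    jxy = box_filter(gx * gy, radius)
    jyy = box_filter(gy * gy, radius)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[jxx, jxy], [jxy, jyy]].
    half_trace = 0.5 * (jxx + jyy)
    root = np.sqrt(0.25 * (jxx - jyy) ** 2 + jxy ** 2)
    return half_trace - root, half_trace + root

# Synthetic test image: a bright square on a dark background.
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
small, large = structure_tensor_eigenvalues(img)
```

In this synthetic image both eigenvalues vanish in the homogeneous background, only the larger one is non-zero along a straight edge (the aperture problem), and both are non-zero at the corners of the square.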
2.2 Assessment of Correspondence
Correspondence between locations in different frames of a
video can be assessed using grey value correlation. Repetitive
structures may still cause problems. However, there will
usually be a prior estimate for a location correspondence. This
prior might be used to assign regions of interest in one image to
interest locations in the other image, or to form the overall
assessment as the product of correlation and prior probability.
Regions of interest or the variance of priors will be quite narrow
for immediately successive frames of the video.
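The product of correlation and prior can be sketched as follows. This is an illustration under assumptions that the text does not fix: the correlation is the normalized cross-correlation of square patches, negative correlations are clipped to zero, and the prior is an isotropic Gaussian around the predicted location with a hypothetical standard deviation `sigma`. The function names are ours.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized grey-value patches."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def gaussian_prior(candidate, predicted, sigma):
    """Unnormalized isotropic Gaussian prior around the predicted location."""
    diff = np.asarray(candidate, float) - np.asarray(predicted, float)
    return float(np.exp(-0.5 * np.sum(diff ** 2) / sigma ** 2))

def assess(patch, image, candidates, predicted, sigma=3.0, half=3):
    """Score each candidate (row, col) by clipped correlation times prior."""
    scores = []
    for r, c in candidates:
        window = image[r - half:r + half + 1, c - half:c + half + 1]
        corr = max(ncc(patch, window), 0.0)
        scores.append(corr * gaussian_prior((r, c), predicted, sigma))
    return candidates[int(np.argmax(scores))], scores

# Repetitive structure: the same 7x7 pattern appears twice in the image.
rng = np.random.default_rng(0)
tile = rng.random((7, 7))
image = np.zeros((40, 40))
image[7:14, 7:14] = tile     # centred at (10, 10)
image[7:14, 27:34] = tile    # centred at (10, 30)
best, scores = assess(tile, image, [(10, 10), (10, 30)], predicted=(10, 29))
```

Both candidates correlate perfectly with the template, so correlation alone cannot decide; the narrow prior around the predicted location resolves the ambiguity in favour of the second one.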
2.3 Planar Homographies
Given a perspective projection and a rigid planar scene, the
movement of locations in the image is determined by a planar
homography $x' \simeq Hx$, where the image location correspondence
$(x, x')$ is written in homogeneous coordinates, H is a 3x3 matrix
and $\simeq$ means equality up to an unknown scale factor. Given a
set of at least four correspondences, H can be estimated using the
direct linear transformation (DLT) (Hartley; Zisserman, 2000).
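A minimal sketch of the basic DLT is given below. It assumes inhomogeneous 2-d input points with third homogeneous coordinate one and solves the homogeneous system by singular value decomposition; it omits any robust selection of correspondences. The function name `dlt_homography` and the example values are illustrative.

```python
import numpy as np

def dlt_homography(x, x_prime):
    """Estimate H with x' ~ H x from n >= 4 point pairs given as (n, 2) arrays."""
    rows = []
    for (u, v), (up, vp) in zip(x, x_prime):
        # Two linear equations per correspondence in the nine elements of H.
        rows.append([u, v, 1, 0, 0, 0, -up * u, -up * v, -up])
        rows.append([0, 0, 0, u, v, 1, -vp * u, -vp * v, -vp])
    a = np.asarray(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(a)
    h = vt[-1].reshape(3, 3)
    return h / np.linalg.norm(h)

# Synthetic check with a known homography and six points.
h_true = np.array([[1.0, 0.02, 0.1],
                   [-0.01, 0.98, -0.05],
                   [0.03, -0.02, 1.0]])
pts = np.array([[-0.3, -0.2], [0.4, -0.25], [0.35, 0.3],
                [-0.25, 0.2], [0.05, 0.0], [0.1, -0.15]])
mapped = (h_true @ np.column_stack([pts, np.ones(len(pts))]).T).T
pts_prime = mapped[:, :2] / mapped[:, 2:3]
h_est = dlt_homography(pts, pts_prime)
h_est = h_est / h_est[2, 2]   # remove the unknown scale for comparison
```

Since H is only defined up to scale, the estimate is compared with the true matrix after dividing by the lower right element.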
We assume the inner camera parameters to be known and the 2-
d coordinates to be normalized such that the focal length equals
one and the origin is at the principal point. Shifting the principal
point has no major impact on the precision of the estimations.
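But the influence of the focal length f is considerable: Either we
may do the estimation first and transform to normalized
coordinates afterwards using
$$\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} = \begin{pmatrix} \hat h_{11} & \hat h_{12} & \hat h_{13}/f \\ \hat h_{21} & \hat h_{22} & \hat h_{23}/f \\ f \hat h_{31} & f \hat h_{32} & \hat h_{33} \end{pmatrix} \qquad (1)$$
where $\hat h_{ij}$ denotes the elements estimated in unnormalized image coordinates.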
Then we will enlarge errors on the projective elements $h_{31}$ and
$h_{32}$ by the factor f and diminish errors on the
translation elements $h_{13}$ and $h_{23}$ by the same factor.
Or we may do the transform on the image coordinates x and x',
dividing their first two components by f, and go into
the DLT system with these smaller entries. This has a similar
effect: The equation system will not be balanced. Entries
belonging to the unknowns $h_{11}$, $h_{12}$, $h_{21}$, $h_{22}$ in the affine
section will be smaller than the entries for the unknown
translation elements $h_{13}$ and $h_{23}$ by approximately the
factor f, and for the unknown projective elements $h_{31}$ and $h_{32}$
there will be very small entries (factor $f^2$).
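The scaling of errors under the first option can be checked numerically. The sketch below assumes that normalization amounts to conjugating the pixel-based homography with K = diag(f, f, 1), i.e. principal point at the origin; the focal length and matrix entries are hypothetical values chosen for illustration only.

```python
import numpy as np

def to_normalized(h_pixel, f):
    """Conjugate a pixel-based homography with K = diag(f, f, 1)."""
    k = np.diag([f, f, 1.0])
    return np.linalg.inv(k) @ h_pixel @ k

f = 1000.0  # hypothetical focal length in pixels
h_pixel = np.array([[1.0, 0.01, 20.0],
                    [-0.01, 1.0, -15.0],
                    [1e-5, -2e-5, 1.0]])
h_norm = to_normalized(h_pixel, f)

# Add the same small error to every pixel-based element and see where it ends up.
error_in = np.full((3, 3), 1e-6)
error_out = to_normalized(h_pixel + error_in, f) - h_norm
```

With these values the error on the projective element in the last row grows from 1e-6 to about 1e-3, the error on the translation element in the last column shrinks to about 1e-9, and the affine block is unchanged.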
2.4 Decomposition of Homographies
Given an estimate for the normalized planar homography H we
can reconstruct the pose parameters using the decomposition
$H = R + t\,n^{\top}$, (2)
where R is an orthogonal rotation matrix, t is the translation of
the camera and n is the surface normal (Faugeras, 1995). This
representation sets the origin of the 3-d system into the centre
of the second camera. R contains three degrees of freedom that
may be extracted as successive rotation angles or as a normalized
axis in 3-d and a turning angle around it. The vectors n and t
together contain five degrees of freedom because n will be
normalized, setting the distance of the second camera to the
plane to one, while t is a 3-d translation.
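As an illustration of the second parametrization of R, the following sketch extracts a unit axis and a turning angle from a rotation matrix. It uses the standard trace and skew-part formulas and is only a sketch: it is not reliable for angles close to 180 degrees, where the skew part vanishes. The function name is ours.

```python
import numpy as np

def rotation_to_axis_angle(r):
    """Return (unit axis, angle in radians) of a 3x3 rotation matrix."""
    angle = np.arccos(np.clip((np.trace(r) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(angle, 0.0):
        # No rotation: the axis is arbitrary, return the z-axis by convention.
        return np.array([0.0, 0.0, 1.0]), 0.0
    # The skew-symmetric part of R holds the axis scaled by 2 sin(angle).
    axis = np.array([r[2, 1] - r[1, 2], r[0, 2] - r[2, 0], r[1, 0] - r[0, 1]])
    return axis / (2.0 * np.sin(angle)), float(angle)

# Example: rotation by 0.3 rad around the z-axis.
c, s = np.cos(0.3), np.sin(0.3)
r_example = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
axis, angle = rotation_to_axis_angle(r_example)
```

The unit axis carries two degrees of freedom and the angle one, which gives the three degrees of freedom of R mentioned above.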
The absolute scale cannot be determined from the image
sequence alone. This requires additional information e.g. from
an altimeter or from a speed sensor.
In rural areas the plane will be a good approximation for the
ground plane. In urban areas most visible structure will result
from the roofs. So the plane will be at average roof height over
ground. The vector n will still be a good approximation to the
zenith direction. We will not get information on the north
direction from the images unless we rely on shadow and
daytime analysis. There will be no geo-reference from the
images as long as we have not recognized or matched objects
from the images to map objects.
We assume sufficient movement of the aircraft. This is
important because the decomposition of homographies needs to
distinguish the translation-free case from mappings with
translated cameras.
The rotation free case: Often the camera will be mounted on a
stabilized platform or the camera rotation will be measured by
an inertial device giving much more precision than the
estimation from the camera may yield. This known rotation
may be applied as a homography to the coordinates of the first
image and then we may assume R to be the identity. Then the
homography is restricted to be a central collineation with real
eigenvalues, which is either a planar homology or an elation
(Beutelsbacher; Rosenbaum, 1998). Considering the homology
case first, we may scale H such that the double eigenvalue
equals one. The corresponding 2-d eigenspace is the horizon
line. This is a straight line of fixed points (the image of the
intersection of the plane n with the plane at infinity) and n also
gives its Hessian normal form. The other eigenspace is 1-d and
gives the epipole and translation t. The eigenvalue
corresponding t
normalized we ¢
This solution is u
The eigenvalue
correspondences
complex eigenv
rotation free c:
estimation. A hoi
the estimated hor
The elation cas:
elation if the epi
not an exceptic
manoeuvres (kee;
value with a c
horizon line). Th
eigen-spaces of
from pairs of €
correspondence :
pairs that are |
estimation (see S
the influence of
estimation / and s
becomes linear ı
unknown scaling
Hzl- ge
Dividing this eq
equations in the
singular value de
for all central col
homologies. It n
well.
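For the homology case of the rotation free situation, the decomposition can be sketched as follows. The sketch assumes that with R equal to the identity the normalized homography has the form $H \simeq I + t\,n^{\top}$ with unit normal n, and that the translation has a component along n, so that the third eigenvalue differs from the double one. It is an illustration for noise-free input, not the linear estimation described above, and the function name is ours.

```python
import numpy as np

def decompose_rotation_free(h):
    """Split H ~ I + t n^T into (t, n) with unit n; (t, n) and (-t, -n) are equivalent."""
    ev = np.sort(np.linalg.eigvals(h).real)
    # The double eigenvalue belongs to the closer pair of the three eigenvalues.
    if ev[1] - ev[0] < ev[2] - ev[1]:
        double = ev[:2].mean()
    else:
        double = ev[1:].mean()
    # After scaling, H - I is ideally the rank-one matrix t n^T.
    m = h / double - np.eye(3)
    u, s, vt = np.linalg.svd(m)
    n = vt[0]
    t = s[0] * u[:, 0]
    return t, n

# Synthetic check with an arbitrary overall scale factor of 2.5.
t_true = np.array([0.2, -0.1, 0.05])
n_true = np.array([0.1, -0.2, 1.0])
n_true = n_true / np.linalg.norm(n_true)
h_example = 2.5 * (np.eye(3) + np.outer(t_true, n_true))
t_est, n_est = decompose_rotation_free(h_example)
```

Because the common sign of t and n is not determined by H alone, the result is best compared through the product $t\,n^{\top}$. If t is perpendicular to n, all three eigenvalues coincide (the elation case) and this selection of the double eigenvalue no longer applies.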
With rotation:
homography is tr
value decomposit
V. Equation (2) is
t=Ut’ and n=Vn'.
Assuming the sn
matrix HA to be si
restricted to the Y
Zero:
h, 0 0 7 (
9 305
0-30. |
where all four ci
permitted. Transfc
obtain four solutio
A critical situation
the value in one
singular values ar
parallel i.e. the cre