Minimizing the residual sum of squares RSS of the intensity differences
\[
\mathrm{RSS} = \sum_{x,y \in W} \bigl( I_1 - h[I_2] \bigr)^2 \;\rightarrow\; \min
\]
of the two image windows leads to the least squares procedure of area based matching. The linearized model
(2) describes one row of the observation equation referring to one pixel $(x_i, y_i)$ of the matching window. Taking all pixels of the matching window into account the normal equations are calculated and solved. The estimates $\widehat{da}_i$ are the corrections for the approximate values of the geometric and radiometric parameters. With the corrected approximate values $a_i + \widehat{da}_i$ a new linearization is determined and the iteration sequence proceeds
until convergence is reached.
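A minimal NumPy sketch of the Gauss-Newton iteration just described may help to fix the procedure. The helper `linearize`, assumed to return the design matrix and the residual vector of the linearized model (2) for the current approximate values, as well as the tolerance and the iteration limit, are illustrative assumptions and not part of the original description.

```python
import numpy as np

def lsq_matching(a0, linearize, max_iter=50, tol=1e-4):
    """Iterative least squares matching (Gauss-Newton).

    a0        : approximate values of the geometric and radiometric parameters
    linearize : assumed helper returning (A, l): design matrix and residual
                vector of the linearized model, one row per pixel of the window
    """
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        A, l = linearize(a)
        N = A.T @ A                       # normal equation matrix
        da = np.linalg.solve(N, A.T @ l)  # corrections to the approximate values
        a = a + da                        # corrected approximate values
        if np.max(np.abs(da)) < tol:      # convergence reached
            break
    return a
```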
For the generalization of the area based matching model of intensity images to multichannel images we use vector-valued image functions. Image $I_1$ is an $N$-channel image, for example an RGB image with the channels $I_1^R$, $I_1^G$, $I_1^B$. The same number and type of channels is assumed for image $I_2$. We define the nonlinear model
for multichannel matching by
\[
I_1(x,y) = h[I_2(p, q)] + n(x,y) \qquad (3)
\]
with
\[
\begin{aligned}
p(x,y) &= a_0 + a_1 x + a_2 y \\
q(x,y) &= a_3 + a_4 x + a_5 y \\
h(I_2^j) &= a_{6+2j}\, I_2^j + a_{7+2j} \,.
\end{aligned}
\]
Again the affine transformation for mapping between the two image patches is used, now with one common set of geometric parameters for all channels $j = 0, \ldots, N-1$. The radiometric adjustment is modelled individually for each channel, thus the number of parameters increases from $6 + 2$ in the classical case to $6 + 2N$ in the multichannel case. The noise for all pixels and all channels is assumed to have zero mean and variance $\sigma_{n_j}^2 = \sigma_n^2$.
Correlated noise between the channels or between the pixels is not taken into account even though this is an
oversimplification. The iteration procedure then is identical to the one sketched above for the single channel
case. In the implementation of the algorithm the design elements of the linearized model are derived from
both images. Basically this implies averaging of $I_1^j$ and $I_2^j$ and of the gradient images $\nabla I_1^j$ and $\nabla I_2^j$, taking the
geometric and radiometric transformations into account.
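The parameter layout of the multichannel model (3), six common geometric parameters followed by two radiometric parameters per channel, can be made concrete with a small sketch of one row of the linearized design matrix for a pixel of channel $j$. This is only an illustration under stated assumptions: the function name and argument list are hypothetical, `gx` and `gy` are taken to be the averaged gradients of channel $j$ of both patches, and the current gain estimate is approximated by one.

```python
import numpy as np

def design_row(x, y, gx, gy, I2_val, j, n_channels):
    """One row of the linearized model for pixel (x, y) of channel j.

    Parameter vector: [a0 .. a5 | gain_0, offset_0, ..., gain_{N-1}, offset_{N-1}],
    i.e. 6 + 2N unknowns as in the multichannel model (3).

    gx, gy  : averaged gradients of channel j of both image patches
    I2_val  : resampled intensity I2^j(p, q) at the current approximate values
    """
    row = np.zeros(6 + 2 * n_channels)
    # geometric part, common to all channels; the current gain estimate
    # a_{6+2j} (here taken as ~1) would otherwise scale these entries
    row[0:6] = [gx, gx * x, gx * y, gy, gy * x, gy * y]
    # radiometric part, individual for channel j
    row[6 + 2 * j] = I2_val   # partial derivative w.r.t. the gain a_{6+2j}
    row[7 + 2 * j] = 1.0      # partial derivative w.r.t. the offset a_{7+2j}
    return row
```

Stacking these rows for all pixels and all channels gives the design matrix of the normal equations, with the residuals $I_1^j(x,y) - h[I_2^j(p,q)]$ on the right hand side.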
The theoretical precision of multichannel matching follows from a simple generalization of the single channel
case. For simplicity of the argumentation we restrict the image-to-image transformation of (3) to a shift between
the windows of concern. Then the normal equations are given by
\[
\begin{pmatrix}
\sum_j \sum_{x,y} (I_x^j)^2 & \sum_j \sum_{x,y} I_x^j I_y^j \\
\sum_j \sum_{x,y} I_x^j I_y^j & \sum_j \sum_{x,y} (I_y^j)^2
\end{pmatrix}
\begin{pmatrix} \widehat{dx} \\ \widehat{dy} \end{pmatrix}
=
\begin{pmatrix}
\sum_j \sum_{x,y} I_x^j \,\Delta I^j \\
\sum_j \sum_{x,y} I_y^j \,\Delta I^j
\end{pmatrix}
\qquad (4)
\]
with $I_x^j$, $I_y^j$ the gradients and $\Delta I^j$ the intensity difference of channel $j$.
By multiplying the inverse of the normal equation matrix with the estimated noise variance $\hat{\sigma}_n^2 = \mathrm{RSS}/(M-8)$ (resp. $M - 6 - 2N$ in the multichannel case proposed above) the covariance matrix is obtained. The precision of
the matching depends mainly on three factors (assuming a small covariance between the two shift parameters):
The number of pixels in the matched window, the texture measured by the mean squared gradient in the
window and the noise variance (Förstner, 1982). Essentially the dependency can be directly observed from (4).
The sum over all channels in equation (4) indicates that each additional channel theoretically contributes to improving the precision. If the gradients in one of the channels dominate, then a multichannel solution close to the single channel solution of the dominant channel can be expected. With low texture in one of the channels the matching relying on this channel alone might not be successful. The theoretical superiority of the simultaneous multichannel solution becomes obvious in this case, because the information from the other two channels should lead to a successful matching.
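To illustrate equation (4) and the precision argument, the following NumPy sketch accumulates the normal equation matrix over all channels and pixels for the shift-only case, solves for the shift and derives its covariance from the estimated noise variance and the inverse normal matrix. Function and variable names are assumptions for illustration; with only the two shift parameters as unknowns the redundancy is $M - 2$ instead of $M - 6 - 2N$.

```python
import numpy as np

def shift_and_precision(grad_x, grad_y, dI):
    """Shift-only multichannel matching, cf. equation (4).

    grad_x, grad_y : image gradients, shape (n_channels, rows, cols)
    dI             : intensity differences of the two patches, same shape
    Returns the estimated shift and its covariance matrix.
    """
    # normal equation matrix: sums run over all channels j and all pixels (x, y)
    N = np.array([[np.sum(grad_x * grad_x), np.sum(grad_x * grad_y)],
                  [np.sum(grad_x * grad_y), np.sum(grad_y * grad_y)]])
    rhs = np.array([np.sum(grad_x * dI), np.sum(grad_y * dI)])

    shift = np.linalg.solve(N, rhs)          # estimated (dx, dy)

    # estimated noise variance from the residual sum of squares
    M = dI.size
    rss = np.sum(dI ** 2) - rhs @ shift
    sigma2 = rss / (M - 2)

    cov = sigma2 * np.linalg.inv(N)          # covariance of the estimated shift
    return shift, cov
```

A channel with low texture contributes little to the normal matrix but does not harm the solution, whereas the gradients of a dominant channel largely determine the result, mirroring the argument made above.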
In this sense the proposed algorithm fulfills the demand of enlarging the peephole to the multispectral domain.
Expanding the spatial, spectral, temporal, and contextual peephole will lead to more competent computer vision
systems (Strat, 1994).