Figure 7: The result of the proposed algorithm on the input sequence shown in figure 6. Image (a) contains the pixels that received the most votes. Image (b) shows the pixels that received the fewest votes and therefore contains all the occlusions combined. In image (c), pixels for which no unique decision could be made are marked white.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. XXXV, Part B5. Istanbul 2004
Figure 9: The result of the image fusion with the input images shown in figure 8. The bottom picture shows the final rectification onto the facade's rectangle. The statue has completely disappeared and most of the pedestrians were suppressed. Note the improvement over each of the rectified images in figure 8.
exactly the same location in the images when all images are taken from a single station. To reveal the occluded part of the facade, additional viewpoints have to be included. The top row of figure 8 shows a sequence of three images taken from three different stations. The images depict the facade of a historic building, which is occluded by moving pedestrians in its lower portion. Additionally, it is occluded by a stationary statue that extends across the full height of the facade. Obviously, overlaying these images will not yield a per-pixel registered image stack. But since the background we are interested in is a facade, which is assumed to be planar, a perspective transformation can warp the images into a common projection.
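As a minimal illustration (not code from the paper), such a planar perspective transformation is a 3×3 homography applied in homogeneous coordinates; the matrix below is hypothetical, chosen only to show the projective division:

```python
import numpy as np

def warp_point(H, x, y):
    """Apply a 3x3 homography in homogeneous coordinates and
    divide by the third component to get image coordinates."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Hypothetical homography for illustration; in the setting above, H
# would map one view of the planar facade into the common projection.
H = np.array([[1.0, 0.1,   5.0],
              [0.0, 1.2,   3.0],
              [0.0, 0.001, 1.0]])
```

With the identity matrix the transform leaves points unchanged, which is a convenient sanity check.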
One possibility to derive a proper transformation is to determine the final rectification of each input image individually. To achieve this, it is necessary to mark the four corner points of the rectangular portion of the facade. The second row of figure 8 shows the result of such a rectification. We can observe how the statue seems to move across the image. This example makes it obvious that we have successfully transformed the problem of a static object into the problem of a moving object, which we solved earlier.
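Assuming the four marked corner points and a target rectangle size are given, the rectifying homography can be estimated with the standard direct linear transform (DLT); the corner coordinates below are made up for illustration and are not measurements from the paper:

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping src -> dst from four
    point correspondences via the direct linear transform (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The homography is the null vector of A (last right singular vector).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Hypothetical corner measurements of the facade rectangle in one image,
# mapped to an axis-aligned target rectangle of the chosen output size.
corners = [(102, 80), (590, 95), (605, 470), (95, 455)]
target = [(0, 0), (400, 0), (400, 300), (0, 300)]
H = homography_from_points(corners, target)
```

With exactly four correspondences the system has an exact solution, so each marked corner lands precisely on its target corner; with imprecise measurements, as noted below, this exactness is misleading rather than helpful.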
However, marking the four corner points of the facade is a non-optimal choice, since the image measurements are imprecise. Furthermore, it cannot be guaranteed that all corner points are visible in every image. An alternative way to compute the transformation is to warp the images to a plane other than the facade. In fact, any plane could be used. A proper choice is to warp the images to the same view as an arbitrarily selected key frame. In effect, this transforms the images to the plane of the image sensor of the selected key frame.
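Given a homography H that maps key-frame pixel coordinates into one of the input images (however it was estimated), the warp itself can be sketched as inverse mapping with nearest-neighbour resampling. This is a simplified single-channel version for illustration, not the paper's implementation:

```python
import numpy as np

def warp_to_keyframe(img, H, out_shape):
    """Resample a grayscale image into the key frame's pixel grid by
    inverse mapping (H maps key-frame coordinates into img)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = H @ pts
    # Nearest-neighbour lookup of the source pixel for every output pixel.
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros((h, w), dtype=img.dtype)
    out.ravel()[valid] = img[sy[valid], sx[valid]]
    return out
```

Pixels whose source location falls outside the input image are left at zero; a production warp would additionally interpolate rather than pick the nearest pixel.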
To compute this transformation four corresponding points have
to be measured for each input image. The bottom row of figure
8 shows the three images warped to the view of the last image
of the sequence. These images form a per-pixel registered image
stack and can therefore be processed with the method we have
introduced in section 3. The result of the method is shown in
figure 9. The statue has completely disappeared and the facade
is free of most occlusions. Slight inaccuracies, visible as blurring, are caused by the non-planarity of the facade.
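The actual voting scheme is the one described in section 3; as a hedged stand-in, a per-pixel median over the registered stack already suppresses any occluder that covers a given pixel in fewer than half of the images:

```python
import numpy as np

def fuse_stack(stack):
    """Per-pixel fusion of a registered image stack of shape
    (n_images, height, width). The median is a simple stand-in for
    the voting scheme of section 3: it keeps, per pixel, the value
    seen by the majority of the images."""
    return np.median(stack, axis=0).astype(stack.dtype)
```

Each moving pedestrian (and, after warping to the key frame, the statue) occupies any particular pixel in only a minority of the images, so the majority value at that pixel is the facade.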