OBJECT EXTRACTION FOR DIGITAL PHOTOGRAMMETRIC WORKSTATIONS
Helmut Mayer
Institute for Photogrammetry and Cartography, Bundeswehr University Munich, D-85577 Neubiberg, Germany
Helmut.Mayer@UniBw-Muenchen.de
KEY WORDS: Object, Extraction, Digital, Photogrammetry, Semi-automation, Modeling, Statistics, Test
ABSTRACT
This paper deals with the state of, and promising directions for, automated object extraction for digital photogrammetric
workstations (DPW). A review of the state of the art shows that there are only a few success stories. Therefore, areas
important for practical success are identified. A solid and, most importantly, powerful theoretical background is the basis.
Here, we particularly advocate statistical modeling. Testing makes clear which approaches are best suited and how useful
they are in practice. A key to commercial success is user interaction, an area where much work still has to be done. As the
means of data acquisition are changing, promising new application areas evolve, such as extremely detailed
three-dimensional (3D) urban models for virtual television or mission rehearsal.
1 INTRODUCTION
Digital photogrammetric workstations (DPW) (Heipke,
1995) were introduced to the market on a larger scale
in the mid to late 1990s and have become the
standard for photogrammetric processing. While tasks
with high redundancy, such as orientation, have reached
a high degree of automation and robustness, this is only
partially the case where the redundancy is lower, such
as in the generation of digital surface models (DSM) or
digital elevation models (DEM). For the latter two,
laser scanning has become an attractive alternative.
While it took a few decades to highly automate
the above tasks, the situation is much more difficult for
automated object extraction. Only a few (semi-)automated
systems are used successfully in the market.
(Baltsavias, 2004) cites most prominently the building
extraction systems InJect of INPHO GmbH (Gülch et al.,
1999) and CC-Modeler of CyberCity AG (Grün and Wang,
2001). Additionally, the systems for road update and
verification ATOMIR (Zhang, 2004) and WIPKA-QS (Gerke
et al., 2004) are on the verge of becoming operational.
This paper addresses reasons for this deficit, but also points
to issues we think are important for improving the situation
and introducing object extraction on a larger scale in
practical applications.
Legend has it that in the 1950s scientists in the field of
artificial intelligence thought that solving the vision
problem was a matter of a graduate student project. This
estimate then shifted from five years to twenty years and
then to much, much longer. Today, there is a large body
of knowledge in fields as diverse as psychology
(Kosslyn, 1994) and the use of geometry in computer
vision (Hartley and Zisserman, 2000), but we might still be
only at the beginning of understanding the basic problems.
There is still progress not only in high-level
understanding, i.e., interpretation, but also in the basic
understanding of the image function. For example, (Köthe, 2003)
has shown that the well-known operator of (Förstner and Gülch,
1987) does not take into account the frequency doubling
implicit in the squaring of the Hessian matrix (some
people also call it the structure tensor). The SIFT
operator of (Lowe, 2004) offers scale- and rotation-invariant
features which can be robustly matched under affine
distortion, noise, and illumination changes. (Pollefeys and
Van Gool, 1999, Pollefeys et al., 2002) have shown that
it is possible to reconstruct fully automatically the pose
and calibration of images from a camera about which the only
thing known is that it is perspective. They also
demonstrated the importance of redundancy in matching, an
issue recently promoted by (Gruber et al., 2003) for
robust DSM / DTM generation by means of digital aerial
cameras. In (Nistér, 2003) a direct solution for the
five-point relative orientation problem is given. Finally, the
test of (Scharstein and Szeliski, 2002) on stereo matching
has sparked a large number of new approaches for
matching, using, e.g., the powerful graph cut technique
(Kolmogorov and Zabih, 2001) or cooperative disparity
estimation (Mayer, 2003).
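To make the operator discussed above more concrete, the following is a minimal sketch of the two measures commonly associated with the operator of (Förstner and Gülch, 1987), computed from the matrix of locally averaged gradient products that the text refers to as the structure tensor. It is an illustration under stated assumptions, not the original implementation: the function names, the central-difference gradients, and the box window (used here in place of a Gaussian) are our own choices, and the sketch does not include the correction for frequency doubling proposed by (Köthe, 2003).

```python
import numpy as np


def box_smooth(a, radius=1):
    """Average over a (2*radius+1)^2 window with edge padding.

    Assumption: a box window stands in for the Gaussian usually used.
    """
    k = 2 * radius + 1
    padded = np.pad(a, radius, mode="edge")
    out = np.zeros_like(a, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (k * k)


def structure_tensor_measures(img, radius=1):
    """Return interest weight w = det/trace and roundness q = 4*det/trace^2."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)  # derivatives along rows (y) and columns (x)
    # Products of first derivatives, averaged over the local window.
    jxx = box_smooth(gx * gx, radius)
    jyy = box_smooth(gy * gy, radius)
    jxy = box_smooth(gx * gy, radius)
    det = jxx * jyy - jxy ** 2
    trace = jxx + jyy
    eps = 1e-12  # avoids division by zero in flat regions
    w = det / (trace + eps)
    q = 4.0 * det / (trace ** 2 + eps)
    return w, q
```

On a synthetic image with a single step corner, w is large and q is close to 1 at the corner, while both are close to 0 along a straight edge and in homogeneous areas. The squaring of the gradient in `gx * gx` and `gy * gy` is the step in which the frequency doubling mentioned above arises.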
This paper rests on a recent survey (Baltsavias, 2004)
which summarizes important points for the practical use
of object extraction. We will not repeat the contents
of this survey, but rather deepen some points, while
giving enough of an overview of the area to make this paper
self-contained. We are mainly concerned with aerial imagery
and laser-scanner data. Although focusing on these
two sources, we also deal with satellite imagery and other
data such as hyperspectral data or terrestrial video
sequences and laser-scanner data. To limit the scope, we do
not consider radar data.
The prerequisite for productive object extraction is
modeling (cf. Section 2), which in our case also comprises the
strategy, data sources including data from geographic
information systems (GIS), statistics with and without
geometry, and learning. While a lot of basic scientific work
ends at this point, there is a recent tendency to evaluate the
performance of the approaches in different tests, possibly
also in comparison with each other. We think that testing
as presented in Section 3 is a must for bringing object
extraction into practice, as it not only makes clear which
models and strategies are superior to others, but it also shows
what is possible with object extraction. After a success-