A 3-D MODEL EXTRACTION SYSTEM
Robert L. Russell, Richard A. McClain, and B. Patrick Landell
GE Aerospace Advanced Technology Laboratories
Moorestown, NJ 08057
ABSTRACT
This paper describes PolyFit, a 3-D feature extraction system that allows a user to interactively extract
three-dimensional models from photographs with little a priori information. PolyFit's algorithm for simultaneously
determining the camera parameters and scene geometry is a nonlinear least squares optimization. The computed
geometry and camera information enable photographic texture extraction from the source imagery and subsequent
rendering of the scene geometry from arbitrary view points. The PolyFit user interface provides tools which
streamline the model building process as well as means for model inspection and exploitation. PolyFit has been
shown to provide a 10 to 1 productivity improvement over previous manual methods.
Keywords: Computational geometry, 3D Feature Extraction, computer image generator (CIG), photo-texture
extraction, camera modeling, object and scene modeling, mensuration.
1. INTRODUCTION
Training systems based on computer image
generation are challenged today to provide the most
photo-realistic renditions of real-life environments [1].
The systems that provide high-speed photo-realistic
rendering require accurate models of the
world accompanied by precisely registered
photo-texture. The cost, however, of developing the
databases required for this realism can be
staggering. This high cost is a direct result of the
time and manpower currently needed to generate a
database of any significant scale [2,4]. This long
database construction time also limits the use of
simulation systems for applications such as mission
planning or rehearsal because timely use of recent
photo-reconnaissance imagery is not generally
achievable [3].
One aspect of database development which is
particularly tedious is the modeling of architecture
within the gaming area. In the past this has been
accomplished by manual photointerpretation
techniques. Modelers would attempt to extract the
geometry of a given building by trial and error using
the available imagery as reference and inputting the
computer description manually. Scale would be
estimated from visual cues such as the height of a
doorway or the length of a recognized vehicle. For
buildings exhibiting simple geometry this technique
worked well enough. However, as the complexity of
the building increased, the accuracy of the model
decreased. Imagine trying to extract the angle
between two edges (other than 90 degrees) in a 2-D
image taken from an oblique perspective.
Furthermore, placing the building in the database
would require more trial and error by the modeler to
determine the relative location of this building with
respect to its neighboring buildings.
A secondary problem with previous manual
approaches is that the resulting models are not
registered to the imagery. Thus, to extract
photographic texture from the image, the computed
object wireframe would be interactively manipulated
to approximate the orientation, position and scale of
the object within the image, a very time-consuming
process.
Previous approaches to speeding this object
modeling process have often made the assumption
that the available imagery already has associated
camera model information registered to terrain
elevation data. These approaches then attack the
problem by letting the operator place an object in the
world by manipulating a wireframe over the image of
the object. The camera model is used to determine
the object's scale, though A. Hanson et al. [2] also
used solar illumination geometry to better determine
object height for near nadir views. These previous
approaches are limited because: 1) they can only
handle simple geometries, 2) they rely on the
existence of supplementary data and 3) they don't
optimally fuse the information from multiple images in
a single 'best fit' of the object. This fusion aspect will
be further explained in later sections.
2. POLYFIT OVERVIEW
The system described in this paper has been
designed to overcome the above mentioned
limitations. This system, referred to as PolyFit,
extracts complex 3-D models from single or multiple
photos with little a priori information. Image camera
models, if not provided, are computed along with the
object models. If maps or control points for the
images are available, PolyFit can locate the models
precisely in the world; otherwise, the database is
defined relative to a user-definable local coordinate
system. Furthermore, PolyFit achieves high
accuracy by fusing the information from all sources
into one best fit solution. The PolyFit solver uses a
constraint elimination procedure and the Gauss-
Newton algorithm to solve the constrained nonlinear
optimization. Upon solution the 3-D models are
registered within the imagery allowing the photo-
texture to be easily extracted and orthorectified for
convenient access by the CIG. PolyFit's own
rendering capability allows: 1) verification of
geometry, 2) inspection of photo-texture and 3)
examination of the model from arbitrary vantage
points.
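
To make the solver description above more concrete, the following Python sketch shows the general pattern of constraint elimination followed by Gauss-Newton iteration on a deliberately tiny problem. It is an illustration under stated assumptions, not PolyFit's implementation: the cameras are assumed known (PolyFit also solves for them), the pinhole model with cameras looking along +Y is a simplification, and all names (project, residuals_and_jacobian, gauss_newton, the camera dictionaries) are hypothetical. A single vertex is assumed to lie on the ground plane z = 0; that constraint is eliminated by substitution, leaving only (x, y) as unknowns, and the observations from two images are fused in one least squares fit.

```python
def project(point, cam):
    """Assumed pinhole model: camera at cam['c'] looks along +Y, Z is up."""
    x, y, z = point
    cx, cy, cz = cam["c"]
    depth = y - cy
    return cam["f"] * (x - cx) / depth, cam["f"] * (z - cz) / depth


def residuals_and_jacobian(params, cams, observations):
    """Stack reprojection residuals and their Jacobian over all images.

    The constraint z = 0 has been eliminated, so params is just (x, y).
    """
    x, y = params
    r, jac = [], []
    for cam, (u_obs, v_obs) in zip(cams, observations):
        cx, cy, cz = cam["c"]
        f = cam["f"]
        depth = y - cy
        u, v = project((x, y, 0.0), cam)
        r += [u - u_obs, v - v_obs]
        # Analytic partial derivatives of (u, v) with respect to (x, y).
        jac += [[f / depth, -f * (x - cx) / depth**2],
                [0.0, f * cz / depth**2]]
    return r, jac


def gauss_newton(params, cams, observations, iters=20, tol=1e-12):
    """Iterate x <- x - (J^T J)^-1 J^T r until the step is negligible."""
    x, y = params
    for _ in range(iters):
        r, jac = residuals_and_jacobian((x, y), cams, observations)
        # Build the 2x2 normal equations (J^T J) delta = -J^T r.
        a = sum(row[0] * row[0] for row in jac)
        b = sum(row[0] * row[1] for row in jac)
        c = sum(row[1] * row[1] for row in jac)
        g0 = sum(row[0] * ri for row, ri in zip(jac, r))
        g1 = sum(row[1] * ri for row, ri in zip(jac, r))
        det = a * c - b * b
        dx = -(c * g0 - b * g1) / det
        dy = -(a * g1 - b * g0) / det
        x, y = x + dx, y + dy
        if dx * dx + dy * dy < tol:
            break
    return x, y


# Two hypothetical cameras 1.5 units above the ground, 4 units apart.
cams = [{"c": (0.0, 0.0, 1.5), "f": 1.0},
        {"c": (4.0, 0.0, 1.5), "f": 1.0}]
true_point = (1.0, 10.0, 0.0)
# Synthetic, noise-free image measurements of the vertex.
observations = [project(true_point, cam) for cam in cams]

estimate = gauss_newton((0.0, 5.0), cams, observations)
print("estimated ground position:", estimate)
```

Eliminating the constraint by substitution keeps the iteration an ordinary unconstrained Gauss-Newton loop over fewer unknowns. With real, noisy measurements the residuals would not reach zero, and the result would be the best fit across all images rather than an exact intersection.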