4. DETAILS OF THE APPROACH
4.1 Image-Based Modeling
The approach is intended mainly for man-made objects such as
classical architecture, which is designed within constraints of
proportion and configuration. Classical buildings are divided
into architectural elements. These elements are logically
organized in space to produce a coherent work. There is a
logical hierarchical relation among building parts and between
parts and whole. The most common scheme divides the
building by two sets of lines forming a rectangular grid
[Tzonis and Lefaivre, 1986]. The distances between the grid
lines are often equal or, when they vary, they vary in a regular pattern. The
grid lines are then turned into planes that partition the space and
control the placement of the architectural elements. The
automation of 3D reconstruction is better achieved when such
understanding is taken into account. We reconstruct the
architectural elements from a minimal number of points and put
them together using the planes of a regular grid. Other schemes,
such as a polar grid, also exist but the basic idea can be applied
there too. Classical architecture can be reconstructed, knowing
its components, even if only a fragment survives or is visible in the
images. For example, a columnar element consists of: 1) the
capital, a horizontal member on top, 2) the column itself, a long
vertical tapered cylinder, 3) a pedestal or a base on which the
column rests. Each of those can be further divided into smaller
elements. In addition to columns, other elements include pillars,
pilasters, banisters, windows, doors, arches, and niches. Each
can be reconstructed with a few seed points from which the rest
of the element is built.
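The idea of building an element from a few seed values and placing copies on a regular grid can be sketched as follows. This is only an illustrative sketch, not the system described here: the function names (tapered_column, grid_positions), the parameterization of a column shaft by a base centre, a height and two radii, and all numeric values are assumptions made for the example.

```python
import math

def tapered_column(base_center, height, r_base, r_top, n_sides=16, n_rings=2):
    """Vertex rings of a tapered cylinder (column shaft); needs n_rings >= 2."""
    cx, cy, cz = base_center
    rings = []
    for i in range(n_rings):
        t = i / (n_rings - 1)                # 0 at the base, 1 at the top
        r = r_base + t * (r_top - r_base)    # linear taper (an assumption)
        z = cz + t * height
        ring = [(cx + r * math.cos(2 * math.pi * k / n_sides),
                 cy + r * math.sin(2 * math.pi * k / n_sides),
                 z)
                for k in range(n_sides)]
        rings.append(ring)
    return rings

def grid_positions(origin, spacing_x, spacing_y, nx, ny):
    """Intersections of two sets of equally spaced grid lines."""
    ox, oy, oz = origin
    return [(ox + i * spacing_x, oy + j * spacing_y, oz)
            for j in range(ny) for i in range(nx)]

# Hypothetical colonnade: four identical shafts placed along one grid line.
colonnade = [tapered_column(p, height=6.0, r_base=0.5, r_top=0.4)
             for p in grid_positions((0.0, 0.0, 0.0), 3.0, 3.0, nx=4, ny=1)]
```

Only the base position changes between copies, which is what makes the grid useful: once one element is reconstructed, the remaining ones need little more than their grid location.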
Our approach is photogrammetry-based. It aims neither to be
fully automated nor to rely completely on a human
operator. It provides enough automation to assist the
operator without sacrificing accuracy or level of detail. Figure
2 summarizes the procedure and indicates which step is
interactive and which is automatic (interactive operations are
light gray). The figure also shows the option of taking a closely
spaced sequence of images, if conditions allow, to increase the
level of automation. Here, we discuss only the option of
widely separated views. Images are taken, all with the same
camera setup, from positions where the object is clearly
visible. There should be a reasonable distance, or baseline,
between the images. Several features appearing in multiple
images are interactively extracted, usually 12-15 per image. The
user points to a corner and labels it with a unique number, and
the system then accurately extracts the corner point. The Harris
operator is used [Harris, 1998] for its simplicity and efficiency.
Image registration and 3D coordinate computation are based on
photogrammetric bundle adjustment for its accuracy, flexibility,
and effectiveness compared to other structure-from-motion
techniques [Triggs et al, 2000]. Advances in bundle adjustment
have eliminated the need for control points or for manually entering
initial approximate coordinates. Many other aspects required for
high accuracy, such as camera calibration with full distortion
correction, have long been solved problems in photogrammetry
and will not be discussed in this paper.
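The interactive corner extraction can be illustrated with a minimal sketch of the Harris response, R = det(M) - k trace(M)^2, followed by snapping a user click to the strongest nearby response. The sketch makes several simplifying assumptions that are not taken from the paper: an unweighted 3x3 window instead of Gaussian weighting, pixel-level rather than sub-pixel localization, k = 0.04, and the helper name refine_click.

```python
def harris_response(img, k=0.04, half_win=1):
    """Harris response R = det(M) - k * trace(M)^2 on a 2D list of gray values."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(half_win, h - half_win):
        for x in range(half_win, w - half_win):
            sxx = sxy = syy = 0.0
            # Structure tensor M summed over an unweighted square window.
            for dy in range(-half_win, half_win + 1):
                for dx in range(-half_win, half_win + 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp

def refine_click(resp, click, radius=2):
    """Snap an approximate click (x, y) to the strongest response nearby."""
    cx, cy = click
    h, w = len(resp), len(resp[0])
    candidates = [(resp[y][x], x, y)
                  for y in range(max(0, cy - radius), min(h, cy + radius + 1))
                  for x in range(max(0, cx - radius), min(w, cx + radius + 1))]
    _, bx, by = max(candidates)
    return bx, by

# Synthetic 10x10 image: a bright square whose corner lies at pixel (5, 5).
img = [[1.0 if (x >= 5 and y >= 5) else 0.0 for x in range(10)]
       for y in range(10)]
resp = harris_response(img)
corner = refine_click(resp, (4, 4))   # (5, 5) for this synthetic image
```

The response is positive at corners, negative along straight edges and zero in flat regions, which is why an imprecise click near a corner is enough for the operator.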
We now have all camera coordinates and orientations and the
3D coordinates of a set of initial points, all registered in the
same global coordinate system. The next interactive operation
is to divide the scene into connected segments to define the
surface topology. This is followed by an automatic corner
extractor, again the Harris operator, and a matching procedure
across the images to add more points to each of the segmented
regions. The matching is constrained, within a segment, by the
epipolar condition and a disparity range set up from the 3D
coordinates of the initial points. The bundle adjustment is
repeated with the newly added points to improve on the previous
results and re-compute the 3D coordinates of all points.
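The two matching constraints can be sketched as a candidate filter: a candidate in the second image is kept only if it lies close to the epipolar line of the point in the first image and its displacement falls within a range derived from the seed points of the same segment. The sketch assumes that a fundamental matrix F between the two images is available from the recovered orientations, and it measures disparity simply as the length of the image displacement; the function names, the margin and the tolerance are illustrative assumptions.

```python
import math

def epipolar_line(F, p1):
    """Homogeneous line l2 = F * (x, y, 1) in image 2 for pixel p1 in image 1."""
    x, y = p1
    return tuple(row[0] * x + row[1] * y + row[2] for row in F)

def distance_to_line(line, p):
    """Perpendicular pixel distance from p = (x, y) to the line (a, b, c)."""
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b)

def displacement(p1, p2):
    """Length of the image displacement between corresponding pixels."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def disparity_range(seed_pairs, margin=2.0):
    """Displacement bounds from seed matches of one segment, padded by margin."""
    d = [displacement(p1, p2) for p1, p2 in seed_pairs]
    return min(d) - margin, max(d) + margin

def admissible_matches(F, p1, candidates, d_range, tol=1.5):
    """Candidates near the epipolar line of p1 and inside the disparity range."""
    line = epipolar_line(F, p1)
    lo, hi = d_range
    return [p2 for p2 in candidates
            if distance_to_line(line, p2) <= tol
            and lo <= displacement(p1, p2) <= hi]

# Hypothetical example: pure sideways camera motion, so epipolar lines are rows.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
seeds = [((100, 50), (80, 50)), ((200, 120), (170, 120))]
d_range = disparity_range(seeds)
kept = admissible_matches(F, (150, 90),
                          [(125, 90.5), (125, 97), (60, 90), (149, 90)],
                          d_range)
```

In this example only the first candidate survives: the second is too far from the epipolar line, and the last two lie on the line but outside the disparity range.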
[Figure 2 flowchart; recoverable box labels: Imaging, Seed Points, Element Properties]
Figure 2. General procedure for image-based modeling
An approach to obtain 3D coordinates from a single image is
essential to cope with occlusions and lack of features. Several
approaches are available [e.g. van den Heuvel, 1998, Liebowitz
et al, 1999]. Our approach uses several types of constraints for
surface shapes such as planes and quadrics, and surface
relations such as perpendicularity and symmetry. The equations
of some of the planes can be determined from seed points
previously measured. The equations of the remaining planes are
determined using the knowledge that they are either
perpendicular or parallel to the planes already determined. With
little effort, the equations of the main planes on the structure,
particularly those to which structural elements are attached, can
be computed.
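The plane constraints can be sketched with basic vector algebra: a plane is fixed by three seed points, further planes follow from being parallel or perpendicular to it, and a point seen in a single image is located by intersecting its viewing ray with the relevant plane. This is a minimal sketch under stated assumptions: planes are written as n . X = d, the ray direction is assumed to have already been derived from the camera parameters of that image, and the function names and coordinates are hypothetical.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_from_points(p, q, r):
    """Plane (n, d), with n . X = d, through three non-collinear seed points."""
    n = cross(sub(q, p), sub(r, p))
    return n, dot(n, p)

def parallel_plane(plane, point):
    """Plane parallel to `plane` that passes through one known point."""
    n, _ = plane
    return n, dot(n, point)

def perpendicular_plane(plane, p, q):
    """Plane perpendicular to `plane` that contains the points p and q."""
    n, _ = plane
    m = cross(n, sub(q, p))
    return m, dot(m, p)

def intersect_ray(plane, origin, direction):
    """3D point where the ray origin + t * direction meets the plane."""
    n, d = plane
    denom = dot(n, direction)
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the plane")
    t = (d - dot(n, origin)) / denom
    return tuple(o + t * v for o, v in zip(origin, direction))

# Hypothetical facade plane y = 0 from three seed points.
facade = plane_from_points((0, 0, 0), (4, 0, 0), (0, 0, 3))
back_wall = parallel_plane(facade, (0, 2, 0))                    # plane y = 2
side_wall = perpendicular_plane(facade, (1, 0, 0), (1, 0, 3))    # plane x = 1
# Viewing ray of one pixel, from a camera centre in front of the facade.
point = intersect_ray(facade, (2.0, -10.0, 1.5), (0.1, 1.0, 0.05))
```

A parallel plane needs one additional known point and a perpendicular plane needs two, so few measurements beyond the original seed points are required.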
[Figure 3 diagram; recoverable step labels: 1. Extract, match, and compute 3D coordinates of seed points; 2. In 3D space, reconstruct the object from the seed points (column and window shown); 3. Project new points into the images; 4. Model and texture map the object]
Figure 3. Main steps of constructing architectural elements
semi-automatically (column and window examples)
From these equations and the known camera parameters for
each image, we can determine 3D coordinates of any point or
pixel from a single image even if there is no marking on the
surface. When some plane boundaries are not visible, they can
be computed by plane intersections. This can also be applied to
surfaces like quadrics or cylinders whose equations can be
computed from existing points. Other constraints, such as
symmetry and points with the same depth or same height are
also used. The general rule for adding points on structural