It is not a trivial problem to determine all of the funda-
mental visual events that may possibly occur for any object
in the class, and to derive the partition of viewpoint space
from the particular set of visual events that occur for a given
object. The intricacies are such that most implementations
have only recently been completed. A simpler and perhaps
more commonly applied approach is not to compute an exact
aspect graph at all, but rather an approximation based on a
quasi-uniform sampling of viewpoint space.
For the viewing sphere model, the approximation approach
typically begins with an icosahedron centered around the ori-
gin of the coordinate system, so that the vertices of the icosa-
hedron lie on the surface of the sphere. An icosahedron is
the regular solid with the greatest number of faces: 20
congruent equilateral triangles.
By treating the center point on each face of the icosahedron
as defining a viewing direction, the icosahedron provides a
uniform sampling of 20 points on the sphere. Since 20 view-
ing directions is generally not sufficient to capture the full
visual complexity of an object, each face of the icosahedron is
typically subdivided some number of times to provide a finer,
quasi-uniform sampling of the viewing sphere. The subdivi-
sion is done by connecting the midpoints of the three edges
of the current triangular face to one another to subdivide it
into four triangles. The three new vertices are “pushed out”
to the surface of the sphere, and the center of each new tri-
angle defines a new viewing direction. The eventual size of
the tessellation of the sphere varies, with some researchers
using more than 5000 faces.
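The subdivision scheme just described can be sketched as follows (an illustrative implementation of ours, not any cited system; the vertex and face tables are the standard golden-ratio construction of the icosahedron):

```python
import math

# Illustrative sketch of viewing-sphere sampling by recursive icosahedron
# subdivision; all names here are our own.

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def normalize(p):
    """Push a point out to the surface of the unit sphere."""
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)

def icosahedron():
    """Vertices (on the unit sphere) and the 20 triangular faces."""
    v = [normalize(p) for p in [
        (-1, PHI, 0), (1, PHI, 0), (-1, -PHI, 0), (1, -PHI, 0),
        (0, -1, PHI), (0, 1, PHI), (0, -1, -PHI), (0, 1, -PHI),
        (PHI, 0, -1), (PHI, 0, 1), (-PHI, 0, -1), (-PHI, 0, 1)]]
    f = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
         (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
         (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
         (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    return v, f

def subdivide(verts, faces):
    """Split each triangle into four by connecting edge midpoints;
    the new vertices are pushed out to the sphere surface."""
    verts = list(verts)
    cache = {}
    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            a, b = verts[i], verts[j]
            verts.append(normalize(tuple((x + y) / 2 for x, y in zip(a, b))))
            cache[key] = len(verts) - 1
        return cache[key]
    out = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        out += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return verts, out

def viewing_directions(levels):
    """One viewing direction per face: the normalized face centroid."""
    verts, faces = icosahedron()
    for _ in range(levels):
        verts, faces = subdivide(verts, faces)
    return [normalize(tuple(sum(verts[i][k] for i in f) / 3 for k in range(3)))
            for f in faces]

print(len(viewing_directions(0)))  # 20
print(len(viewing_directions(3)))  # 20 * 4**3 = 1280
```

Each subdivision multiplies the face count by four, so four levels give 20 x 4^4 = 5120 faces, on the order of the finest tessellations mentioned above.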
For each sample point on the sphere, a set of visible ob-
ject features is calculated. Neighboring triangles having the
same feature description are merged together to represent
one node in the aspect graph. The set of features computed
for these views often differs from that of an image structure
graph, since the features need not be suited to exact visual
event boundary analysis. The above approach has been employed
by a large number of researchers (Burns and Kitchen, 1987;
Camps, Shapiro and Haralick, 1991; Shapiro, 1988; Chen and
Kak, 1989; Dickinson, Pentland and Rosenfeld, 1992; Fekete
and Davis, 1984; Goad, 1983; Hansen and Henderson, 1989;
Hebert and Kanade, 1985; Ikeuchi, 1987; Hutchinson and
Kak, 1989; Korn and Dyer, 1987; Raja and Jain, 1992) and
is now a standard technique in computer vision. A similar
technique has been applied to 3-D viewpoint space (Wang
and Freeman, 1990).
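The merging of neighboring sample views into aspect-graph nodes can be sketched as follows (a hypothetical implementation; the face list, the feature sets, and all names are assumptions for illustration, not taken from any cited system):

```python
from collections import defaultdict

# Hypothetical sketch of merging neighboring sample views into aspects.
# `faces` are vertex-index triples of the tessellated sphere; `features[f]`
# is the set of object features visible from face f's viewing direction.
# Both inputs, and all names, are assumptions for illustration.

def build_aspect_graph(faces, features):
    """Flood-fill adjacent faces with identical feature descriptions into
    aspect nodes; return (face -> node map, arcs between adjacent nodes)."""
    edge_to_faces = defaultdict(list)
    for f, (a, b, c) in enumerate(faces):
        for e in ((a, b), (b, c), (c, a)):
            edge_to_faces[frozenset(e)].append(f)

    # Two faces are neighbors if they share a triangle edge.
    neighbors = defaultdict(set)
    for fs in edge_to_faces.values():
        for f in fs:
            neighbors[f].update(g for g in fs if g != f)

    node_of_face = {}
    next_node = 0
    for start in range(len(faces)):
        if start in node_of_face:
            continue
        node_of_face[start] = next_node
        stack = [start]
        while stack:
            f = stack.pop()
            for g in neighbors[f]:
                if g not in node_of_face and features[g] == features[f]:
                    node_of_face[g] = next_node
                    stack.append(g)
        next_node += 1

    # Aspect-graph arcs join nodes whose regions share a triangle edge.
    arcs = {frozenset((node_of_face[f], node_of_face[g]))
            for f in neighbors for g in neighbors[f]
            if node_of_face[f] != node_of_face[g]}
    return node_of_face, arcs

# Tiny demo: four faces, two feature regions -> two aspect nodes, one arc.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
features = [{"A"}, {"A"}, {"B"}, {"B"}]
nodes, arcs = build_aspect_graph(faces, features)
print(len(set(nodes.values())), len(arcs))  # 2 1
```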
A main advantage of the technique is that it is relatively
easy to apply to a rigid object of any shape. (It can also
be useful for investigating the behavior of nonrigid objects,
as will be discussed later.) One disadvantage is that it is
difficult to know a priori what the appropriate resolution
for the viewing sphere should be for an arbitrary object. If
the resolution is not fine enough, some views of the object
may not be included in the representation. If it is too fine,
unnecessary work will be done in creating the representation.
4. EXACT ASPECT GRAPHS FOR POLYHEDRA
Initial research on the automatic construction of exact as-
pect graphs began with the simplest of objects, polygons
(Gualtieri, Baugher and Werman, 1989; Watts, 1988). This
advanced to 2.5-D polyhedra (Castore, 1984; Castore and
Crawford, 1984), which are formed by sweeping a polygon
along a straight axis. Next came 3-D convex polyhedra
(Plantinga and Dyer, 1986, 1987, 1990; Watts, 1988; Stew-
man and Bowyer, 1990) and eventually general polyhedra
were analyzed by several groups of researchers (Gigus and
Malik, 1990; Gigus, Canny and Seidel, 1991; Plantinga and
Dyer, 1987, 1990; Seales and Dyer, 1990; Stewman, 1991;
Stewman and Bowyer, 1988).
The general algorithm for automatically computing the as-
pect graph of an object has the following steps:
1. Each class of objects has a fundamental set of ways in
which accidental viewpoints may occur for any object
in the class, and a catalog of these can be developed.
From this catalog, specific events for any particular
object must be enumerated in some manner.
2. Each visual event represents a hyperplane in viewpoint
space that is a boundary between general views. The
partition of viewpoint space into cells of general view-
point is computed from the set of hyperplanes repre-
senting the visual events.
3. Finally, the aspect graph itself is obtained by travers-
ing the partition of viewpoint space. This may require
merging certain regions of viewpoint space into one,
depending on the algorithm used in step 2. Represen-
tative views of each aspect are usually calculated for a
central point in the cell.
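As a toy but complete instance of the three steps, consider a convex polygon viewed orthographically, so that viewpoint space reduces to the circle of viewing directions (a simplification of ours, not the perspective plane model of Figure 3). The visual events are then the directions parallel to an edge, at which that edge's two endpoints align:

```python
import math

# Toy but complete instance of steps 1-3: a convex polygon viewed
# orthographically, so viewpoint space is the circle of view directions.
# This simplification (and every name below) is ours, for illustration.

def aspect_graph_convex_polygon(vertices):
    """vertices: counterclockwise list of (x, y).
    Returns (aspects, arcs): the visible-edge set of each cell and the
    angular intervals bounding the cells."""
    n = len(vertices)
    normals = []  # outward edge normals of the CCW polygon
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        normals.append((y1 - y0, -(x1 - x0)))

    # Step 1: a visual event occurs where the view direction is parallel
    # to an edge (its two endpoints align); two event angles per edge.
    two_pi = 2 * math.pi
    events = sorted({math.atan2(-nx, ny) % two_pi for nx, ny in normals} |
                    {math.atan2(nx, -ny) % two_pi for nx, ny in normals})

    # Step 2: the event directions partition the circle into arcs
    # (the cells of general viewpoint).
    arcs = [(events[i], events[(i + 1) % len(events)])
            for i in range(len(events))]

    # Step 3: label each cell by a representative view at its center.
    def visible(theta):
        dx, dy = math.cos(theta), math.sin(theta)
        return frozenset(i for i, (nx, ny) in enumerate(normals)
                         if nx * dx + ny * dy > 0)
    def center(a, b):
        return (a + ((b - a) % two_pi) / 2) % two_pi
    aspects = [visible(center(a, b)) for a, b in arcs]
    return aspects, arcs

# A triangle with no parallel edges: 6 events, hence 6 aspects, which
# alternate between one and two visible edges around the circle.
aspects, arcs = aspect_graph_convex_polygon([(0, 0), (2, 0), (1, 2)])
print(len(aspects))  # 6
```

The resulting cell adjacencies (consecutive arcs) are the arcs of the aspect graph, here a single cycle around the viewing circle.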
4.1 Visual Events
For polygons, visual events are marked by the alignment
of two vertices, causing an edge to be either partially or
totally occluded from view. The event boundary is merely
the line joining the two aligning vertices. Figure 3 shows the
subdivision of the plane into areas where different aspects
are seen for a C-shaped polygon. Notice that the event lines
are not meaningful everywhere. For general polyhedra, the
two fundamental types of accidental views are edge/vertex
alignments and edge triplet alignments. An edge/vertex
alignment occurs when a vertex and some point on an edge of
the object project to the same point in the image. In general,
this event represents the beginning or ending of occlusion,
Figure 3. Parcellation of viewpoint space for C-shaped polygon.
Normal dashed lines mark visibility limits of polygon edges. Bold
dashed lines mark occlusion limits for two edges. Regions (aspects)
are labeled with visible edges.
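The candidate vertex-pair event lines of Figure 3 can be enumerated with a short sketch (illustrative only; it generates all lines through vertex pairs and does not prune the portions on which, as noted above, the event is not meaningful):

```python
import itertools

# Sketch of enumerating candidate vertex-pair event lines for a polygon
# (illustrative; it does not prune the portions of each line on which
# the alignment causes no change in occlusion).

def event_lines(vertices):
    """Line through each pair of vertices, as (a, b, c) with ax + by + c = 0."""
    lines = []
    for (x0, y0), (x1, y1) in itertools.combinations(vertices, 2):
        a, b = y1 - y0, x0 - x1
        lines.append((a, b, -(a * x0 + b * y0)))
    return lines

def cell_signature(lines, viewpoint):
    """Sign vector locating a viewpoint in the arrangement of event lines;
    viewpoints in the same cell share a signature."""
    x, y = viewpoint
    return tuple((a * x + b * y + c > 0) - (a * x + b * y + c < 0)
                 for a, b, c in lines)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
lines = event_lines(square)
print(len(lines))  # C(4, 2) = 6 candidate event lines
# Viewpoints separated by an event line fall into different cells:
print(cell_signature(lines, (3, 0.5)) == cell_signature(lines, (0.5, 3)))  # False
```

For an n-vertex polygon this yields C(n, 2) candidate lines, only some segments of which are true aspect boundaries, which is why the regions of Figure 3 are bounded by partial lines.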