International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B5. Istanbul 200
| : 8 I [
implementation, the Sobol sequence has been selected. Each
point of this sequence becomes the centre of a small rectangular
window on the image. For each centre position, the pixels
inside the corresponding window are extracted and the
associated hue and saturation images are calculated. The
statistical distribution of the colours within the window is
characterized by a bidimensional histogram. The first
dimension of this histogram corresponds to the hue or the
saturation quantified on a discrete and finite number of
channels. The second dimension corresponds to the relative
proportion of each channel within the window. This
bidimensional histogram is computed and accumulated for each
point of the sequence, i.e. the current histogram is the sum of
the histograms at the current and at the previous position. From
this process, a compact descriptor or index is obtained.
This index provides an abstract description of the composition
of the image i.e. of the local distribution of colours throughout
the image. This is very important. This index does not
represent a global description of the image nor it is based on a
particular segmentation scheme. Instead, it characterized the
statistics of colour distribution within a small region that is
moved randomly over the image. Consequently, there are no
formal relations in between the different regions, which means
that the different components of a scene can be combines in
rarious ways while still be identified as the same scene. That is
why that algorithm is robust against occlusion, composition,
partial view and viewpoint. Nevertheless, this approach
provides a good level of discrimination.
As we know, an image is worth a thousand words, which means
that it is difficult to describe an image based solely on words.
For that reason, our retrieval approach is based on the so-called
“query by example" or “query by prototype" paradigm. In order
to initiate a query, the user provides an image or prototype to
the search engine. This prototype is described or indexed and
the later is compared with a metric to a database of pre-
calculated indexes, which correspond to the images of the
virtual collection. The search engine finds the most similar
images with respect to the prototype and displays them to the
user. The user then acts as an expert: he chooses the most
meaningful image from the results provided by a search engine
and reiterates the query process from the chosen image. The
process is repeated until convergence is achieved. A
demonstration can be found in [8] with a database of more than
2100 images.
3.2 Content-based retrieval of 3D artefacts
This section presents a new algorithm for the indexation and
retrieval of three-dimensional artefacts [9]. The indexation of
three-dimensional artefacts differs fundamentally from the
indexation of images. If the three-dimensional information has
been acquired accurately at a sufficiently high resolution, the
three-dimensional geometry constitutes an unambiguous body
of information in the sense that there is a one-to-one
correspondence in between the virtualised geometry and the
physical geometry of the artefact. As explained in the previous
section, the situation is entirely different for images.
Shape also constitutes a language of its own right. In addition
to verbal language, humanity has developed a common shape
language. This is particularly evident in fields like art and
architecture. For that reason, the “query by prototype”
approach is a powerful paradigm for the retrieval of similar
artefacts.
600
As far as the overall structural design is involved, the three-
dimensional artefact retrieval system is very similar to its image
counterpart: the artefacts of the collection are indexed offline
and a database of indexes is created. In order to interrogate this
database, the query is initiated with a prototype artefact. From
the proto-artefact, an index is calculated and compared with the
help of a metric to the indexes of the collection in order to
retrieve the most similar artefacts in terms of three-dimensional
shape. As stated before, the user can act as an expert in order to
reiterate the process until convergence.
Consequently, the main differentiation between the two systems
(image versus 3D) is the index. We now describe our algorithm
for three-dimensional artefact description. We assume that each
artefact has been modelled with a mesh. This is a non-
restrictive representation for virtualised artefact since most
acquisition systems generate such a representation. In the
present case, a triangular mesh representation is assumed. If the
mesh is not triangular, the mesh is tessellated accordingly. Our
objective is to define an index that describes an artefact from a
three-dimensional shape point of view and that is translation,
scale and rotation invariant. The later invariants are essential
because the artefact can have an arbitrary location and pose into
space.
The algorithm can be described as follows. The centre of mass
of the artefact is calculated and the coordinates of its vertices
are normalised relatively to the position of its centre of mass.
Then, the tensor of inertia of the artefact is calculated. This
tensor is a 3 x 3 matrix. In order to take into account the
tessellation in the computation of these quantities, we do not
utilise the vertices per se but the centres of mass of the
corresponding triangles; the so-called tri-centres. In all
subsequent calculations, the coordinates of each tri-centre are
weighted with the area of their corresponding triangle. The later
is being normalised by the total arca of the artefact, i.e. with the
sum of the area of all triangles. In this way, the calculation can
be made robust against tessellation, which means that the index
is not dependent on the method by which the artefact was
virtualised: a sine qua non condition for real world applications.
In order to achieve rotation invariance, the Eigen vectors of the
tensor of inertia are calculated. Once normalised, the unit
vectors define a unique reference frame, which is independent
on the pose and the scale of the corresponding artefact: the so-
called Eigen frame. The unit vectors are identified by their
corresponding Eigen values.
The descriptor is based on the concept of a cord. A cord is a
vector that originates from the centre of mass of the artefact and
that terminates on a given tri-centre. The coordinates of the
cords are calculated in the Eigen reference frame in cosine
coordinates. The cosine coordinates consist of two cosine
directions and a spherical radius. The cosine directions are
defined in relation with the two unit vectors associated with the
smallest Eigen values i.e. the direction along witch the artefact
presents the maximum spatial extension. In other words, the
cosine directions are the angles between the cords and the unit
vectors. The radius of the cords are normalised relatively to the
median distance in between the tri-centres and the centre of
mass in order to be scale invariant. It should be noticed that the
normalisation is not performed relatively to the maximum
distance in between the tri-centres and the centre of mass in
order to achieve robustness against outliers or extraordinary tri-
centres. From that point of view, the median is more efficient
that the average. The cords are also weighted in terms of the
are:
int
The
thre
of !
wit
the
vec
thir
sph
ens
the
mo
3.3
Th
col
sec
Th
car
arc
cla
Co
the
str:
CU]
WI
art
Orc
dis
sty
mi
Col
Fi
T
br
mi
kn
St
pe
sir
ch