Our content-based search, in contrast to straight metadata search, is based on a set of algorithms that are executed at the time of the query and that provide the user with a set of tools to search the data in a more flexible manner. The algorithms are based on an examination of the characteristics of individual pixels (or groups of pixels) rather than on pre-collected information. Thus we can perform content search based on texture, shape, pattern recognition, or, even more simply, on classification algorithms that are run at the time of the query.
For instance, the user may be interested in searching an image
archive for evidence of increasing urbanization as indicated by
new man-made structures. It is possible, though unlikely, to indicate in the metadata of each image that man-made structures have or have not been identified in that image. It is even more
unlikely that the metadata would include information on which
of these structures represented a change from some base period.
Such questions are hard to anticipate.
An alternative, content-based search approach could utilize a combination of pattern-matching and change-detection algorithms to determine areas exhibiting new square- or rectangular-shaped features. If the query were changed to select areas of 'extensive' change (a major change to the metadata list), an additional aggregation algorithm might be employed.
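As a rough illustration of such a combination (a sketch only, not our implementation; the function name and thresholds are hypothetical), the following differences two co-registered acquisitions and keeps only those changed regions whose shape roughly fills a bounding box, a crude stand-in for "new rectangular features":

import numpy as np
from scipy import ndimage

def new_rectangular_features(old_band, new_band, diff_thresh=30.0, rect_thresh=0.8):
    """Find changed regions that roughly fill their bounding box.

    old_band / new_band: co-registered 2-D arrays of the same spectral
    band from two acquisition dates (hypothetical inputs)."""
    changed = np.abs(new_band.astype(float) - old_band.astype(float)) > diff_thresh
    labels, _ = ndimage.label(changed)                 # connected changed regions
    hits = []
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        region = labels[sl] == i
        # A true rectangle fills 100% of its bounding box; use the fill
        # ratio as a crude rectangularity score.
        if region.mean() >= rect_thresh:
            hits.append(sl)                            # bounding-box slices
    return hits

The "extensive change" variant could then aggregate the areas of the returned regions rather than reporting them individually.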
As opposed to conventional databases, content search on image
libraries cannot be realized only through simple search of text
annotations. The problem is that image data are rich in detail and
it is difficult to provide automatic annotation of each image without having either some form of human intervention or a set of well-defined models describing the domain in which queries
are to be run. Thus, content-based query of an image database
will require the following two steps:
* Extraction of relevant features from each image through the use of appropriate models.
* Determining whether the combination of features extracted from the image represents the content for which the user is searching.
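In skeletal form the two steps compose as follows (a minimal sketch; extract and matches are placeholder names, not part of our system):

import numpy as np

def content_query(images, extract, matches):
    """Generic two-step content query (sketch).

    images:  iterable of (image_id, pixel_array) pairs
    extract: model-specific feature extractor, pixel array -> feature vector
    matches: user-supplied predicate, feature vector -> bool"""
    return [image_id for image_id, pixels in images if matches(extract(pixels))]

# Hypothetical usage: mean/variance features, "bright scenes" predicate.
extract = lambda px: np.array([px.mean(), px.var()])
matches = lambda f: f[0] > 100.0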
The first step, namely extraction, can be done either at data
ingest or dynamically at run time. In the former case, the
extracted features are compiled into feature vectors for each of
the images and stored as metadata in a database, and the content search proceeds by searching the stored vectors.
Extraction at ingest is frequently much more efficient than
extraction at run time as one can make use of multi-attribute
indexing techniques to search on the feature vectors defining the
images. However, this approach is not a panacea. First of all, the
feature extraction process is by its very nature lossy as the fea-
ture vector cannot represent all of the content contained within
the image. Furthermore, the processing required to derive the
feature vector can be quite expensive and at times impossible;
for example, the template matching scheme described below can
only be computed at run time unless the template is known a pri-
ori (an unlikely event). Finally, the indexing techniques used to
store the feature vectors tend to be application-specific and do
not typically scale well with a large number of pre-extracted fea-
tures. Thus, the feature-based databases that are being built
today tend to be tailored to a specific domain.
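To make the ingest-time alternative concrete, here is a minimal sketch, assuming a fixed three-component feature vector and using a k-d tree as the multi-attribute index (scipy's cKDTree stands in for whatever indexing structure a production system would use):

import numpy as np
from scipy.spatial import cKDTree

def extract(pixels):
    # Hypothetical fixed-length feature vector: overall mean, overall
    # variance, and the ratio of the first two band means.
    return np.array([pixels.mean(), pixels.var(),
                     pixels[0].mean() / (pixels[1].mean() + 1e-9)])

rng = np.random.default_rng(0)
archive = [rng.integers(0, 256, size=(2, 64, 64)) for _ in range(1000)]

# Ingest time: compile one feature vector per image, index the vectors.
vectors = np.vstack([extract(img) for img in archive])
index = cKDTree(vectors)

# Query time: search the stored vectors, never the pixels themselves.
dists, rows = index.query(extract(archive[42]), k=5)   # 5 most similar images

Anything the three numbers fail to capture is invisible to the query, which is precisely the lossiness noted above.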
It is our thesis that, although useful and necessary, the use of a
predefined feature vector cannot adequately support content-
based search. It is necessary to provide the functionality that will
allow the user to visualize, define, and extract features dynamically, thereby performing content-based search directly on the image data.
Content Search Operations
Our system implements a set of image operators that can be used
as building blocks to synthesize higher-level semantics specified
by the user. In this section we briefly describe the three opera-
tions that currently provide the bulk of our “general purpose”
content search mechanism.
One of the fundamental methods for detecting objects within an
image is template matching, whereby a template of size n × n is compared pixel by pixel with each n × n subimage. The objective is to find those regions having minimal difference from the template. Typical applications of this kind of mechanism are
cross registration of two images for visualization and analysis
purposes and detection of a given scene from unregistered
images. Template matching is rarely exact as a result of image
noise, quantization effects and differences in the images them-
selves. Seasonal changes alone introduce effects that make the
matching process difficult. Thus, additional mechanisms are
required to ensure adequate search capabilities.
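A minimal brute-force version of the matcher, using the sum of squared differences as the difference measure (a sketch for illustration; our system's actual measure and optimizations are beside the point here), looks like this:

import numpy as np

def best_match(image, template):
    """Slide an n x n template over a 2-D image and return the offset of
    the n x n subimage with the smallest sum of squared differences."""
    n = template.shape[0]
    t = template.astype(float)
    best, best_ssd = (0, 0), np.inf
    for r in range(image.shape[0] - n + 1):
        for c in range(image.shape[1] - n + 1):
            ssd = float(np.sum((image[r:r+n, c:c+n].astype(float) - t) ** 2))
            if ssd < best_ssd:
                best_ssd, best = ssd, (r, c)
    return best, best_ssd

In practice a normalized measure such as normalized cross-correlation is usually preferred, since it tolerates the illumination and seasonal differences mentioned above.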
Texture is frequently used to describe two-dimensional variations of an image with a characteristic repetitiveness and is a
good candidate for classification and feature recognition in a
subimage devoid of sharp edges. By using a taxonomy of texture
features or by providing examples, the user can define the infor-
mation of interest in the image.
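As one hedged example of such features, the sketch below computes two classic co-occurrence statistics, contrast and energy, for a small patch (the quantization level and the horizontal pixel offset are arbitrary choices, not our system's defaults):

import numpy as np

def glcm_features(patch, levels=16):
    """Contrast and energy of a gray-level co-occurrence matrix built
    from horizontally adjacent pixel pairs."""
    q = (patch.astype(float) / max(float(patch.max()), 1.0) * (levels - 1)).astype(int)
    glcm = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[a, b] += 1
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = np.sum(glcm * (i - j) ** 2)   # high for busy textures
    energy = np.sum(glcm ** 2)               # high for uniform textures
    return contrast, energy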
Classification of a multispectral image is the process of labeling
individual pixels or larger areas of the image according to
classes defined by a specified taxonomy. This kind of classifica-
tion is typically used to generate land cover classifications. We
extend this approach with two extensions (illustrated in the sketch following this list):
* We allow the user to dynamically define training classes and perform the classification in real time. Thus, the user can define classes not typically covered by standard classification techniques.
* We allow the user to assign information other than the spectral bands. For example, by incorporating texture information into the training process, the user can define content that cannot otherwise be extracted from the image.
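A minimal sketch of both extensions (the names classify, stack, and training are illustrative, not our API) is a minimum-distance classifier run over an arbitrary feature stack:

import numpy as np

def classify(stack, training):
    """Minimum-distance classification over an arbitrary feature stack.

    stack:    (n_features, rows, cols) array; the features may be spectral
              bands or derived layers such as a texture measure.
    training: dict mapping a class name to the (row, col) pixels the user
              picked at query time -- the dynamically defined classes."""
    names = sorted(training)
    means = np.array([stack[:, [r for r, _ in training[n]],
                               [c for _, c in training[n]]].mean(axis=1)
                      for n in names])                       # class centroids
    pixels = stack.reshape(stack.shape[0], -1).T             # one row per pixel
    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return d.argmin(axis=1).reshape(stack.shape[1:])         # class index map

Appending a texture layer (for instance, a per-pixel contrast map) to stack alongside the spectral bands realizes the second extension with no change to the classifier itself.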
We are in the process of incorporating other capabilities, such as shape analysis (from segmented regions) and the specification of spatial relationships, into our system. These will be described in a later paper.
Compression
While the price of storage devices continues to drop at a dramatic rate, there is no doubt that the major cost of providing a
digital library will continue to be in the storage devices. Thus, a
reduction of even 30%, by the use of compression, in the storage