International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B5. Istanbul 2004
In our experiments, we have avoided the use of textures due to
the simplicity of processing based in colour and the presence of
non-textured regions in views.
By discarding small regions below a threshold, and by using
path-connected constraints regions a topological map is
generated jointly with a symbolic representation given by a
graph. Qualitative kinematic information is obtained from
evaluating growing and decreasing phenomena of “homologue
regions” along a video sequence. We develop a coarse-to-fine
approach for kinematics evaluation in terms of cooperative-
competitive dynamical models. Competitive models work to a
microlocal (pixel) level, where small differences between
parameters (specific growth rates of populations and their
competition effects), are in the issue of relative advantages for
survivors. Cooperative models contribute to the regions
homogeneity from the local viewpoint. Along a video sequence,
mobile objects are in competence for the occupancy of regions;
thus, global behaviour is controlled by a competitive model
with three populations which are labelled as child, parents and
old. Old population concerns to the initialisation of each video
sequence. The transitions between populations of regions are
controlled at the intermediate parents level. Prediction concerns
to the child generation depending on critical values for the
allowed maximum size. The application of standard
morphological operators (erosion-dilatation) and their iteration
(opening-closing), simplity the identification and tracking of
evolving shapes, without extracting contours.
The relative linear or angular momentum of regions gives the
coarsest level for the dynamic model. The mass m; of each
homogenous region R; is represented by the number of pixels
with similar colour (modulus a threshold): a) Identify stable or
inertial regions as belonging to the background (relative
velocity under a threshold), b) evaluate nearness for mobile
regions labelled as nodes by using adjacency graphs ,c)
represent mutations (births, deaths) in terms of unfolding and
collapsing nodes of the graph d) evaluate relative velocities of
barycenters of regions with similar colour and localization.
Any kind of spatial propagation is based on first- or second
order differences of functions evaluated at pixels.
Unfortunately, first order differences are very sensitive w.r.t.
illumination changes and camera motions. To obtain stable,
robust and accurate results in the static case, we can use LOG
operators or typical Canny's operator to avoid the dependance
w.r.t. orientation and illumination. Spatio-temporal version of a
laplacian is given by a Laplace-Beltrami operator. However, the
high computational cost for mobile data and troubles for
dynamic grouping, suggests to introduce some kind of temporal
average at least for three consecutive images. So, temporal
average is responsible of small delays to generate meaningful
regions to be sampled, identified and tracked. Criteria for
temporal average are based on mediana filters for sampled
images.
Cost functions associated to regional segmentation arise from a
weighted balance between a) the error tolerance at low-level
and b) the maximum number of regions at high-level. Both
infinitesimal and local criteria require specific thresholds that
can be learned in a semiautomatic way depending on the data
concentration and the critical size of regions. The most accurate
results are obtained by using information arising from local
histogram comparisons. Thus, the selection of meaningful
thresholds can be performed from the beginning by using
directly a temporal average of two local histograms: a)
Threshold for error tolerance is based on the selection of local
maxima in histograms corresponding to the most frequent
values (medianas), and simple propagation mechanisms: If the
“distance value” is below the threshold, the pixel is assigned to
the current region, otherwise, a new region is created; b)
Threshold for maximum number of regions can be understood
as a mean average problem, which represents a variable version
of k-means problem, where k represents the maximum number
of regions, and each pair «position, colour» provides entries for
the algorithm. This technique allows to maintain constant the
costs of image processing. However, the variability of data
contained in video sequences makes difficult the selection of a
fixed value of k
If there is no need of a live processing, it is possible to
decompose each homogenous colour region in monotone or
convex parts, according to usual algorithms in Computational
Geometry [Ber97]. In this case, if optimal or at least more
accurate results are need, our approach is enough flexible to
support additional constraints. The symbolic management of
this finer decomposition is labelled as an unfolding of
centroids. More accurate results linked to boundaries are
obtained, but matching, indexing and contents retrieval become
more cumbersome. Thus, results will not be reported here.
4. AUTOMATIC GENERATION OF DISTANCE MAPS
AND OPTIMALITY
We start up with a description of an unsupervised clustering
technique that is based on a competitive propagation model
from centroids of homogeneous regions above a critical size.
An important issue is the integration of low- and high-level
image features. This integration is related with a coarse colour-
shape identification (for contents retrieval) and tracking of
mobile data in general spatio-temporal models. Corresponding
values at coordinates are weighted according to the distribution
of frequencies, to increase the relevance of colour homogeneity
or shape definition.
The simplest model needs to have in account centroid position
and typical colours of segmented regions, which gives us a 5-
dimensional parameter space. The introduction of separating
hyperplanes in this 5D space is performed in a very similar way
to a generalized Voronoi diagram, but following a recursive
pattern with successive subdivisions in half-spaces. So, we
obtain a coarse subdivision inside the 5D parameters space that
can be managed in terms of binary trees search. The spatio-
temporal implementation of this algorithm produces a distance-
map of contiguous regions with homogenous colour. Instead of
looking at adjustable weights, we can use a feedback between
colour-based grouping and shape with good properties for
search/optimization processes (monotone/convex subdivisions).
5. EXPERIMENTS
Two local factors are based in 1) the rate of the new regions
created per local area unit, and 2) the size of regions. Both local
factors are related between them by a hyperbolic law. So, if
many regions are created in a small area, then a complex
texture is found (wood or leaves in a typical outdoor scene,
e.g.), and the local error threshold must be increased.
Contrarily, if we find few regions in a local area unit, the local
error threshold can be lowered.
Intei
In a
our
obje