The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B3b. Beijing 2008
single-target tracking it only needs to depend on one feature
property, but multi-target tracking may need to integrate
different kinds of features to lock onto the proper target, and
it may also use suitable techniques, such as filtering methods
designed for multiple targets.
4.1 Object Modeling
Object modeling is the representation of an object; in other
words, it uses one feature or a combination of features to
describe the object. The object’s features may include contour,
shape, color, position, texture, velocity and so forth. The more
features are included, the easier it is to identify the object,
but combining features increases the processing burden and
calls for composite methods. To construct the object model,
we can use the features directly or transform them into other
forms such as templates.
The features of an object may change during tracking, so the
model should adapt to such changes and to other influences, for
example occlusion and unexpected movement. This property is
regarded as the robustness of the model. There are many ways to
make the model more stable, including using a multi-feature
model and updating the model over time.
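As an illustration of updating the model over time, the following minimal sketch blends a stored feature histogram with the newest observation. The blending rule, the Bhattacharyya-coefficient gate and the names `similarity`, `update_model`, `rate` and `min_similarity` are assumptions made for this example; they are not taken from the systems discussed here.

```python
import math

def similarity(p, q):
    """Bhattacharyya coefficient of two normalised histograms (1.0 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def update_model(model, observed, rate=0.1, min_similarity=0.5):
    """Blend the stored model with the newest observation.

    If the observation looks too different from the model (for example
    during an occlusion), the old model is kept unchanged.
    """
    if similarity(model, observed) < min_similarity:
        return list(model)
    return [(1.0 - rate) * m + rate * o for m, o in zip(model, observed)]
```

A small `rate` makes the model change slowly and resist brief disturbances, while a larger one follows appearance changes more quickly.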
4.2 Object Tracking
Using the prior information that forms the object model, the
tracker predicts the object’s position in subsequent frames.
Different models call for different tracking methods. Object
tracking methods attempt to establish the correspondence of
feature information between frames, and their strategy
essentially comes down to searching and matching.
The Hausdorff distance is a valid measure for the shape and
texture features of an object. Sparse point sets are created in
the images with feature detectors, and the point set of the image
region labelled as the object serves as the object’s model for
the Hausdorff measure. The method can handle deformation of the
object, because it describes the object’s contour and texture
with a large number of points. Given the measure and the model,
it turns object localization into the matching of point sets
(Huttenlocher et al., 1993).
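The point-set matching described above can be sketched as follows. This is a simplified illustration that assumes 2-D feature points and a brute-force search over candidate translations; the function names and the `offsets` parameter are hypothetical, and the partial (rank-based) distance used in practice to tolerate outliers is omitted.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def locate(model, scene, offsets):
    """Return (distance, offset) of the translation that best matches the model.

    Only the model-to-scene direction is used, so extra scene points
    (background clutter) do not penalise a candidate position.
    """
    best = None
    for dx, dy in offsets:
        shifted = [(x + dx, y + dy) for x, y in model]
        d = directed_hausdorff(shifted, scene)
        if best is None or d < best[0]:
            best = (d, (dx, dy))
    return best
```

For example, `locate(model, scene, [(dx, dy) for dx in range(8) for dy in range(8)])` scans an 8 x 8 grid of translations and returns the one with the smallest distance.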
Motion is a kind of state. A typical motion state vector is
composed of the object’s position, velocity and acceleration
along each direction. If the previous and current states are
known, the next state can be predicted, so object tracking can
be solved by means of state estimation. The Kalman filter is one
of the state space methods. It is a set of mathematical equations
that solves the least-squares problem recursively. It estimates
the current state from the estimate of the previous state and
the observation of the current state, repeating the procedure
recursively until every state has been estimated. The estimate
of each state therefore draws on all previous observations. For
object tracking, the state equation of the Kalman filter is the
model of the object, and it describes the transition between
states. The observation is the position of the object, and the
state vector, as mentioned above, contains position, velocity
and acceleration. To initialise the filter, the object positions
detected in the first frames are fed into the observation
equation and the accurate positions are taken as the initial
value of the state variable; the filter output is then compared
with the precise result to verify the initial input. This
process is repeated until the filter is stable (Forsyth et al.,
2003).
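The predict and update cycle can be illustrated with a minimal one-dimensional sketch. For brevity it assumes a constant-velocity state (position and velocity only, without the acceleration term mentioned above), and the noise levels `q` and `r` and the initial covariance are arbitrary illustrative choices rather than values from the cited work.

```python
def kalman_track(positions, dt=1.0, q=1e-3, r=1.0):
    """Filter 1-D position measurements; return a list of (position, velocity)."""
    p, v = positions[0], 0.0                 # initial state from the first detection
    P00, P01, P10, P11 = 1.0, 0.0, 0.0, 100.0   # initial covariance (assumed)
    estimates = []
    for z in positions[1:]:
        # Predict with the state equation x_k = F x_{k-1}, F = [[1, dt], [0, 1]].
        p, v = p + dt * v, v
        M00 = P00 + dt * (P01 + P10) + dt * dt * P11 + q
        M01 = P01 + dt * P11
        M10 = P10 + dt * P11
        M11 = P11 + q
        # Update with the observation z = H x + noise, H = [1, 0].
        s = M00 + r                          # innovation variance
        k0, k1 = M00 / s, M10 / s            # Kalman gain
        resid = z - p                        # observed minus predicted position
        p, v = p + k0 * resid, v + k1 * resid
        P00, P01 = (1.0 - k0) * M00, (1.0 - k0) * M01
        P10, P11 = M10 - k1 * M00, M11 - k1 * M01
        estimates.append((p, v))
    return estimates
```

Although only positions are observed, the filter also recovers the velocity, which is what allows it to predict where the object will appear in the next frame.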
The Mean-shift algorithm is an approach that searches for the
maximum of a probability density along its gradient direction,
and it is also an effective statistical iteration method. Object
tracking with the Mean-shift algorithm is another class of
technique that locates the target by modeling and matching it.
Both modeling and matching are performed in a feature space such
as color space or scale space, and a suitable similarity measure
is used to search for the best match. Mean-shift tracking mainly
operates on the color feature. An image region is chosen as the
reference object model and the color feature space is quantized,
so that the bins of the quantized space represent the classes of
the color feature. Each pixel of the model corresponds to one
class, that is, one bin in the space, and the model can be
described by its probability density function (PDF) in the
feature space. Rather than using the plain PDF, the method uses
a kernel function in the similarity function to overcome the
loss of spatial information. Another reason for using a kernel
function is that it smooths the similarity measure, which helps
the iteration converge to the optimal solution during the search
(Comaniciu et al., 2003). An object tracking result for airborne
video using the Mean-shift method is shown in Figure 6.
Figure 6. An object tracking result of airborne video using
Mean-shift method
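The Mean-shift search described in this section can be sketched on a toy example. The sketch assumes a greyscale image stored as a list of rows with values from 0 to 255, an Epanechnikov kernel profile, and a bandwidth of `radius + 1`; all function and parameter names are hypothetical, and a practical tracker would work on color histograms of real video frames.

```python
import math

def window_pixels(image, cx, cy, radius):
    """Yield (x, y, kernel_weight) for pixels inside the kernel support."""
    h2 = float((radius + 1) ** 2)            # squared bandwidth (assumed choice)
    for y in range(max(0, cy - radius), min(len(image), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(image[0]), cx + radius + 1)):
            k = 1.0 - ((x - cx) ** 2 + (y - cy) ** 2) / h2   # Epanechnikov profile
            if k > 0.0:
                yield x, y, k

def kernel_histogram(image, cx, cy, radius, bins=8):
    """Kernel-weighted, normalised histogram of the window centred at (cx, cy)."""
    hist = [0.0] * bins
    for x, y, k in window_pixels(image, cx, cy, radius):
        hist[image[y][x] * bins // 256] += k
    total = sum(hist)
    return [value / total for value in hist]

def mean_shift(image, model, cx, cy, radius, bins=8, max_iter=20):
    """Move the window from (cx, cy) towards the best match for `model`."""
    for _ in range(max_iter):
        cand = kernel_histogram(image, cx, cy, radius, bins)
        sx = sy = sw = 0.0
        for x, y, _k in window_pixels(image, cx, cy, radius):
            u = image[y][x] * bins // 256
            w = math.sqrt(model[u] / cand[u])    # pixels in model-like bins weigh more
            sx, sy, sw = sx + w * x, sy + w * y, sw + w
        if sw == 0.0:                            # nothing resembling the model in view
            break
        nx, ny = int(sx / sw + 0.5), int(sy / sw + 0.5)
        if (nx, ny) == (cx, cy):                 # converged
            break
        cx, cy = nx, ny
    return cx, cy
```

Because the window centre is rounded to whole pixels in this sketch, the search may stop within about one pixel of the true centre; the original formulation works with sub-pixel positions.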
5. SYSTEM FRAMEWORK
The technical approaches analysed above need a framework that
integrates them. Because moving target detection and tracking
is divided into three parts, each part becomes a separate module
with its own function in a practical system, and processing
takes place both within and between modules. Many systems
employ a serial procedure: compensation comes first, detection
next, and tracking last. The reason is that each module is
treated as the precondition of the one after it, so that the
results of one module serve as the inputs of the next. However,
this kind of system does not consider the interactions between
modules. For example, the result of segmentation can serve as
the initial value for compensation, and the tracking result can
accelerate detection.
As shown in Figure 7, and unlike the traditional framework, the
presented system framework introduces two more modules: data
capture and collaboration control. The data capture module
acquires the video data and samples it into an image sequence,
which it then distributes to the other three modules that form
the core of the system. These three modules run in parallel,
which reduces processing time. After their internal computation,
they pass their outputs, usually in the form of parameters, to
the collaboration control module. The control module manages all
the other modules by sending commands to them, and it provides
the interface to the user and to external systems.
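The interaction between the modules can be sketched as below. This is only a structural illustration: the three module functions are placeholders that return fixed parameters, and the names `compensate`, `detect`, `track`, `run` and `hints` are hypothetical rather than part of the presented system.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder modules: each receives a frame plus the parameters produced
# for the previous frame ("hints") and returns its own parameters.
def compensate(frame, hints):
    return {"global_motion": (0, 0)}

def detect(frame, hints):
    # The previous tracking result could narrow the search region.
    last = hints.get("track", {}).get("position", (0, 0))
    return {"regions": [last]}

def track(frame, hints):
    regions = hints.get("detect", {}).get("regions", [(0, 0)])
    return {"position": regions[0]}

MODULES = {"compensate": compensate, "detect": detect, "track": track}

def run(frames):
    """Feed frames to the three modules in parallel and collect their outputs."""
    hints, history = {}, []
    with ThreadPoolExecutor(max_workers=len(MODULES)) as pool:
        for frame in frames:                     # data capture: one frame at a time
            futures = {name: pool.submit(fn, frame, hints)
                       for name, fn in MODULES.items()}
            # Collaboration control: gather the parameters and pass them on
            # as hints for the next frame.
            hints = {name: future.result() for name, future in futures.items()}
            history.append(hints)
    return history
```

Passing the previous outputs back as hints is one simple way to let the modules inform one another, as opposed to a strictly serial chain.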