The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B3b. Beijing 2008
602
the two axes of the observation uncertainty ellipse is used to
specify a weight for the observation in the classification
averaging. Depending on the magnitude of this metric, a
corresponding weight is assigned to each observation. The
average classification vector is then calculated by:
N
Z«'/,
c —
C N
W.
/7=1
since a target may stop, move, accelerate, etc. Since only
position measurements are available, a simple four-state
(position and velocity along each axes) CV (constant velocity)
target motion model in which the target acceleration is
modelled as white noise provides satisfactory results.
Figure 5 shows the absolute and relevant values of the
execution times per frame of the SDF server modules. The
particular times have been acquired by working under the map
fusion mode. The data fusion module appears to be the most
time consuming one.
These parameters ( Z , R , C ) of each fused observation
comprise the input for the tracking unit.
5.1.2. Foreground map fusion
In this technique, each ATU provides the SDF server with one
greyscale image per polling cycle, indicating the probability for
each pixel to belong to the foreground. The SDF server fuses
these maps together by warping them to the ground plane and
multiplying them (Khan, 2006). The fused observations are then
generated from these fused maps using connected component
analysis and classification information is computed as in the
ATU’s blob classification module. Although this technique has
increased computational and network bandwidth requirements,
when compared to grid-based fusion, it can very robustly
resolve occlusions between multiple views.
5.2 Multiple Target Tracking
The tracking unit is based on the Multiple Hypothesis Tracking
(MHT) algorithm, which starts tentative tracks on all
observations and uses subsequent data to determine which of
these newly initiated tracks are valid. Specifically, MHT
(Blackman, 1999) is a deferred decision logic algorithm in
which alternative data association hypotheses are formed
whenever there are observation-to-track conflict situations.
Then, rather than combining these hypotheses, the hypotheses
are propagated in anticipation that subsequent data will resolve
the uncertainty. Generally, hypotheses are collections of
compatible tracks. Tracks are defined to be incompatible if they
share one or more common observation. MHT is a statistical
data association algorithm that integrates the capabilities of:
• Track Initiation: Automatic creation of new tracks
as new targets are detected.
• Track Termination: Automatic termination of a track
when the target is no longer visible for an extended
period of time.
• Track Continuation: Continuation of a track over
several frames in the absence of observations.
• Explicit Modelling of Spurious Observations
• Explicit Modelling of Uniqueness Constraints: An
observation may only be assigned to a single track at
each polling cycle and vice-versa.
Specifically, the tracking unit was based on a fast
implementation of the MHT algorithm (Cox, 1996). A 2-D
Kalman filter was used to track each target and additional
gating computations are performed to discard observation -
track pairs. More specifically, a “gate” region is defined around
each target at each frame and only observations falling within
this region are possible candidates to update the specific track.
The accurate modelling of the target motion is very difficult,
Figure 5: Execution times of SDF modules
6. EXPERIMENTAL RESULTS
In this section, experimental results that concern the most
crucial and computationally expensive modules of the ATU and
SDF software are presented and discussed. These modules have
been identified as the background extraction module for the
ATUs and the data fusion module for the SDF server.
For the purposes of deciding on the most appropriate
background extraction technique for the specific applications,
tests have been run on various sequences. The masks shown on
Figure 6 are obtained from the prototype system installation at
“Macedonia” airport in Thessaloniki. Figure 6 (a) shows the
original image, while in Figure 6 (b) the three moving objects
that need to be detected by the background extraction methods
are marked with red circles. As seen in Figure 6 (c) the objects
are detected with the mixture of Gaussians method, although
the shape of the masks is distorted due to shadows. The results
of the Bayes algorithm are shown in Figure 6 (d). This method
fails to detect slowly moving objects like the one on the left of
the image. The Lluis et al method shown in Figure 6 (e)
produces masks with low level of connectivity, which are not
suitable for the following image processing steps. Finally the
non-parametric modelling method (in Figure 6 (f)) yields very
accurate results, while coping well with shadows, as it
incorporates an additional post processing step of shadow
removal.
Another crucial issue when deciding on the most appropriate
background extraction algorithm is its execution time. To
evaluate the computation complexity, all four methods were
applied on three sequences of different resolutions (320x740px,
640x480px, 768x576px). The execution times per frame for
each of the four methods and three sequences are presented on
Figure 7. An Intel Pentium 4 3.2GHz with 1GB of RAM
running on Windows XP Pro was used and all algorithms were
implemented in C++ using the open source library OpenCV.
Taking into consideration both the qualitative results and the
computational complexity of background extraction methods,
the non-parametric modelling emerges as the one having the
best trade-off between results quality and execution times.
Data fusion 70%
(31.70ms)
Tracker 1% Display 29%
(0.22 ms) (13.25ms)