Figure 3 shows the absolute and relative execution times per
frame of the basic modules of the ATUs. These times were
obtained with the non-parametric modelling method operating in
grid mode. The background extraction module is the most critical
one, as the computational cost of such methods is typically high,
which causes problems for real-time systems. Many experiments
have therefore been conducted to provide both a qualitative and a
computational evaluation of these methods.
4. NETWORK COMMUNICATIONS
The final output of each ATU is a small set of parameters
(ground coordinates, classification, reliability), which is
transmitted to the SDF server. If the foreground map fusion
technique is used, a greyscale image is provided at each polling
cycle, indicating the probability of each pixel belonging to the
foreground.
All these data are transmitted over a wired or wireless IP
connection to the server, which performs observation fusion and
target tracking. The TCP protocol is used to transmit data from
the ATUs to the central server, whereas UDP is used for remote
control of the ATUs. As an indication, the bandwidth used per
ATU when operating in map fusion mode at a frame rate of
3 fps is about 192 kbps (3 fps x 8 KB/frame = 24 KB/s).
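For illustration, a minimal Python sketch of this uplink is given
below; the record fields, JSON encoding, host name and port are
assumptions for the example, not the system's actual wire format:

import json
import socket

# Hypothetical ATU-to-server record; field names are assumptions.
observation = {
    "atu_id": 3,
    "timestamp": 1214870400.0,     # NTP-synchronised capture time (s)
    "ground_xy": [152.4, 87.1],    # ground coordinates of the target (m)
    "classification": "vehicle",
    "reliability": 0.92,
}

def send_observation(obs, host="sdf-server.local", port=5000):
    # Length-prefixed JSON over a TCP connection to the SDF server.
    payload = json.dumps(obs).encode("utf-8")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(len(payload).to_bytes(4, "big"))
        sock.sendall(payload)

# send_observation(observation)   # one record per target per polling cycle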
The system requires frame synchronisation and a constant frame
rate across all ATUs, both achieved using the Network Time
Protocol (NTP). The ATUs' clocks are synchronised to the central
server's clock, and an appointment-time technique (Litos, 2006)
ensures that frames from all cameras are captured at the same
instant despite network latency.
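As an illustration of the appointment-time idea, the sketch below
has an ATU wait on its NTP-synchronised clock until the agreed
capture instant; the first appointment value and the 3 fps period
are assumptions for the example:

import time

FRAME_PERIOD = 1.0 / 3.0   # 3 fps, matching the bandwidth figure above

def wait_for_appointment(t_capture):
    # Sleep until the NTP-synchronised system clock reaches the agreed
    # capture instant; short sleeps near the end absorb scheduler jitter.
    while True:
        remaining = t_capture - time.time()
        if remaining <= 0:
            return
        time.sleep(min(remaining, 0.005))

# Every ATU receives the same appointment time from the server and then
# keeps capturing on the shared 3 fps grid, independent of when the
# request arrived over the network.
t_next = time.time() + 1.0           # assumed first appointment
for _ in range(3):
    wait_for_appointment(t_next)
    print("frame captured at %.3f" % time.time())  # grab_frame() on a real ATU
    t_next += FRAME_PERIOD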
A secondary system based on media server software streams
video on demand to the central server in order to enable human
visual monitoring of the scene. As an alternative, compressed
Motion JPEG (JPEG 2000) images can be used for streaming.
5. SENSOR DATA FUSION SERVER
The SDF Server collects information from all ATUs using a
constant polling cycle, produces fused estimates of the position
and velocity of each moving target, and tracks these targets
using a multi-target tracking algorithm. It also produces a
synthetic ground situation display (Figure 4), collects statistical
information about the moving targets and provides alerts when
specific situations (e.g. accidents) are detected.
Figure 4: SDF window with 3 targets on the airport APRON
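To make the server's cycle concrete, the following minimal Python
skeleton sketches one polling cycle; every stage is a placeholder
callable rather than the system's actual interface:

import time

POLL_PERIOD = 1.0 / 3.0   # one polling cycle per frame at 3 fps

def run_sdf_server(poll_atus, fuse, track, render, check_alerts):
    # Skeleton of the server's polling cycle: collect observations from
    # all ATUs, fuse them (Section 5.1), update the multi-target tracker,
    # refresh the ground situation display and test for alert conditions.
    while True:
        t0 = time.time()
        targets = track(fuse(poll_atus()))
        render(targets)
        check_alerts(targets)   # e.g. accident detection
        # hold a constant polling cycle
        time.sleep(max(0.0, POLL_PERIOD - (time.time() - t0)))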
5.1 Data fusion
A target present simultaneously in the field of view of multiple
cameras will produce multiple observations, because the blob
centres of the same object in two different cameras correspond to
close but distinct 3-D points. Two techniques are proposed for
grouping together all the observations that correspond to the
same target:
5.1.1. Grid-based fusion
A grid that divides the overlap area (in world coordinates) into
cells is defined. Optimal values for the cell size are determined
by the application requirements (e.g. the maximum distance
between vehicles). Each observation is assigned two index values
(i_x, i_y) that indicate its position on the grid:

(i_x, i_y) = ( ⌊(x_w - x_s) / c⌋ , ⌊(y_w - y_s) / c⌋ )        (2)

where x_s, y_s = world coordinates of the top left corner of the
overlap area
      x_w, y_w = world coordinates of the camera-level observation
      c = cell size
Observations belonging to the same cell or to neighbouring cells
are grouped together into a single fused observation, as the
sketch below illustrates.
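A minimal sketch of this indexing and grouping rule, assuming an
illustrative cell size, overlap-area origin and sample coordinates:

import math

CELL_SIZE = 2.0          # c: assumed cell size in metres
X_S, Y_S = 0.0, 0.0      # assumed top left corner of the overlap area

def cell_index(x_w, y_w):
    # Map a camera-level observation (world coordinates) to its grid
    # cell, following Eq. (2).
    return (math.floor((x_w - X_S) / CELL_SIZE),
            math.floor((y_w - Y_S) / CELL_SIZE))

# Two observations of the same target from different cameras fall into
# the same or neighbouring cells, so they are grouped together:
a = cell_index(5.3, 7.9)    # (2, 3)
b = cell_index(6.1, 8.2)    # (3, 4), a diagonal neighbour of (2, 3)
same_target = max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= 1
print(a, b, same_target)    # (2, 3) (3, 4) True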
To implement this technique, the grid is expressed as a binary
image: cells that have at least one assigned observation are
represented by a white pixel, while those with no observations
are represented by a black pixel. A connected component
labelling algorithm is then used to identify blobs in this image,
each corresponding to a single moving target.
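The following sketch illustrates the labelling step with SciPy's
connected-component labelling; the grid dimensions and occupied
cells are invented for the example:

import numpy as np
from scipy import ndimage

# Cells with at least one observation become white pixels; the cell
# indices are assumed to come from Eq. (2).
grid = np.zeros((10, 10), dtype=np.uint8)
for ix, iy in [(2, 3), (3, 4), (7, 7)]:    # occupied cells (two targets)
    grid[iy, ix] = 1                       # white pixel

# 8-connectivity, so observations in diagonally neighbouring cells merge
labels, n_targets = ndimage.label(grid, structure=np.ones((3, 3), dtype=int))
print(n_targets)                     # 2: (2,3)/(3,4) merge, (7,7) stays alone
print(labels[3, 2] == labels[4, 3])  # True -> same fused target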
Fused observations are produced by averaging the parameters of
the observations that belong to each group. More specifically,
each fused observation consists of an estimated position in world
coordinates, an uncertainty matrix and a classification
probability matrix.
The position and uncertainty matrices (Z, R) of the fused
observation are given by the following equations:

R = ( Σ_{n=1}^{N} R_n^{-1} )^{-1}        (3)

Z = R Σ_{n=1}^{N} R_n^{-1} Z_n           (4)

where Z_n = position (in world coordinates) of the n-th
observation in a group of N
      R_n = uncertainty matrix of the n-th observation in a
group of N
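A small numerical check of equations (3) and (4), with invented
observation positions and diagonal uncertainty matrices:

import numpy as np

def fuse_group(positions, uncertainties):
    # Eq. (3): fused uncertainty R = (sum of R_n^-1)^-1
    # Eq. (4): fused position    Z = R * sum of R_n^-1 * Z_n
    info = [np.linalg.inv(R_n) for R_n in uncertainties]
    R = np.linalg.inv(sum(info))
    Z = R @ sum(I_n @ Z_n for I_n, Z_n in zip(info, positions))
    return Z, R

# Two observations of one target; the more certain one (smaller
# covariance) dominates the fused estimate.
Z1, R1 = np.array([10.0, 5.0]), np.diag([0.5, 0.5])
Z2, R2 = np.array([10.6, 5.4]), np.diag([2.0, 2.0])
Z, R = fuse_group([Z1, Z2], [R1, R2])
print(Z)   # [10.12  5.08], pulled towards the low-uncertainty observation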
To calculate the average classification vector, the uncertainty of
each observation is taken into account. In this case the larger of