In: Wagner W., Szekely, B. (eds.): ISPRS TC VII Symposium - 100 Years ISPRS, Vienna, Austria, July 5-7, 2010, IAPRS, Vol. XXXVIII, Part 7B
Figure 3: Upper: Confidence image with detected blobs (red circles)
and final SVM detection (green crosses).
Lower: Original image with final detection.
shown in Fig. 2. Since the confidence image is normalized, grey
values range from -1 to 1, with zero crossings typically enclosing
regions of pixels potentially belonging to vehicles.
As can be seen in the confidence image, vehicle areas exhibit a
blob-like structure. Thus, in the second processing stage, an
interest point operator for blobs was implemented based on the work
of Lepetit and Fua (2006). The parameters of the algorithm have
been tuned for the image resolution of 20 cm by 5-fold cross
validation. These parameters mostly reflect geometrical properties
of the vehicle clusters. In this way, 80 % of the non-vehicle areas
can be rejected from further processing. Nearly all remaining wrong
hypotheses are rejected by the classifier in the last processing
stage. For this purpose, a number of statistical values are
calculated from geometric and radiometric properties of the
remaining clusters in the confidence image and in all channels of
the RGB image. Due to the partially high correlation between those
channels, the total number of more than 100 statistical features is
reduced by principal component analysis (PCA) to the first 40
components, which contain 99 % of the descriptive information. This
reduced feature set is used to train a Support Vector Machine (SVM).
The slack variables and kernel type of the SVM are also optimized
for the specific resolution by cross validation, leading to an
average false positive rate of approximately 12 %. As will be shown
in section 4, this accuracy is reflected in the correctness of the
numerical evaluation. Figure 3 shows the results of the interest
point operator (marked by circles) and the final vehicle detection
(marked by crosses).
3.2 Vehicle Tracking
Vehicle tracking between two consecutive images of the burst is
done by template matching based on normalized cross correlation.
At each position of a detected vehicle in the first image of the
image sequence, a template image is created. Then, in the second
image of the sequence, a search space for this vehicle is
generated. Its size and position depend on the position of the
vehicle in the first image, the driving direction obtained from the
NAVTEQ road database, and the expected maximum speed for the road
plus a certain tolerance. Within that search space, the template is
correlated, and the maximum correlation score is stored together
with the template position at which the maximum occurred. This
position normally represents the match found for the vehicle. The
correlation is done in RGB color space. Fig. 4 shows a typical
result of the tracking algorithm obtained on the motorway A96 near
Munich. The left image was taken 0.5 s before the right image. The
dashed lines show corresponding matching pairs from normalized
cross correlation. Since all images are stored with their recording
time, vehicle speed can be calculated directly from the position of
the vehicle detected in the first image and the position of the
corresponding match in the second image. Then, vehicle tracking is
applied to the following image pair of the sequence.
Figure 4: Tracking of a group of cars on motorway A96 near Munich.
Corresponding matches are marked by dashed lines. Note that the
motorbike was not tracked because it was not detected (the
detection classifier was not trained on two-wheeled vehicles).
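The matching step and the speed calculation described in this section can be illustrated with a minimal sketch. It is not the implementation used in the processing chain: it works on a single grey-value channel instead of RGB, searches a rectangular window exhaustively, and all function names (ncc, best_match, speed_kmh) are hypothetical. The ground sampling distance of 0.2 m and the time gap of 0.5 s are taken from the text; positions are assumed to be pixel coordinates in orthorectified images.

```python
import math

def ncc(a, b):
    """Normalized cross correlation of two equally sized 2-D patches."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) *
                    sum((y - mb) ** 2 for y in fb))
    # A patch without any contrast has no defined correlation; score it 0.
    return num / den if den > 0 else 0.0

def best_match(template, image, search):
    """Correlate the template at every top-left position inside
    search = (row_min, row_max, col_min, col_max), bounds inclusive.
    Returns (score, row, col) of the maximum correlation."""
    h, w = len(template), len(template[0])
    r0, r1, c0, c1 = search
    best = (-2.0, None, None)  # any NCC score lies in [-1, 1]
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            patch = [row[c:c + w] for row in image[r:r + h]]
            score = ncc(template, patch)
            if score > best[0]:
                best = (score, r, c)
    return best

def speed_kmh(pos1, pos2, dt_s, gsd_m=0.2):
    """Speed from two pixel positions (row, col) and the time gap dt_s."""
    dist_m = math.hypot(pos2[0] - pos1[0], pos2[1] - pos1[1]) * gsd_m
    return dist_m / dt_s * 3.6

# Toy example: a 2x2 template placed at (3, 4) in an otherwise empty image.
template = [[1, 5], [9, 2]]
image = [[0] * 8 for _ in range(8)]
for dr in range(2):
    for dc in range(2):
        image[3 + dr][4 + dc] = template[dr][dc]
score, row, col = best_match(template, image, (0, 6, 0, 6))
```

In practice the search window would be derived from the road direction and the expected maximum speed rather than given as fixed bounds, which keeps the number of correlated positions small.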
Several measures to catch mismatches are implemented, mainly
based on the plausibility of velocity and driving direction, the
constancy of velocity and driving direction within a burst, and the
plausibility of individual speed and direction with respect to local
average values. Several potential mismatches as well as matches
based on false positive vehicle detections can be eliminated that way.
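A minimal sketch of such plausibility checks is given below. The thresholds, the use of the median as the local average, and all names (heading_diff, constant_within_burst, plausible_matches) are assumptions made for illustration; the text does not specify the actual values or rules.

```python
from statistics import median

def heading_diff(a_deg, b_deg):
    """Smallest absolute difference between two headings in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def constant_within_burst(speeds_kmh, max_spread_kmh=15.0):
    """Constancy check: speeds of one vehicle from successive image
    pairs of a burst should differ only slightly (assumed threshold)."""
    return max(speeds_kmh) - min(speeds_kmh) <= max_spread_kmh

def plausible_matches(matches, road_heading_deg, v_max_kmh=160.0,
                      max_heading_dev_deg=20.0, max_speed_dev_kmh=40.0):
    """Filter matches given as dicts with 'speed_kmh' and 'heading_deg'."""
    # Check 1: absolute plausibility of speed and driving direction.
    kept = [m for m in matches
            if 0.0 <= m["speed_kmh"] <= v_max_kmh
            and heading_diff(m["heading_deg"], road_heading_deg)
            <= max_heading_dev_deg]
    if not kept:
        return []
    # Check 2: plausibility with respect to the local average speed,
    # approximated here by the median of the surviving matches.
    v_ref = median(m["speed_kmh"] for m in kept)
    return [m for m in kept
            if abs(m["speed_kmh"] - v_ref) <= max_speed_dev_kmh]

matches = [
    {"speed_kmh": 100.0, "heading_deg": 90.0},
    {"speed_kmh": 110.0, "heading_deg": 92.0},
    {"speed_kmh": 105.0, "heading_deg": 270.0},  # wrong direction
    {"speed_kmh": 400.0, "heading_deg": 90.0},   # implausible speed
    {"speed_kmh": 20.0, "heading_deg": 91.0},    # far from local average
]
kept_speeds = [m["speed_kmh"] for m in plausible_matches(matches, 90.0)]
```

The median is used instead of the mean so that a single gross mismatch does not shift the reference value against which the other matches are judged.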
After traffic data extraction the results are immediately copied to
PC 5 (Fig. 1) and directly sent to the ground via S-band downlink.
There, data can be used in a traffic internet portal for road level
of service visualization and for traffic simulation.
3.3 Performance
Actuality of road traffic data is a general concern. For the use of
aerially recorded data in the traffic simulation, an actuality of
less than five minutes is required. This means that a maximum delay
of five minutes is permitted between exposure and reception of the
traffic data on the ground. Hence, the processing chain must be
optimized for processing speed. If traffic data extraction is
limited to main roads, the bottleneck of the chain is the
orthorectification process, which takes 10 to 12 s for each nadir
image. The actuality criterion is easily fulfilled for the first
bursts of each flight strip, but a stack of unprocessed images
builds up, which leads to a critical length of the flight path. If
the critical length is exceeded, the actuality criterion of the
simulation is violated. This critical length of the flight path can
be estimated. Taking into account a typical flight speed of 70 m/s,
3 images recorded per burst, and a break of 7 s between bursts, the
critical length is around 5 km. In case of full traffic data
extraction in urban cores, the bottleneck moves to the traffic
processor, which cannot quite keep up with the orthorectification
(Fig. 5). Nevertheless, for road level of service visualization, the
performance of the present processing chain is sufficient, since the
hard actuality criterion of the simulation does not apply in that
case. However, the present processing chain holds potential for
improvement of calculation time, even in the orthorectification
process (section 5).
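The order of magnitude of the critical length can be checked with a simple backlog model. The following sketch is an assumption and not the calculation used by the authors: it considers only the orthorectification time, processes images strictly one after another, and assumes a burst duration of 1 s (two intervals of 0.5 s between three images). The function name critical_length_m and its parameters are hypothetical.

```python
import math

def critical_length_m(speed_ms=70.0, images_per_burst=3, ortho_s=12.0,
                      break_s=7.0, burst_s=1.0, max_delay_s=300.0):
    """Flight path length after which the delay between exposure and
    end of orthorectification first exceeds max_delay_s."""
    period = break_s + burst_s         # time from one burst to the next
    proc = images_per_burst * ortho_s  # orthorectification time per burst
    if proc > max_delay_s:
        return 0.0                     # even the first burst is too late
    if proc <= period:
        return math.inf                # no backlog builds up
    finish = 0.0      # time at which the processor becomes free
    last_ok_t = 0.0   # exposure time of the last burst still in time
    k = 0
    while True:
        t = k * period + burst_s        # last exposure of burst k
        finish = max(finish, t) + proc  # sequential processing
        if finish - t > max_delay_s:
            return speed_ms * last_ok_t
        last_ok_t = t
        k += 1

length_12s = critical_length_m(ortho_s=12.0)
```

With 12 s per image this model yields about 5.1 km, which agrees with the stated value of around 5 km; shorter orthorectification times give a longer critical path. Since detection, tracking, and downlink are ignored, the result should be read as an optimistic estimate.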
Scene             Suburban & Motorways  Urban Core  Total
Images evaluated                    73           6     79
True Positives                    5545        2911   8456
False Positives                    613         429   1042
False Negatives                    424         317    741
Correctness                        90%         87%    89%
Completeness                       93%         90%    92%
Quality                            84%         80%    83%
Table 2: Evaluation of vehicle detection quality.
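The percentages in Table 2 can be reproduced from the counts, assuming the usual definitions correctness = TP / (TP + FP), completeness = TP / (TP + FN), and quality = TP / (TP + FP + FN). The short sketch below uses a hypothetical function name and rounds to whole percent as in the table.

```python
def detection_metrics(tp, fp, fn):
    """Return (correctness, completeness, quality) in whole percent."""
    correctness = tp / (tp + fp)     # share of detections that are vehicles
    completeness = tp / (tp + fn)    # share of vehicles that were detected
    quality = tp / (tp + fp + fn)    # combines both error types
    return tuple(round(100 * v) for v in (correctness, completeness, quality))

# Counts taken from Table 2.
suburban = detection_metrics(5545, 613, 424)
urban = detection_metrics(2911, 429, 317)
total = detection_metrics(8456, 1042, 741)
```

All three columns of the table are matched by these definitions: (90, 93, 84) for suburban areas and motorways, (87, 90, 80) for the urban core, and (89, 92, 83) in total.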