In: Stilla U, Rottensteiner F, Paparoditis N (Eds) CMRT09. IAPRS, Vol. XXXVIII, Part 3A/V4 — Paris, France, 3-4 September, 2009
suppression until one pixel each is left representing the car's
center. Fig. 6 shows the remaining regions together with the cars
that caused them.
For larger vehicles such as trucks, the same filter responses are
used. To recognize long edges without introducing new filters, the
existing responses of the side edges are shifted along the side of
the vehicle and combined with each other by conjunction.
To avoid detecting a car twice, all observations are tested
pairwise for their mutual distances. Some observations have more
than one maximum, and vehicles may be detected twice between two
neighboring street segments. Depending on their size and
orientation, objects closer to each other than a certain distance
are discarded, and only the one with the strongest intensity is kept.
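The pairwise test can be illustrated with a minimal sketch. It assumes a greedy strategy (strongest observation first) and a single fixed distance threshold `min_dist`, whereas the text makes the threshold depend on object size and orientation. The function name and parameters are hypothetical and not taken from the original implementation.

```python
import numpy as np

def suppress_duplicates(positions, strengths, min_dist):
    """Greedy duplicate removal (illustrative assumption, not the paper's code).

    positions: (k, 2) array-like of detection coordinates
    strengths: (k,) array-like of filter intensities
    min_dist:  fixed distance threshold (simplification of the
               size- and orientation-dependent threshold in the text)
    Returns the sorted indices of the detections that are kept.
    """
    positions = np.asarray(positions, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    order = np.argsort(-strengths)  # visit the strongest detections first
    kept = []
    for idx in order:
        # keep a detection only if no stronger kept one is too close
        if all(np.linalg.norm(positions[idx] - positions[k]) >= min_dist
               for k in kept):
            kept.append(int(idx))
    return sorted(kept)
```

With detections at (0, 0), (1, 0) and (10, 10), intensities 0.5, 0.9 and 0.7, and `min_dist = 3`, the weaker of the two close detections is dropped and indices 1 and 2 remain.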
Figure 6: The detected cars
2.3 Tracking
As there are only short bursts of images, a classic Kalman filter
cannot really be used. As already mentioned, the approach of
Lenhart and Hinz (Lenhart and Hinz, 2006) uses prediction for image
triplets. This works only if triplets are available. Bursts with
fewer than three images, which also occur, have to be handled
differently. We therefore only consider relations between two
consecutive images. Scott and Longuet-Higgins (Scott and
Longuet-Higgins, 1991) suggest a singular value decomposition to
establish a one-to-one correspondence that takes the positions of
all neighboring objects into account. This is more an association
than a real tracking, as only the information of the last image is
used. If I and J are two images with m features I_i and n features
J_j, we build a proximity matrix G with the Gaussian-weighted
distances G_ij between every feature I_i and J_j:
G_{ij} = e^{-r_{ij}^2 / (2\sigma^2)}    (1)
where r_ij = ||I_i - J_j|| is the Euclidean distance. The elements
G_ij thus decrease monotonically with the distance. The parameter
σ defines the degree of interaction between the features: a small
value enforces local interaction, a large one rather global
interaction. It is recommended to choose σ about as large as the
average expected distance between the feature pairs.
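As a minimal sketch of equation (1), the proximity matrix can be built with NumPy broadcasting. The function name `proximity_matrix` and the representation of features as 2D coordinates are assumptions made for illustration.

```python
import numpy as np

def proximity_matrix(feats_i, feats_j, sigma):
    """Gaussian-weighted proximity matrix G of equation (1).

    feats_i: (m, 2) array-like of feature positions in image I
    feats_j: (n, 2) array-like of feature positions in image J
    sigma:   interaction parameter (same unit as the positions)
    Returns an (m, n) array with G[i, j] = exp(-r_ij^2 / (2 sigma^2)).
    """
    feats_i = np.asarray(feats_i, dtype=float)
    feats_j = np.asarray(feats_j, dtype=float)
    # pairwise differences via broadcasting: shape (m, n, 2)
    diff = feats_i[:, None, :] - feats_j[None, :, :]
    r = np.linalg.norm(diff, axis=2)  # Euclidean distances r_ij
    return np.exp(-r**2 / (2.0 * sigma**2))
```

Coinciding features yield G_ij = 1, and the value falls off towards 0 as the distance grows relative to σ.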
The next step is to perform a singular value decomposition (SVD)
of the proximity matrix G. The algorithm is provided by many
software libraries; here the one in OpenCV was used.
G = T D U^T    (2)
After the SVD, T and U are orthogonal matrices, and the m × n
diagonal matrix D contains the positive singular values as
diagonal elements in descending order. As the third and last
step, a new matrix P is computed by
P = T E U^T    (3)
where E is the diagonal matrix D with all diagonal elements
replaced by 1. The resulting matrix P has the same dimensions as
D, but the algorithm has amplified the values P_ij of good
pairings and attenuated those of bad ones. So if P_ij is the
greatest element in both its row and its column, the two features
I_i and J_j are in a 1:1 correspondence with one another.
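The three steps can be sketched as follows. This is an illustrative version that uses NumPy's `numpy.linalg.svd` instead of the OpenCV routine mentioned in the text; the function name `svd_associate` is hypothetical. With the reduced SVD, multiplying the two orthogonal factors directly is equivalent to inserting E, because E has ones on its diagonal.

```python
import numpy as np

def svd_associate(G):
    """Association from a proximity matrix G (sketch, NumPy instead of OpenCV).

    Returns the pairing matrix P = T E U^T and the list of index pairs
    (i, j) for which P[i, j] is the maximum of both row i and column j.
    """
    G = np.asarray(G, dtype=float)
    # reduced SVD: G = T @ diag(s) @ Ut, singular values in descending order
    T, _, Ut = np.linalg.svd(G, full_matrices=False)
    # replacing all singular values by 1 amounts to dropping diag(s)
    P = T @ Ut
    pairs = []
    for i in range(P.shape[0]):
        j = int(np.argmax(P[i]))            # best column for row i
        if int(np.argmax(P[:, j])) == i:    # row i is also best for column j
            pairs.append((i, j))
    return P, pairs
```

Features without a mutual maximum remain unpaired, which is how objects that enter or leave the scene between the two images drop out of the association.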
Furthermore, Pilu (Pilu, 1997) extends the algorithm for feature-
based stereo matching by using the cross-correlation of two
features in addition to their distance. The SVD association can
thus also take into account the similarity of a certain window
around the features in both images. Adding this (Gaussian-weighted)
information to the proximity matrix G, the elements G_ij become:
G_{ij} = e^{-(C_{ij} - 1)^2 / (2\gamma^2)} \cdot e^{-r_{ij}^2 / (2\sigma^2)}    (4)
where the left term is the Gaussian-weighted function of the
normalized correlation coefficient C_ij between the features I_i
and J_j. The parameter γ determines how fast the values fall off
as C_ij decreases from 1. In our tests, the best values were found
between 0.4 and 1.0.
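A minimal sketch of the correlation-weighted proximity matrix is given below. It assumes that each feature comes with an image window of equal size, and that the normalized correlation coefficient is computed on the mean-free window values. The function names and the handling of constant windows (correlation set to 0) are assumptions made for illustration.

```python
import numpy as np

def normalized_correlation(a, b):
    """Normalized correlation coefficient of two equally sized windows."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # assumption: constant windows carry no information, use 0
    return float(a @ b / denom) if denom > 0 else 0.0

def correlation_proximity_matrix(pos_i, pos_j, win_i, win_j, sigma, gamma):
    """Proximity matrix with a Gaussian-weighted correlation term.

    pos_i, pos_j: (m, 2) and (n, 2) feature positions
    win_i, win_j: sequences of m and n image windows around the features
    sigma, gamma: widths of the distance and correlation Gaussians
    """
    pos_i = np.asarray(pos_i, dtype=float)
    pos_j = np.asarray(pos_j, dtype=float)
    r = np.linalg.norm(pos_i[:, None, :] - pos_j[None, :, :], axis=2)
    C = np.array([[normalized_correlation(wi, wj) for wj in win_j]
                  for wi in win_i])
    corr_term = np.exp(-(C - 1.0)**2 / (2.0 * gamma**2))
    dist_term = np.exp(-r**2 / (2.0 * sigma**2))
    return corr_term * dist_term
```

A perfectly correlated pair (C_ij = 1) leaves the distance term unchanged, while a smaller γ penalizes dissimilar windows more strongly.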
3 RESULTS AND DISCUSSION
3.1 Detection
The computing time and the accuracy of the detection always depend
on the number, size and quality of the street segments given by the
database. In the first example (shown in Fig. 6), only a broad
highway in Munich was tested, without considering any smaller
streets. Processing the 28-megapixel image took 30 seconds
(Athlon 64 X2, 2.2 GHz, 2 GB RAM). The 96 vehicles were counted
manually as ground truth and compared with the detected vehicles.
The detection rates resulting from varying thresholds are shown as
the red graph in Fig. 7 and 8. As one can see, there is always a
trade-off between completeness and correctness: the more sensitive
the thresholds are set, the more false positives are found. The
graph shows the detection rate (number of correctly detected
cars / real number of cars) in