recorded at a set of known locations by structuring the error
covariance matrix. Cressie (1993) describes Kriging in detail.
3.2 Background Subtraction
The background subtraction technique takes the pixel-by-pixel
difference between the current frame and a background image
(Haritaoglu et al., 2000). A pixel-wise median filter over time
is applied to several sequential frames to construct the initial
background image. The background image cannot be expected to
stay the same for long periods of time, because the illumination
may change. The background therefore has to be updated whenever
the illumination changes. A Kalman filter is used for the
updating.
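The construction of the initial background by a temporal median can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `median_background` and the nested-list frame layout are assumptions made for the example, and the Kalman-filter update is not shown.

```python
from statistics import median


def median_background(frames):
    """Per-pixel temporal median over equally sized grey-value frames.

    frames: list of 2-D lists, indexed as frame[y][x] (assumed layout).
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Three tiny frames; a bright vehicle covers pixel (x=1, y=0) in one frame only.
frames = [
    [[10, 10], [10, 10]],
    [[10, 200], [10, 10]],
    [[12, 10], [10, 11]],
]
background = median_background(frames)
```

Because the vehicle occupies the pixel in only one of the three frames, the median discards it, which is why a median is preferred to a mean for this step.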
A simple and common form of background subtraction uses the
absolute difference of grey values (Fathy and Siyal, 1995). The
difference of grey values, however, fails when the grey values
of the vehicle and the background are similar. To deal with this
problem, we examined the effectiveness of the distance in the
following colour spaces (Sangwine and Horne, 1998):
(a) RGB colour space
(b) HSV colour space
(c) CIE colour space
Applying the above colour spaces to real images led us to adopt
the RGB colour space. Accordingly, the value of background
subtraction b, used as a feature, is defined as follows:
$$ b(x, y) = \sqrt{\sum_{c \in \{R, G, B\}} \left( I_c(x, y) - B_c(x, y) \right)^2} \tag{1} $$
where I(x, y) is the current image, B(x, y) is the background
image, and the suffix c denotes the colour channel.
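The colour-distance feature can be sketched as below, assuming that the distance in RGB space is the Euclidean distance over the three channels. The function name `rgb_distance` and the use of the channel mean as the grey value are assumptions made for this illustration.

```python
import math


def rgb_distance(current_pixel, background_pixel):
    """Euclidean distance between two (R, G, B) triples (assumed metric)."""
    return math.sqrt(
        sum((i - b) ** 2 for i, b in zip(current_pixel, background_pixel))
    )


# A reddish vehicle pixel on a grey road: same mean grey value, different colour.
vehicle = (200, 50, 50)
road = (100, 100, 100)

grey_difference = abs(sum(vehicle) / 3 - sum(road) / 3)
colour_distance = rgb_distance(vehicle, road)
```

In this example the grey-value difference is zero while the colour distance is large, which is the situation the colour-space distance is meant to handle.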
3.3 Shadow Detection
Shadows cast by vehicles fundamentally cannot be separated from
the foreground region by background subtraction. The shadows
affect the results of spatio-temporal clustering; for example,
two vehicles may be connected by neighbouring shadows. It is
therefore necessary to distinguish between shadows and vehicles.
A shadow detection technique based on the HSV colour space can
be applied to separate shadows from vehicles. The algorithm is
based on the comparison between the current frame I(x, y) and a
background image B(x, y) (Cucchiara et al., 2000):
$$ SP(x, y) = \begin{cases} 1 & \text{if } \varepsilon_{V1} \le \dfrac{I_V(x, y)}{B_V(x, y)} \le \varepsilon_{V2} \;\wedge\; \left| I_H(x, y) - B_H(x, y) \right| \le \varepsilon_H \;\wedge\; \left| I_S(x, y) - B_S(x, y) \right| \le \varepsilon_S \\ 0 & \text{otherwise} \end{cases} \tag{2} $$
where SP(x, y) is set to 1 if pixel (x, y) is classified as
shadow and 0 otherwise, and ε and the suffixes represent the
thresholds and the colour components, respectively. Equation (2)
states that a pixel (x, y) is classified as shadow if three
properties hold:
(a) the ratio of the V (Value) component of the current frame to
that of the background image respects both a lower and an upper
bound;
(b), (c) the differences of the H (Hue) and S (Saturation)
components are limited.
The rationale of the equation comes from the observation that
when an area is covered by a shadow, this often results in a
significant change in lightness without a great modification of
the colour information.
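The three conditions can be sketched per pixel with the standard-library `colorsys` module. This is a hedged illustration: the function name `is_shadow`, the threshold values and the circular treatment of hue are assumptions for the example, not values taken from the paper.

```python
import colorsys


def is_shadow(current_rgb, background_rgb,
              v_low=0.4, v_high=0.9, eps_h=0.1, eps_s=0.1):
    """Shadow test on one pixel; all thresholds are illustrative assumptions."""
    h_i, s_i, v_i = colorsys.rgb_to_hsv(*(c / 255 for c in current_rgb))
    h_b, s_b, v_b = colorsys.rgb_to_hsv(*(c / 255 for c in background_rgb))
    if v_b == 0:
        return False  # ratio undefined on a black background pixel
    ratio = v_i / v_b
    hue_diff = abs(h_i - h_b)
    hue_diff = min(hue_diff, 1 - hue_diff)  # assumption: hue treated as circular
    return (v_low <= ratio <= v_high
            and hue_diff <= eps_h
            and abs(s_i - s_b) <= eps_s)


road = (200, 180, 160)
shadowed_road = (120, 108, 96)   # same colour, 60 % of the brightness
blue_vehicle = (30, 60, 200)

shadow_flag = is_shadow(shadowed_road, road)
vehicle_flag = is_shadow(blue_vehicle, road)
```

Scaling all three channels by the same factor leaves hue and saturation unchanged and only lowers V, so the darkened road pixel passes the test while the vehicle pixel does not.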
3.4 Optical Flow Extraction
Ideally, the optical flow is computed for all pixels. Methods
for extracting the optical flow of all pixels can be divided into
(a) gradient-based approach
(b) area-based approach
We reviewed the gradient-based approaches theoretically and
compared their performance empirically from the point of view
of application to vehicle motion analysis (Fuse et al., 2000).
The accuracy of the optical flow obtained with the gradient-based
approach was, however, not up to the required level. We also
compared the two approaches mentioned above and confirmed that
the area-based approach was more robust. Accordingly, we adopt
the sequential similarity detection algorithm (SSDA):
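$$ R(x, y) = \sum_{m=1}^{M} \sum_{n=1}^{N} \left| I_{x,y}(m, n) - T(m, n) \right| \rightarrow \min \tag{3} $$
where T is a template, M and N are the dimensions of the
template, and I_{x,y}(m, n) is the image window at (x, y) that is
compared with the template. The optical flow (u, v) acquired by
SSDA is defined as another feature.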
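A minimal sketch of SSDA-style block matching follows. It accumulates absolute differences between a template from the previous frame and candidate windows in the current frame, and abandons a candidate as soon as its running residual reaches the best residual found so far. The function name `ssda_flow`, the template size and the search range are assumptions for the illustration.

```python
def ssda_flow(prev, curr, x, y, size=3, search=2):
    """Flow (u, v) of the size x size template whose top-left corner is (x, y).

    prev, curr: 2-D lists of grey values indexed as frame[row][col].
    Returns ((u, v), residual) for the best-matching displacement.
    """
    height, width = len(curr), len(curr[0])
    best_residual, best_flow = float("inf"), (0, 0)
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            cx, cy = x + u, y + v
            if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
                continue  # candidate window leaves the image
            residual = 0
            for n in range(size):
                for m in range(size):
                    residual += abs(curr[cy + n][cx + m] - prev[y + n][x + m])
                if residual >= best_residual:
                    break  # sequential test: cannot beat the current best
            if residual < best_residual:
                best_residual, best_flow = residual, (u, v)
    return best_flow, best_residual


def make_frame(top, left):
    """6 x 6 black frame with a 2 x 2 textured blob (test data only)."""
    frame = [[0] * 6 for _ in range(6)]
    frame[top][left], frame[top][left + 1] = 90, 120
    frame[top + 1][left], frame[top + 1][left + 1] = 150, 180
    return frame


prev_frame = make_frame(1, 1)
curr_frame = make_frame(2, 3)   # blob moved 2 pixels right, 1 pixel down
flow, residual = ssda_flow(prev_frame, curr_frame, x=1, y=1)
```

The early exit is what distinguishes SSDA from a plain sum of absolute differences: most poor candidates are rejected after only a few pixels have been compared.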
3.5 Spatio-Temporal Clustering
All pixels in a spatio-temporal image are clustered using the
features f, namely the value of background subtraction b and the
optical flow (u, v). Spatio-temporal clustering means unifying
pixels that satisfy a homogeneity property, namely similarity of
features. We adopt the weighted Euclidean distance as the
similarity metric. To perform this operation, we simply compute
the distance d between adjacent pixels in the feature space:
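$$ d(f_i, f_j) = w (b_i - b_j)^2 + R \left[ (u_i - u_j)^2 + (v_i - v_j)^2 \right] \tag{4} $$
where w is the weight coefficient and R is the reliability
function of the estimated optical flow:
$$ R = 1 - \frac{1}{255 M N} \sum_{m=1}^{M} \sum_{n=1}^{N} \left| I_{x,y}(m, n) - T(m, n) \right| \tag{5} $$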
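The feature distance and the reliability weighting can be sketched as below. The exact form is an assumption: the sketch takes the distance to be a weighted sum of squared differences of b, u and v, and takes the reliability to be one minus the SSDA residual normalised by 255 M N. The function names are hypothetical.

```python
def reliability(residual, m_size, n_size, max_grey=255):
    """Assumed form: 1 minus the residual normalised to the range [0, 1]."""
    return 1.0 - residual / (max_grey * m_size * n_size)


def feature_distance(f_i, f_j, weight, rel):
    """Assumed weighted squared distance between features (b, u, v)."""
    b_i, u_i, v_i = f_i
    b_j, u_j, v_j = f_j
    flow_term = (u_i - u_j) ** 2 + (v_i - v_j) ** 2
    return weight * (b_i - b_j) ** 2 + rel * flow_term


perfect_match = reliability(residual=0, m_size=3, n_size=3)
d = feature_distance((10, 1, 0), (12, 1, 1), weight=0.5, rel=perfect_match)
```

Under this reading, a poorly matched template lowers R, so an unreliable flow estimate contributes less to the distance than the background subtraction value does.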
Adjacent pixels in the spatio-temporal domain are added to the
region as long as the region satisfies the desired homogeneity
property. The number of reference pixels in the spatio-temporal
domain is 26, that is, the 3 x 3 x 3 neighbourhood excluding the
pixel itself. This procedure results in many regions in the
spatio-temporal image.
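Region growing over the 26 spatio-temporal neighbours can be sketched as a breadth-first search. This is an illustration under assumptions: the function name `grow_regions`, the `volume[t][y][x]` layout, the threshold test between adjacent pixels and the scalar feature used in the example are all hypothetical.

```python
from collections import deque
from itertools import product

# 26 neighbours: every offset in {-1, 0, 1}^3 except the centre itself.
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def grow_regions(volume, distance, threshold):
    """Label volume[t][y][x] by growing regions across adjacent pixels."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    labels = [[[None] * width for _ in range(height)] for _ in range(depth)]
    next_label = 0
    for seed in product(range(depth), range(height), range(width)):
        t, y, x = seed
        if labels[t][y][x] is not None:
            continue  # already absorbed by an earlier region
        labels[t][y][x] = next_label
        queue = deque([seed])
        while queue:
            ct, cy, cx = queue.popleft()
            for dt, dy, dx in NEIGHBOURS:
                nt, ny, nx = ct + dt, cy + dy, cx + dx
                if not (0 <= nt < depth and 0 <= ny < height and 0 <= nx < width):
                    continue
                if labels[nt][ny][nx] is not None:
                    continue
                if distance(volume[ct][cy][cx], volume[nt][ny][nx]) <= threshold:
                    labels[nt][ny][nx] = next_label
                    queue.append((nt, ny, nx))
        next_label += 1
    return labels, next_label


# Two frames of a one-row image; the value-5 "vehicle" moves one pixel right.
volume = [
    [[0, 5, 5, 0, 0]],
    [[0, 0, 5, 5, 0]],
]
labels, region_count = grow_regions(volume, lambda a, b: abs(a - b), threshold=1)
```

The vehicle pixels in both frames receive the same label because they touch diagonally across time, which is the effect of using the full 26-neighbourhood rather than only spatial neighbours.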
The pixels belonging to the same vehicle may not be adjacent in
the spatio-temporal domain. To deal with this situation, we
developed a probabilistic relaxation-based approach (Fuse and
Shimizu, 2000).
Applying only the clustering described above may not be enough
to recognize vehicles. The most characteristic