The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B3b. Beijing 2008
Figure 4. Foreground segmentation result with connected components superimposed with best-fitted ellipses.
To ensure that only one person appears in each image of the PTZ-camera, only the category of stand-alone persons is of further interest. This category can be distinguished from the other categories by superimposing a best-fitted ellipse on each connected component. The ratio of the two semi-axes and the height of the component in object space are computed. If both criteria fall within predefined tolerance ranges, the existence of a stand-alone person is assumed.
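As a sketch, this test can be expressed as a small predicate; the tolerance ranges below are illustrative assumptions, since the concrete values are not stated here:

```python
def is_standalone_person(semi_major, semi_minor, height_m,
                         axis_ratio=(2.0, 5.0), height_range=(1.4, 2.1)):
    """Check whether a connected component is a stand-alone person.

    semi_major, semi_minor: semi-axes of the best-fitted ellipse (pixels)
    height_m: height of the component in object space (metres)
    axis_ratio, height_range: tolerance ranges (assumed, for illustration)
    """
    ratio = semi_major / semi_minor
    return (axis_ratio[0] <= ratio <= axis_ratio[1]
            and height_range[0] <= height_m <= height_range[1])
```

Both criteria must hold; a component that is too round (ratio near 1) or too short is rejected as a non-person or a group.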
The second approach uses classifiers for persons based on AdaBoost (Viola & Jones, 2001). The principle of AdaBoost is to combine many weak classifiers, in this case based on rectangular features, into a strong classifier. Because the weak classifiers are arranged in a cascade, the algorithm is very fast. It can be applied to the original image data as well as to the segmented foreground data. In the latter case a dilation of the segmented foreground is required to close gaps in the segmented objects.
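The cascade principle can be sketched in a few lines: each stage takes a weighted vote of its weak classifiers, and a candidate is rejected as soon as one stage vote falls below its threshold. The toy threshold classifiers below stand in for the real rectangular features:

```python
def cascade_classify(x, stages):
    """Cascade of boosted stages (sketch of the Viola-Jones idea).

    stages: list of (weak_classifiers, alphas, threshold); each weak
    classifier maps a sample to +1 or -1, alphas are the AdaBoost weights.
    """
    for weaks, alphas, threshold in stages:
        score = sum(a * h(x) for h, a in zip(weaks, alphas))
        if score < threshold:
            return False  # rejected early: most negatives exit here
    return True  # accepted: passed every stage

# Toy example with two stages (illustrative thresholds and weights).
stages = [
    ([lambda x: 1 if x > 0 else -1], [1.0], 0.0),
    ([lambda x: 1 if x > 5 else -1,
      lambda x: 1 if x < 20 else -1], [0.6, 0.4], 0.1),
]
```

The speed comes from the early exits: the vast majority of image regions are non-persons and are discarded by the first, cheapest stages.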
OpenCV offers pre-trained classifiers for the detection of faces, upper and lower bodies, and whole human bodies. The whole-body detector works quite well in non-complex scenes and can also be applied to finding single persons, but it fails when some body parts are occluded. Thus this detector alone is not sufficient for our tracking purposes. In complex scenes the upper part of the body is far less occluded than the lower part, so using the upper-body and face detectors is reasonable. The drawback of the face detector is that it requires a larger scale of the pedestrian (at least 30 × 30 pixels for the head) and that the person must face the camera. Efficient pedestrian detection can be achieved by using different detectors depending on the position and orientation of the individuals within the image.
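The selection logic described above might look as follows; the input flags and the 30-pixel threshold follow the text, while the overall heuristic is an illustrative assumption:

```python
def select_detector(head_size_px, is_frontal, lower_body_occluded):
    """Choose a pre-trained detector for a pedestrian candidate.

    head_size_px: approximate head size in pixels
    is_frontal: whether the person faces the camera
    lower_body_occluded: whether the lower body is likely hidden
    (all inputs are assumed to come from earlier processing steps)
    """
    if is_frontal and head_size_px >= 30:
        return "face"        # needs at least 30 x 30 pixels and frontal view
    if lower_body_occluded:
        return "upper_body"  # upper body is far less occluded
    return "full_body"       # works well in non-complex scenes
```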
Pedestrian tracking
The challenge in tracking is finding the correct temporal correspondences of detected objects in successive frames. This is a minor problem as long as there is just one object or several clearly separated objects in the scene, but it becomes harder when objects split, merge, or occlude each other.
For tracking the connected components we apply the kernel-based approach of Comaniciu et al. (2003); tracking of the pedestrians detected with AdaBoost is performed using Kalman filtering. Since the test scenes contain only few occlusions, these approaches have satisfied our demands so far. A deeper evaluation for more complex scenes is in progress.
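As an illustration of the Kalman-filter part, a minimal constant-velocity filter for one image coordinate could look like this; the state model and noise parameters are assumptions, not values from our implementation:

```python
class KalmanCV1D:
    """Constant-velocity Kalman filter for one image coordinate (sketch).

    State: [position, velocity]; measurement: position only.
    q, r: assumed process and measurement noise variances.
    """
    def __init__(self, pos0, q=0.01, r=1.0):
        self.x = [pos0, 0.0]                # state estimate
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q, self.r = q, r

    def predict(self, dt=1.0):
        p, v = self.x
        self.x = [p + dt * v, v]            # x = F x, F = [[1, dt], [0, 1]]
        P = self.P                          # P = F P F^T + Q
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.x[0]

    def update(self, z):
        y = z - self.x[0]                   # innovation, H = [1, 0]
        s = self.P[0][0] + self.r           # innovation variance
        k0 = self.P[0][0] / s               # Kalman gain
        k1 = self.P[1][0] / s
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        p00, p01 = self.P[0]                # P = (I - K H) P
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [self.P[1][0] - k1 * p00, self.P[1][1] - k1 * p01]]
        return self.x[0]
```

Fed with the positions of a pedestrian moving at roughly constant speed, the filter smooths the detections and predicts the position in the next frame, which is what the correspondence search needs.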
3.2 Computation of 3D-Positions
All computations so far were executed in the image space of the video sequence. To record the trajectories and to calculate the orientation of the PTZ-camera, a transformation of the positions of the detected individuals into object space is required. In our case the area under observation is approximately a plane surface. The transformation of coordinates from one plane to another can be achieved with a projective transformation:
X = (a0 + a1·x' + a2·y') / (c1·x' + c2·y' + 1)    (1)

Y = (b0 + b1·x' + b2·y') / (c1·x' + c2·y' + 1)    (2)

where
X, Y = object coordinates
x', y' = image coordinates
a0, a1, a2, b0, b1, b2, c1, c2 = transformation parameters
The eight transformation parameters are determined from at least four planar control points, which should be placed in the corners of the observation area. When more than four control points are used, the parameters are determined by an adjustment.
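With the parameters known, applying equations (1) and (2) is straightforward; a short sketch with the parameter names taken from the equations:

```python
def image_to_object(x_img, y_img, a, b, c):
    """Projective transformation from image to object plane, eqs. (1)-(2).

    a = (a0, a1, a2), b = (b0, b1, b2), c = (c1, c2): the eight parameters
    Returns the object coordinates (X, Y).
    """
    denom = c[0] * x_img + c[1] * y_img + 1.0
    X = (a[0] + a[1] * x_img + a[2] * y_img) / denom
    Y = (b[0] + b[1] * x_img + b[2] * y_img) / denom
    return X, Y
```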
Since we assume that all persons are upright, the pixels of a connected component correspond to different height coordinates in object space. To obtain the correct position of a completely visible person, the bottom pixel of the component must be transformed.
Unfortunately, in complex scenes the feet of a person often cannot be seen because of occlusions. In this case a mean person height of 1.75 metres is assumed and the top pixel is transformed into a parallel plane.
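This fallback can be expressed with two plane transforms, one for the ground and one for the assumed head plane at 1.75 m; the function names are illustrative, and each transform is a projective mapping as in equations (1) and (2):

```python
MEAN_PERSON_HEIGHT = 1.75  # assumed mean body height in metres

def pedestrian_position(bottom_px, top_px, feet_visible,
                        ground_map, head_plane_map):
    """Return the object-space position of a pedestrian.

    ground_map: image-to-object transform for the ground plane
    head_plane_map: the same kind of transform, calibrated for a plane
                    MEAN_PERSON_HEIGHT above the ground
    """
    if feet_visible:
        return ground_map(*bottom_px)   # transform the bottom pixel
    return head_plane_map(*top_px)      # feet occluded: use the top pixel
```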
3.3 Orientation of the PTZ-camera
The parameters of the interior orientation of the PTZ-camera are determined by a test-field camera calibration. With this method the calibrated focal length, principal point, and radial and tangential lens distortions were calculated for the later correction of image distortions. The initial values for the exterior orientation of the PTZ-camera are calculated by spatial resection. If more than three control points are visible in the field of view, the exterior orientation parameters are determined by an adjustment.
To rotate the PTZ-camera towards a point of interest detected by the observation camera, two rotation angles have to be calculated. The first (α) lies in the horizontal plane and the second (β) in a plane perpendicular to the ground plane. After applying these angles, taking the current rotation angles into account, the optical axis of the PTZ-camera points to the centre of the detected object, and the object is imaged in the middle of the high-resolution image of the PTZ-camera.
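The two angles can be computed from the camera and object positions; the sketch below assumes that α is measured in the horizontal plane from the X-axis and β is the elevation above that plane, since the angle conventions are not spelled out here:

```python
import math

def ptz_rotation_angles(camera_xyz, object_xyz):
    """Pan (alpha) and tilt (beta) angles towards an object point.

    camera_xyz, object_xyz: (X, Y, Z) positions in object space.
    Returns angles in radians; zero directions are assumptions.
    """
    dx = object_xyz[0] - camera_xyz[0]
    dy = object_xyz[1] - camera_xyz[1]
    dz = object_xyz[2] - camera_xyz[2]
    alpha = math.atan2(dy, dx)                   # rotation in horizontal plane
    beta = math.atan2(dz, math.hypot(dx, dy))    # elevation above that plane
    return alpha, beta
```

The current pan and tilt of the camera would then be subtracted from α and β to obtain the remaining rotation to execute.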
In the event loop of the motion control, the following steps have to be calculated with known initial values for the exterior orientation and the calibrated focal length c of the PTZ-camera:
• Calculation of the X, Y, Z coordinates of the object location from the x', y' image coordinates of the observation camera with equations (1) and (2).