International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B5. Istanbul 2004
each stereomodel or to determine the 3D object coordinates of
all interest features in a MMVS.
Figure 1 shows our automatic point extraction and transfer
algorithm for DV images. The originally acquired DV signal is
first converted into sequential DV images by the free software
TMPGEnc, available at http://www.tmpgenc.com. The resulting DV
images, as shown in Figure 2, are interlaced and must be
de-interlaced. Blurred de-interlaced images are then selected
automatically using the average image gradient: a de-interlaced
image is tagged as blurred if its average image gradient is less
than 10% of the average gradient of the five non-blurred
de-interlaced images preceding it. Blurred images are not used.
Feature points are then extracted with the Förstner operator
(Förstner, 1993), because it extracts clear and well-defined
features such as corner points. Feature points "flowing" into a
homogeneous area are deleted. The LK, NCC, and our IOFE methods
are then used for point tracking. In the final step, tracking
errors are detected by least squares adjustment and a correlation
coefficient check.
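The blurred-image test described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `average_gradient` and `tag_blurred` are hypothetical, the gradient is approximated with NumPy finite differences, and the first five images are assumed to be non-blurred so that a reference average exists.

```python
import numpy as np

def average_gradient(image):
    """Mean gradient magnitude of a 2D grayscale image (finite differences)."""
    g_row, g_col = np.gradient(np.asarray(image, dtype=float))
    return float(np.mean(np.hypot(g_row, g_col)))

def tag_blurred(images, ratio=0.10, history=5):
    """Return one flag per image; True means the image is tagged as blurred.

    An image is tagged when its average gradient is below `ratio` times the
    mean average gradient of the last `history` non-blurred images before it.
    Assumption: images are never tagged until `history` references exist.
    """
    sharp = []   # average gradients of the most recent non-blurred images
    flags = []
    for img in images:
        g = average_gradient(img)
        blurred = len(sharp) == history and g < ratio * np.mean(sharp)
        flags.append(bool(blurred))
        if not blurred:
            sharp = (sharp + [g])[-history:]
    return flags
```

Only images whose flag is False would be passed on to feature extraction.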
DV image acquisition and preprocessing
    DV image acquisition
    Convert DV signal into sequential DV images
    De-interlace DV images
    Blurred image selection and deletion
Feature point extraction and selection
    Feature point extraction by the Förstner operator
    Feature point selection
Feature point tracking
    Point tracking by LK, IOFE, NCC
    Error detection
Figure 1. Flowchart of data processing in the automatic point
extraction and transfer (APET) algorithm used in this paper
Figure 2. Interlaced DV images
This paper uses the LK method to build the optical flow vector
model, and utilizes the finite difference approach and the block
motion model to estimate the related gradients at a point for an
image function S(x,y,t) dependent on the positional and time
variables x, y, and t. Detailed formulas can be found in (Tekalp,
1995). Thus, a displacement vector at a pixel P can be
computed typically from a template mask and a searching mask
of the same size with its centre at the (r,c)-th pixel in two
sequential DV images, respectively. Compared with the typical
LK method, our IOFE method changes this rule and involves
the following steps:
1. Compute the displacement vector (dr, dc) from a template
mask and a searching mask of m×m pixels (e.g. m=11) with their
centres at the (r,c)-th pixel P in two sequential DV images.
2. The template mask remains the same. Move the searching
mask from (r,c) to (r+dr, c+dc). Compute a new
displacement vector (dr', dc').
3. If dr'≠0 or dc'≠0, repeat step 2. Otherwise, stop the
computation at the pixel P.
Normally, the computation is completed after 2 or 3 iterations,
if the displacement vector length is less than 3 pixels. If the
number of iterations is larger than 10, stop the divergent
computation and label the pixel P as an “invalid point”.
Otherwise, label the pixel P as a “valid point”.
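The iterative procedure above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the names `lk_step`, `window` and `iofe_track` are hypothetical, gradients are taken with NumPy finite differences on the template mask, each displacement is rounded to whole pixels before the searching mask is moved, and a mask that leaves the image is treated as an invalid point.

```python
import numpy as np

def lk_step(template, search):
    """One LK displacement estimate (d_row, d_col) between two equal-size masks."""
    g_row, g_col = np.gradient(template)   # spatial gradients of the template
    g_t = search - template                # temporal difference
    G = np.array([[np.sum(g_row * g_row), np.sum(g_row * g_col)],
                  [np.sum(g_row * g_col), np.sum(g_col * g_col)]])
    b = -np.array([np.sum(g_row * g_t), np.sum(g_col * g_t)])
    return np.linalg.lstsq(G, b, rcond=None)[0]

def window(image, r, c, half):
    """Square mask centred at (r, c), or None if it leaves the image."""
    rows, cols = image.shape
    if r - half < 0 or c - half < 0 or r + half >= rows or c + half >= cols:
        return None
    return image[r - half:r + half + 1, c - half:c + half + 1].astype(float)

def iofe_track(img1, img2, r, c, m=11, max_iter=10):
    """Return the accumulated (dr, dc) for a valid point, or None if invalid."""
    half = m // 2
    template = window(img1, r, c, half)    # the template mask never moves
    if template is None:
        return None
    dr_total, dc_total = 0, 0
    for _ in range(max_iter):
        search = window(img2, r + dr_total, c + dc_total, half)
        if search is None:
            return None
        dr, dc = (int(v) for v in np.rint(lk_step(template, search)))
        if dr == 0 and dc == 0:            # step 3: no further movement
            return dr_total, dc_total
        dr_total += dr                     # step 2: move the searching mask
        dc_total += dc
    return None                            # too many iterations: invalid point
```

Rounding to whole pixels is one way to make the stopping test dr'≠0 or dc'≠0 well defined; a sub-pixel variant would need an explicit tolerance instead.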
3. EXPERIMENTS AND ANALYSES
Figure 3 shows a DV image of nearly 2D objects on a wall. Their
sequential DV images are used as test images. Figure 4 shows
the histograms of displacement vector lengths at all valid points
for tracking from the 1st image to the 2nd~7th images, respectively. It
illustrates clearly that the number of valid points (or trackable
points) decreases as the time interval of the aforementioned
image function S(x,y,t) increases. The proportion of trackable
points is 32% at one image interval, and is continuously
reduced to 0% at the time interval of 8 images (from image 1 to
9). Nevertheless, a second peak emerges in the
histogram curves (C)~(F). It means that a large number of
points with a displacement vector length of 18~38 pixels are
still trackable. These histograms also show that a large number
of points (78%~99%) are wrongly tracked, since their
displacement vector lengths are less than the related image shift
distance. Therefore, a mechanism for error detection on the
tracking results is necessary. As shown in Figure 5, the LK
method determines a large number of points with shorter
displacement vectors than the real ones. Its registration
accuracy is 1 pixel, where the affine transformation is used as
the registration model. The IOFE method has a registration
accuracy of 0.511 and 0.415 pixel without and with error
deletion, respectively. Figure 6 shows that the IOFE method
generates tracked point pairs with higher correlation than the
LK method. Table 1 shows the statistics for this set of test
images. It shows that the NCC method has the best registration
accuracy and provides the most valid points, but is the most
time-consuming. The same DV images are also used for tests
with different mask sizes. The results show that the maximal
trackable range remains almost the same, although the mask
size is increased from 11x11 to 41x41. Figure 7 shows some
test results of a 3D scene. Visual check verifies that the IOFE
method provides better results than the LK method. Figure 8
and Table 2 show that both the IOFE and NCC methods can
efficiently track points in DV images at 60 fps (frames per
second).
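The registration accuracy quoted above relies on an affine registration model fitted to the tracked point pairs. A minimal sketch of such a fit, and of one possible residual-based error deletion, is given below. The function names are hypothetical and the rejection rule (residual larger than k times the RMSE, with k=3) is an assumption made for illustration; the text does not state the rule that was actually used.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src (n,2) points onto dst (n,2).

    Returns the 3x2 parameter matrix and the per-point residual vectors.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.column_stack([src, np.ones(len(src))])    # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params, dst - A @ params

def registration_rmse(residuals):
    """Root mean square of the residual vector lengths, in pixels."""
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))

def inlier_mask(src, dst, k=3.0):
    """True for point pairs whose residual is at most k times the RMSE.

    Assumption: this threshold is illustrative, not taken from the paper.
    """
    _, residuals = fit_affine(src, dst)
    errors = np.linalg.norm(residuals, axis=1)
    return errors <= k * registration_rmse(residuals)
```

Refitting on the retained pairs and recomputing the RMSE would give a registration accuracy "with error deletion" in the sense used above.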