Figure 4. A panoramic image mosaicked from a UAV video image sequence
3. MOTION DETECTION
Motion compensation reduces the impact of background motion, but some residual influence remains in the stabilized image. Motion detection divides the video image into target and background according to whether each region is moving. Many processing methods have been introduced for motion detection, and their common point is the use of motion information. For a static background, the processing usually operates on the background itself, as in background modeling methods. For a moving background, the dynamic image is assumed to consist of two partitions, target and background; if there is more than one target in the video, the image is segmented into a number of partitions corresponding to the targets, and some methods place the targets on different layers to speed up the processing. The primary cue for detection is motion information, i.e., the intensity changes between adjacent video frames.
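The intensity-change cue mentioned above can be illustrated with a minimal frame-differencing sketch for frames whose background is static or already stabilized. This is only an illustration of the idea, not the method used in this paper; the function name, the threshold value and the use of OpenCV are our own assumptions.

# Minimal frame-differencing sketch (illustrative only).
import cv2
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, thresh=25):
    """Return a binary mask of pixels whose intensity changed between two
    adjacent 8-bit grayscale frames."""
    diff = cv2.absdiff(curr_frame, prev_frame)            # |I_t - I_{t-1}|
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    # Morphological opening suppresses isolated noise pixels.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)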
3.1 Motion Detection
For video captured by a moving camera, the background motion cannot be counteracted completely by image stabilization, so suppressing the background movement alone may not be effective enough for detecting the moving target. The image information can be classified into three kinds: target, background and noise, and the different classes correspond to different motion fields in the dynamic image. If the class membership of the points were known, it could be used to fit the parameter sets of the different motion regions; conversely, if the motion parameters were known, the pixels could be divided into different fields according to their motion information. In most cases both the memberships and the parameters are unknown, so clustering the image pixels becomes a probabilistic problem. A typical solution for motion classification is to combine a mixture probability model with the EM (Expectation Maximization) algorithm (Weiss et al., 1996).
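As a sketch of the mixture-model/EM idea, a two-component one-dimensional Gaussian mixture can be fitted to per-pixel motion magnitudes. Everything here (the function name, the initialisation, the fixed iteration count) is illustrative, not the formulation of Weiss et al.

# Two-component 1-D Gaussian mixture fitted by EM (illustrative sketch).
import numpy as np

def em_two_gaussians(x, n_iter=50):
    """Fit a two-component Gaussian mixture to motion magnitudes x (shape (N,)).
    Returns the (N, 2) responsibility matrix."""
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])   # crude init
    var = np.array([x.var(), x.var()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel.
        lik = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means and variances.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return resp   # the component with the larger mean corresponds to the target layer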
In practice, we can hypothesize that the dynamic image contains two layers, a background layer and a target layer. After image stabilization, the motion vectors of all pixels are calculated and, under the assumption that the flow vectors of the target layer are larger than those of the background layer, the weights of the mixture model are estimated iteratively. The target is detected once the iteration converges, and the image registration parameters can serve as the initial values of the iteration. Figure 5 presents a detection result for a single vehicle target in three frames.
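The two-layer procedure above can be sketched as follows, assuming the frames have already been stabilized. OpenCV's Farneback dense flow and the em_two_gaussians helper from the previous sketch stand in for whatever flow estimator and mixture estimator are actually used, and the 0.5 responsibility threshold is an arbitrary illustrative choice.

# Two-layer detection on stabilized frames (illustrative sketch).
import cv2
import numpy as np

def detect_moving_pixels(prev_gray, curr_gray):
    """Classify each pixel of two stabilized grayscale frames as target/background."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2).ravel()
    resp = em_two_gaussians(mag)
    # The component with the larger mean flow magnitude is taken as the target layer.
    target_comp = np.argmax([(resp[:, k] * mag).sum() / resp[:, k].sum() for k in (0, 1)])
    return (resp[:, target_comp] > 0.5).reshape(prev_gray.shape)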
Motion segmentation is a kind of video segmentation, since it partitions a video or image sequence into spatio-temporal regions based on motion information; it is therefore essentially the same problem as motion detection. Generally, motion segmentation falls into two basic classes: optical flow segmentation methods and direct methods (Bovik et al., 2005). In the ideal case there are just two kinds of optical flow, associated with the movements of the background and the target. However, optical flow is not an exact reflection of the motion field but rather an expression of illumination change, so it is not rigorous to perform the segmentation with the optical flow information alone.
A common approach is to perform the segmentation by grouping in a motion feature space; how to relate the clustering to the dynamic image is a further question. Graph theory offers a natural solution for motion segmentation. The pixels of the image sequence can be taken as the nodes of a graph, and partitioning the graph according to motion features segments the image at the same time. The weight of an edge expresses the similarity of the features of the two nodes it connects; in motion segmentation, this similarity is measured on the motion feature vector of each pixel. The graph is not constructed within a single image frame: it should connect all the nodes in a spatio-temporal region, which may span several frames. After the weighted graph is constructed, the video image sequence can be segmented with the normalized cut method (Shi et al., 1998). To reduce the computational complexity, an effective solution is to subsample the image sequence by setting a spatio-temporal window and connecting only the nodes inside this window when constructing the weighted graph, as sketched below.
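The following minimal sketch shows the windowed graph construction and a two-way normalized cut, assuming the motion feature vectors and (x, y, t) coordinates of the sub-sampled pixels are already available. The similarity kernel, the window radius and the eigensolver settings are illustrative choices, not those of Shi et al.

# Windowed affinity graph + two-way normalized cut (illustrative sketch).
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh

def ncut_bipartition(features, coords, radius=5.0, sigma_f=1.0, sigma_d=3.0):
    """features: (N, d) motion feature vectors; coords: (N, 3) (x, y, t) positions.
    Returns a boolean label per node giving a two-way partition."""
    n = len(features)
    rows, cols, vals = [], [], []
    for i in range(n):
        # Connect only nodes inside the spatio-temporal window (sub-sampling step).
        dist = np.linalg.norm(coords - coords[i], axis=1)
        for j in np.nonzero(dist < radius)[0]:
            if j == i:
                continue
            w = np.exp(-np.sum((features[i] - features[j]) ** 2) / sigma_f ** 2
                       - dist[j] ** 2 / sigma_d ** 2)
            rows.append(i); cols.append(j); vals.append(w)
    W = csr_matrix((vals, (rows, cols)), shape=(n, n))
    W = 0.5 * (W + W.T)                                   # symmetrise
    deg = np.asarray(W.sum(axis=1)).ravel() + 1e-9
    D = diags(deg)
    # Second-smallest generalized eigenvector of (D - W) x = lambda D x.
    eigvals, eigvecs = eigsh(D - W, k=2, M=D, which='SA')
    fiedler = eigvecs[:, np.argsort(eigvals)[1]]
    return fiedler > np.median(fiedler)                   # two spatio-temporal regions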
4. OBJECT TRACKING
After the location of the target in the image has been detected, object tracking persistently locks onto the position of the target over a period of time. The basic idea of object tracking is to model the object according to its characteristic features and to choose an appropriate tracking method. Unlike motion detection, which emphasizes accuracy, object tracking cannot afford to spend too much time on computation and must balance processing speed with precision, so it has to abstract the target through feature extraction and object modeling. In simple cases the features used can be the shape, size, direction and velocity of the moving object; in more complex cases they can be a set of feature points, a color space and so on. Combined with the respective technical approaches, these features realize the target tracking. The essence of object modeling is to define the target uniquely, and in