Close-range imaging, long-range vision

  
  
  
  
2. SPATIOTEMPORAL HELIX 
The spatiotemporal domain of a scene comprises two (x,y) 
spatial dimensions and one (t) temporal dimension. Object 
movement is identified by tracing objects in this 3-dimensional 
(x,y,t) space. We introduce the spatiotemporal helix (STH) as a 
compact description of an object’s spatiotemporal behavior. It 
comprises a central spine and annotated prongs (Fig. 1). More 
specifically: 
• The central spine models the spatiotemporal trajectory 
described by the center of the object as it moves over a 
temporal interval. 
• The protruding prongs express expansion or collapse of the 
object’s outline at a specific time instance. 
Figure 1: A spatiotemporal helix (left) and a detail showing the 
azimuth of a prong (right) 
Fig. 1 visualizes the concept of the spatiotemporal helix. 
The spine is the vertical line connecting the nodes (marked as 
white circles), and the prongs are shown as arrows protruding 
from the spine, pointing away from or towards it. The gray blob 
at the base of the spine is the initial outline of the monitored 
object. The helix describes a movement of the object whereby 
the object’s center follows the spine, and the outline is modified 
by the amounts indicated by the prongs at the corresponding 
time instances. 
As a spatiotemporal trajectory, a spine is a sequence of (x,y,t) 
coordinates. It can be expressed in a concise manner as a 
sequence of spatiotemporal nodes S(n^1, ..., n^n). The nodes 
correspond to breakpoints along this trajectory, namely points 
where the object accelerated/decelerated and/or changed its 
orientation. Accordingly, each node n^i is modeled as n^i(x,y,t,q), 
where: 
• (x,y,t) are the spatiotemporal coordinates of the node, and 
• q is a qualifier classifying the node as an acceleration (q^a), 
deceleration (q^d), or rotation (q^r) node. 
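As an illustration of how such nodes might be derived from a sampled trajectory, the following is a minimal sketch and not the generalization method used in this paper. It assumes strictly increasing time stamps, assigns a single qualifier per node (rotation taking precedence, although a breakpoint may in principle be both), and uses hypothetical names and thresholds (SpineNode, find_nodes, speed_tol, turn_tol_deg) along with hypothetical qualifier labels "a", "d" and "r".

```python
import math
from dataclasses import dataclass


@dataclass
class SpineNode:
    x: float
    y: float
    t: float
    q: str  # hypothetical labels: "a" acceleration, "d" deceleration, "r" rotation


def find_nodes(track, speed_tol=0.5, turn_tol_deg=20.0):
    """Flag breakpoints in a list of (x, y, t) samples ordered by t.

    speed_tol and turn_tol_deg are illustrative thresholds (assumptions).
    """
    nodes = []
    for i in range(1, len(track) - 1):
        (x0, y0, t0), (x1, y1, t1), (x2, y2, t2) = track[i - 1], track[i], track[i + 1]
        # Speed on the incoming and outgoing segments.
        v_in = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        v_out = math.hypot(x2 - x1, y2 - y1) / (t2 - t1)
        # Heading change between the two segments, wrapped to [0, 180] degrees.
        h_in = math.atan2(y1 - y0, x1 - x0)
        h_out = math.atan2(y2 - y1, x2 - x1)
        diff = h_out - h_in
        turn = abs(math.degrees(math.atan2(math.sin(diff), math.cos(diff))))
        if turn > turn_tol_deg:
            nodes.append(SpineNode(x1, y1, t1, "r"))
        elif v_out - v_in > speed_tol:
            nodes.append(SpineNode(x1, y1, t1, "a"))
        elif v_in - v_out > speed_tol:
            nodes.append(SpineNode(x1, y1, t1, "d"))
    return nodes
```

Samples where neither speed nor heading changes beyond the thresholds produce no node, which is what keeps the spine description concise.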
Each prong is a model of the local expansion or collapse of the 
outline at a specific time instance, and is represented in Fig. 1 by 
a horizontal arrow pointing away from (expansion) or towards 
(collapse) the spine. It is modeled as p^i(t,r,a_1,a_2) where: 
• t is the corresponding temporal instance (intersection of the 
prong and the spine), 
• r is the magnitude of this outline modification, expressed as 
a percentage of the distance between the center of the object 
and the outline, with positive numbers expressing expansion 
(corresponding arrows pointing away from the spine) and 
negative numbers indicating collapse (arrows pointing 
towards the spine), 
• a_1, a_2 is the range of azimuths where this modification 
occurs, with each azimuth measured as a left-handed angle 
from the North (y) axis. 
As shown in Fig. 1, we can have more than one prong at a single 
instance, as it is possible for an object to be expanding in one 
direction while shrinking in another at the same time. While in 
general prongs correspond to small ranges over an outline, by 
properly assigning values to the azimuth parameters of a prong 
we can also model global expansion/collapse (a_1=0, a_2=360). 
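The effect of prongs on an outline can be sketched as follows. This is an illustrative reading of the prong model and not the authors' implementation: it assumes the outline is stored as center-to-outline distances sampled at discrete azimuths (in degrees), and the names Prong, in_range and apply_prongs, as well as the field names a1 and a2 for the two azimuth bounds, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prong:
    t: float   # time instance on the spine
    r: float   # outline change in percent (+ expansion, - collapse)
    a1: float  # start of the azimuth range, degrees from North
    a2: float  # end of the azimuth range, degrees from North


def in_range(az, a1, a2):
    """True if azimuth az lies in [a1, a2], allowing ranges that wrap through North."""
    if a2 - a1 >= 360:
        return True  # global expansion/collapse
    az, a1, a2 = az % 360, a1 % 360, a2 % 360
    if a1 <= a2:
        return a1 <= az <= a2
    return az >= a1 or az <= a2


def apply_prongs(radii, prongs, t):
    """Return a new outline after applying all prongs attached to time t.

    radii maps azimuth (degrees) to the center-to-outline distance.
    Exact matching on t is a simplification made for this sketch.
    """
    out = dict(radii)
    for p in prongs:
        if p.t != t:
            continue
        for az in out:
            if in_range(az, p.a1, p.a2):
                out[az] *= 1 + p.r / 100.0
    return out
```

Two prongs with different azimuth ranges at the same time instance then expand one side of the outline while collapsing another, and a prong spanning 0 to 360 rescales the whole outline.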
Combined, spine and prongs comprise a concise signature of an 
object’s spatiotemporal behavior. They express external (spine) 
and internal (prongs) processes affecting an object’s position 
and shape, and allow efficient spatiotemporal modeling to 
support complex analysis. We have developed automated 
techniques to collect the information required to create 
spatiotemporal helixes. The generalization of point trajectories 
is accomplished using a variation of self-organizing maps 
(SOM), a class of artificial neural networks. The creation of 
prongs is based on deformable object theory. In the next 
sections we will present a novel approach developed by our 
group to track variations of object outlines and movement, to 
provide the information necessary to produce a spatiotemporal 
helix. 
3. TRACKING SPINE MOVEMENT WITH SELF-ORGANIZING MAPS 
Trajectories can be perceived as paths in spatiotemporal 
space. Accordingly, they resemble roads as they are depicted in 
2-d imagery. Self-organizing maps (SOM, [Kohonen, 1982]) 
provide a method to extract linear features from raster datasets. 
The SOM belongs to a class of artificial neural networks (ANN) 
characterized by unsupervised and competitive learning. It is 
unsupervised in that the procedure runs automatically, without 
any a priori human interaction with the input dataset. The input 
space in our case is the set of n point coordinates (x,y,t) of the 
center of mass of the phenomenon as it evolves through time. 
According to the SOM algorithm, a set of m<n neurons (nodes) 
is used to represent the input space. The procedure is based on 
competition among the nodes, which attempt to best map the 
points of the input space. The goal of competitive learning is to 
reward the node that best satisfies a similarity measure when a 
given input point is compared against all nodes. 
The outcome of this analysis serves two purposes. First, it 
provides a concise and inclusive representation of the input 
space for visualization purposes. Second, dynamic 
generalization is achieved by varying the solution variables, 
which yields a more or less detailed description of the input 
space. More nodes provide a more detailed representation, 
while fewer nodes yield a more concise description. 
The basic SOM algorithm is summarized as follows [Kohonen, 
1982]: 
• Initialize the synaptic weight vectors W(n-1) for K nodes. 
• Randomly draw an unseen sample X(n) from the input 
space. 
• Determine the winning node q using a similarity metric as 
in equation. 
• Update W for winners. 
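The four steps can be sketched for a one-dimensional chain of nodes fitted to (x,y,t) samples. This is a minimal illustration and not the variation developed in this paper: the Euclidean distance as similarity metric, the Gaussian neighbourhood, the linear decay schedules and the initialization along the line between the first and last sample are all assumptions, and the names train_som, lr0 and sigma0 are hypothetical.

```python
import math
import random


def train_som(points, k, epochs=100, lr0=0.5, sigma0=None, seed=0):
    """Fit a chain of k >= 2 nodes to a list of (x, y, t) samples."""
    rng = random.Random(seed)
    if sigma0 is None:
        sigma0 = max(k / 2.0, 1.0)
    # Step 1: initialize K weight vectors (here: evenly spaced between
    # the first and the last input sample, an assumption of this sketch).
    first, last = points[0], points[-1]
    weights = [[f + (l - f) * j / (k - 1) for f, l in zip(first, last)]
               for j in range(k)]
    total = epochs * len(points)
    step = 0
    for _ in range(epochs):
        order = list(range(len(points)))
        rng.shuffle(order)
        for idx in order:
            # Step 2: draw a sample not yet seen in this epoch.
            x = points[idx]
            frac = step / total
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.01
            # Step 3: winning node q = closest weight vector (Euclidean).
            q = min(range(k), key=lambda j: math.dist(weights[j], x))
            # Step 4: move the winner and its chain neighbours toward x.
            for j in range(k):
                h = math.exp(-((j - q) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(x)):
                    weights[j][d] += lr * h * (x[d] - weights[j][d])
            step += 1
    return weights
```

Calling train_som with a larger k gives a more detailed chain of nodes, while a smaller k gives a coarser generalization of the same trajectory.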