2. SPATIOTEMPORAL HELIX
The spatiotemporal domain of a scene comprises two (x,y)
spatial dimensions and one (t) temporal dimension. Object
movement is identified by tracing objects in this 3-dimensional
(x,y,t) space. We introduce the spatiotemporal helix (STH) as a
compact description of an object’s spatiotemporal behavior. It
comprises a central spine and annotated prongs (Fig. 1). More
specifically:
• The central spine models the spatiotemporal trajectory
described by the center of the object as it moves over a
temporal interval.
• The protruding prongs express expansion or collapse of the
object’s outline at a specific time instance.
Figure 1: A spatiotemporal helix (left) and a detail showing the
azimuth of a prong (right)
Fig. 1 is a visualization of the concept of the spatiotemporal helix.
The spine is the vertical line connecting the nodes (marked as
white circles), and the prongs are shown as arrows protruding
from the spine, pointing away from or towards it. The gray blob
at the base of the spine is the initial outline of the monitored
object. The helix describes a movement of the object whereby
the object’s center follows the spine, and the outline is modified
by the amounts indicated by the prongs at the corresponding
time instances.
As a spatiotemporal trajectory, a spine is a sequence of (x,y,t)
coordinates. It can be expressed in a concise manner as a
sequence of spatiotemporal nodes S(n^1, ..., n^k). The nodes
correspond to breakpoints along this trajectory, namely points
where the object accelerated, decelerated, and/or changed its
orientation. Accordingly, each node n^i is modeled as n^i(x,y,t,q),
where:
• (x,y,t) are the spatiotemporal coordinates of the node, and
• q is a qualifier classifying the node as an acceleration (q^a),
deceleration (q^d), or rotation (q^r) node.
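To make the node model concrete, the following is a minimal sketch of a spine as a list of (x,y,t,q) records. The names `Qualifier`, `SpineNode`, and `center_at` are hypothetical, and the linear interpolation between consecutive nodes is an assumption made for illustration; the text only states that nodes are breakpoints of the trajectory.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Qualifier(Enum):
    """Node qualifier q (labels are illustrative)."""
    ACCELERATION = "a"
    DECELERATION = "d"
    ROTATION = "r"


@dataclass(frozen=True)
class SpineNode:
    """One spatiotemporal node n(x, y, t, q) of a spine."""
    x: float
    y: float
    t: float
    q: Qualifier


def center_at(spine: List[SpineNode], t: float) -> Tuple[float, float]:
    """Estimate the object's center at time t.

    Assumption: the center moves linearly between consecutive nodes.
    Times outside the spine's interval are clamped to its end nodes.
    """
    nodes = sorted(spine, key=lambda n: n.t)
    if not nodes:
        raise ValueError("spine has no nodes")
    if t <= nodes[0].t:
        return (nodes[0].x, nodes[0].y)
    for a, b in zip(nodes, nodes[1:]):
        if a.t <= t <= b.t:
            w = (t - a.t) / (b.t - a.t)  # fraction of the segment covered
            return (a.x + w * (b.x - a.x), a.y + w * (b.y - a.y))
    return (nodes[-1].x, nodes[-1].y)
```

For example, a spine with an acceleration node at (0, 0, 0) and a rotation node at (10, 0, 10) places the center halfway along the segment when queried with `center_at(spine, 5.0)`.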
Each prong is a model of the local expansion or collapse of the
outline at a specific time instance, and is represented in Fig. 1 by
a horizontal arrow pointing away from (expansion) or towards
(collapse) the spine. It is modeled as p^i(t,r,a_1,a_2) where:
• t is the corresponding temporal instance (intersection of the
prong and the spine),
• r is the magnitude of this outline modification, expressed as
a percentage of the distance between the center of the object
and the outline, with positive numbers expressing expansion
(corresponding arrows pointing away from the spine) and
negative numbers indicating collapse (arrows pointing
towards the spine), and
• [a_1, a_2] is the range of azimuths where this modification
occurs, with each azimuth measured as a left-handed angle
from the North (y) axis.
As shown in Fig. 1, we can have more than one prong at a single
instance, as it is possible for an object to be expanding in one
direction while shrinking in another at the same time. While in
general prongs correspond to small ranges over an outline, by
properly assigning values to the azimuth parameters of a prong
we can also model global expansion/collapse (a_1=0, a_2=360).
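The effect of a prong on an outline can be illustrated with a small sketch. It assumes, purely for illustration, that the outline is stored as a list of center-to-outline distances sampled at equal azimuth steps starting from North; the names `Prong` and `apply_prong` are hypothetical, and azimuth ranges that wrap past 360 degrees are not handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Prong:
    """One prong p(t, r, a1, a2)."""
    t: float   # time instance on the spine
    r: float   # percent change of the center-to-outline distance
    a1: float  # start of the azimuth range, in degrees
    a2: float  # end of the azimuth range, in degrees


def apply_prong(radii: List[float], prong: Prong) -> List[float]:
    """Return a new outline with the prong's change applied.

    Assumption: radii[i] is the center-to-outline distance at azimuth
    i * 360 / len(radii) degrees, and a1 <= a2.
    """
    n = len(radii)
    step = 360.0 / n
    scale = 1.0 + prong.r / 100.0  # r > 0 expands, r < 0 collapses
    out = list(radii)
    for i in range(n):
        azimuth = i * step
        if prong.a1 <= azimuth <= prong.a2:
            out[i] = radii[i] * scale
    return out
```

With a circular outline of radius 10 sampled every degree, `Prong(t=3, r=20, a1=0, a2=360)` expands every sample to 12 (global expansion), while `Prong(t=3, r=-50, a1=90, a2=180)` halves only the samples between azimuths 90 and 180. Applying two prongs with the same t in sequence models an object that expands in one direction while shrinking in another.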
Combined, spine and prongs comprise a concise signature of an
object’s spatiotemporal behavior. They express external (spine)
and internal (prongs) processes affecting an object’s position
and shape, and allow efficient spatiotemporal modeling to
support complex analysis. We have developed automated
techniques to collect the information required to create
spatiotemporal helixes. The generalization of point trajectories
is accomplished using a variation of self-organized maps
(SOM), a class of artificial neural networks. The creation of
prongs is based on deformable object theory. In the next
sections we will present a novel approach developed by our
group to track variations of object outlines and movement, to
provide the information necessary to produce a spatiotemporal
helix.
3. TRACKING SPINE MOVEMENT
WITH SELF-ORGANIZING MAPS
Trajectories can be perceived as paths in the spatiotemporal
space. Accordingly, they resemble roads as they are depicted in
2-d imagery. Self-organizing maps (SOM, [Kohonen, 1982])
provide a method to extract linear features from raster datasets.
The SOM belongs to a class of artificial neural networks (ANN)
characterized by unsupervised and competitive learning. It is
unsupervised in that the procedure runs automatically, without
any a priori human interaction with the input dataset. The input
space in our case is the set of n point coordinates (x,y,t) of the
center of mass of the phenomenon as it evolves through time.
According to the SOM algorithm, a set of m<n neurons (nodes) is
used to represent the input space. The procedure is based on
competition among the nodes, which attempt to best map the
points of the input space. The goal of competitive learning is to
reward the node that, among all nodes, best satisfies a
similarity measure with respect to a given input point.
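The competitive step described above can be sketched with a generic one-dimensional Kohonen chain fitted to (x,y,t) points. This is not the variation of SOM used by the authors. Euclidean distance as the similarity measure, the Gaussian neighborhood, the linear decay schedules, the initialization along the time-ordered input, and the names `winner` and `fit_chain_som` are all assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, t)


def winner(nodes: Sequence[Sequence[float]], sample: Point) -> int:
    """Index of the node closest to the sample (Euclidean distance assumed)."""
    return min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))


def fit_chain_som(points: Sequence[Point], m: int, epochs: int = 200,
                  lr0: float = 0.5, seed: int = 0) -> List[Point]:
    """Fit m < n chain nodes to n input points with a basic 1-D SOM."""
    if not 2 <= m < len(points):
        raise ValueError("need 2 <= m < number of input points")
    rng = random.Random(seed)
    ordered = sorted(points, key=lambda p: p[2])
    # Assumption: start the chain on evenly spaced, time-ordered inputs.
    last = len(ordered) - 1
    nodes = [list(ordered[round(i * last / (m - 1))]) for i in range(m)]
    sigma0 = max(m / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighborhood
        for sample in rng.sample(ordered, len(ordered)):
            q = winner(nodes, sample)
            for i, node in enumerate(nodes):
                # Reward the winner fully and its chain neighbors partially.
                h = math.exp(-((i - q) ** 2) / (2.0 * sigma ** 2))
                for d in range(3):
                    node[d] += lr * h * (sample[d] - node[d])
    return [(n[0], n[1], n[2]) for n in nodes]
```

Calling `fit_chain_som(points, m)` with a larger m returns more nodes and hence a more detailed description of the trajectory, while a smaller m returns a more concise one.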
The outcome of this analysis serves two purposes. First, it
yields a concise and inclusive representation of the input space
for visualization purposes. Second, it supports dynamic
generalization: varying the solution variables yields a more or
less detailed description of the input space. More nodes provide
a more detailed representation, while fewer nodes yield a more
concise description.
The basic SOM algorithm is summarized as follows [Kohonen,
1982]:
• Initialize the synaptic weight vectors W(n-1) for K nodes.
• Randomly draw an unseen sample X(n) from the input
space.
• Determine the winning node q using a similarity metric as
in equation.
• Update W for winners.