Full text: Systems for data processing, anaylsis and representation

  
A 2-d search algorithm based on the Morton 
order is compared, in this paper, with a new 
algorithm based on row-major ordering, in 
terms of its simplicity and performance. 
Problems identified with the range search 
algorithm using Morton codes are discussed, 
and solutions are suggested to overcome these 
problems. As a result, an extended version of 
the 2-d range search algorithm to a multi- 
dimensional solution , using the layered 
approach, has been achieved. 
2. SUMMARY OF THE 2-D ALGORITHM 
BASED ON MORTON ORDER 
A detailed description of the range search 
algorithm based on Morton order has been 
given by Yang [1992] and Lee and Yang 
[1993]. The basic concept on which that 
algorithm was developed is that: over-searches 
are reduced by keeping the search within the 
query window. The moment a data point is 
found to be outside the window, the next entry 
point of the window is accessed by searching 
along the edges of the window. A data point 
immediately after that is called the Forward 
Point (FP). Some optimizations, based on the 
properties of Morton curve, are then applied, to 
further reduce the search over-head. The major 
advantage of this strategy is that it does not need 
additional data structures to support the search, 
and the search time depends on the query 
window with less influence from the size of the 
database. 
The performance of that algorithm was judged 
to largely depend on how fast FP could be 
found. The preprocessing required to create a 
sorted list of Morton codes is of O(NLogN), 
where N is the number of data points. The 
storage complexity is O(N). The processing 
time required to find an FP, and retrieve the data 
at or after FP, is O(KLogL + KLogN); where L 
is the average length of an edge in resolution 
units, and K is the number of times the Morton 
curve exits the window. On a fully-populated 
space, N > L, and the search over-head 1s 
bounded by KLogN. K depends on the size 
and location of the query window. The time it 
takes to retrieve I data points within a window 
of L by L units is O(I + KLogL + KLogN). 
3. PROBLEMS WITH RANGE SEARCH 
BASED ON MORTON ORDER 
98 
The following problems have been identified 
with the search algorithm based on Morton 
order: 
1. The time for finding a forward point (FP). 
This depends on the size and location of the 
query window within the application space. 
The recursion that generates the Morton codes is 
quadrant-based. Therefore, as the query 
window cuts across different quadrants, the 
processing of the forward points take on 
different characteristics. For example, for a 
given 2-d space as in Figure la, the query time 
for the same size of a window in location [A] is 
expected to be significantly lower than that in 
location [B] which cuts across higher levels of 
quadrants. 
2. Costly overhead for rotated or generally 
irregular window. The 2-d search algorithm 
works only for a query window which sides are 
orthogonal to the co-ordinate axes. This is 
because its efficiency depends on the properties 
of Morton order, namely: (a) the Morton codes 
on lines parallel to the axes of the space are 
sorted; and (b) for any rectangular window 
orthogonal to the axes, the south-west corner 
(SW) has the smallest Morton code, and the 
north-east corner (NE) has the greatest code. 
This constraint necessitates the forming of 
bounding rectangles over query windows which 
have irregular sides, or rectangular windows 
which are not orthogonal to the co-ordinate 
axes. 
As a result of this constraint, over-searches 
corresponding to the shaded area of Figure 1b 
arise. The accuracy of the search could be 
jeopardized, and any attempt to improve the 
accuracy may be at the expense of search time. 
3. Extension of the concept to multi- 
dimensional range search. It seems possible to 
extend the 2-d algorithm to involve domains of 
n-d spaces. In this case, the file is a collection 
of records, each of which is identified by an 
ordered n-tuple of keys {W1, W2, ..., Wn} 
which can be viewed as a point in an n-d 
Cartesian space. The Cartesian space 
formulation is the abstraction of a variety of 
important applications, often referred to as 
multikey search [Knuth, 1973; Yang, 1992]. In 
the Cartesian space, aspatial problems can bear 
geometric significance if the record attributes are 
regarded as co-ordinates and the n-values for 
each record as representing a point in an n-d co- 
ordinate system. 
One ca 
the n-d 
figure 
analogy 
the hyp 
the 2-d 
window 
involve 
possible 
These p 
range 
generat 
scheme 
Morton 
some ac 
concept 
4. THE 
To bene 
search a 
the n-d 
collectic 
consecu 
west co 
plane). 
projecti 
coding « 
Such a 
first lay: 
within tl 
north-e 
maximu 
South-w 
equal to 
plus 1.
	        
Waiting...

Note to user

Dear user,

In response to current developments in the web technology used by the Goobi viewer, the software no longer supports your browser.

Please use one of the following browsers to display this page correctly.

Thank you.