Systems for data processing, analysis and representation

Allam, Mosaad; Plunkett, Gordon
1. INTRODUCTION 
In the past few years, substantial development
has taken place in the management of large
spatial objects such as digital imagery, digital
terrain models and scanned maps, mainly due to the
interest in building multimedia spatial
information systems and global environmental
information systems. This development can be
roughly classified into two categories. The first 
category relies on system integration, which uses
two or more different systems, such as an image
processing system to handle image objects and a
Database Management System (DBMS) to handle
non-image objects such as text and graphics. Examples
can be found in Chang [1990], Zhou Q. [1989],
Wegener [1989], and Zhou [1991]. The second
category relies on next-generation database
systems that use Abstract Data Types (ADTs) or an
object-oriented data model to handle large objects.
Such systems are described in Lohman [1989],
Orenstein [1989], Deux [1990], Gupta [1992], and
Stonebraker [1993].
However, there are no generally accepted solutions
for GISs at this time. With the first method, the two
systems are only loosely integrated. Large objects in
the image processing system are processed
independently, and the results are converted into
the DBMS to perform GIS operations. This not only
limits the use of the DBMS for large object
management, but also makes the data processing
unnecessarily complicated and time consuming
because of the multiple data conversions. With the
second method, all large objects are treated as long
binary data strings with little semantics or data
abstraction associated with them. This is not only
inefficient for data processing, because the whole
data set may need to be read, written and processed
together, but it also makes many kinds of
interactive data processing impossible.
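The cost of treating an image as one opaque byte string can be illustrated with a minimal Python sketch. It assumes a hypothetical single-band, 8-bit raster stored in row-major order. The function names and the 1000 x 1000 test raster are illustrative only and are not taken from any of the systems cited above. Extracting a small window from an opaque string forces a read of the whole object, whereas knowing the row width allows the reader to seek directly to the bytes it needs.

```python
import io

def read_window_opaque(blob, width, row0, col0, nrows, ncols):
    """Opaque byte string: the whole object is read before slicing."""
    blob.seek(0)
    data = blob.read()  # every byte of the object is read
    rows = []
    for r in range(nrows):
        start = (row0 + r) * width + col0
        rows.append(data[start:start + ncols])
    return rows, len(data)

def read_window_structured(blob, width, row0, col0, nrows, ncols):
    """Known layout: seek to each window row and read only those bytes."""
    rows, bytes_read = [], 0
    for r in range(nrows):
        blob.seek((row0 + r) * width + col0)
        rows.append(blob.read(ncols))
        bytes_read += ncols
    return rows, bytes_read

# Hypothetical 1000 x 1000 raster whose pixel value is (row + col) mod 256.
width = height = 1000
raster = io.BytesIO(
    bytes((r + c) % 256 for r in range(height) for c in range(width))
)

window_a, read_a = read_window_opaque(raster, width, 10, 20, 4, 5)
window_b, read_b = read_window_structured(raster, width, 10, 20, 4, 5)
print(read_a, read_b)
```

Both calls return the same 4 x 5 window, but the first touches the full object while the second touches only the window itself.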
In this paper we analyze the contents of large objects,
highlight their special features, and develop an
object-oriented model to support them in GISs, using
digital images as examples. We also
investigate their query patterns and present
several methods that can be used to reduce the
amount of data, to improve data retrieval
efficiency, to speed up data queries and to better
support browsing. We then use several
practical GIS query examples to show the
performance improvement obtained with the different
techniques. We conclude the paper with a
discussion of system performance and future
research issues.
2. LARGE SPATIAL OBJECTS
Large spatial objects are often represented by
multi-dimensional matrices using long unstructured
byte strings that are usually stored and transmitted
as a whole. More precisely, a large object consists of
a list of small items and a long data string. The list of
small items is used to interpret the data
format and meaning of the unformatted long data
string that follows it. For image data, these small
items may be the image header; for Digital
Terrain Model (DTM) data, they may be the name
of the region, the coordinates of the origin, the
resolution, the precision, etc. These small items are
often mandatory and are used to interpret the long data
string for display and processing, and/or to identify
and distinguish one data string from others. We
call these small items direct related attribute
data (DRAD). While DRAD is indispensable,
other formatted attribute data describing the
contents and features of large spatial objects is
generally not mandatory. For digital imagery, this
data may be a histogram, a color map, and
interpretation results derived from the original image
data; for a DTM, it may be contour lines,
slope and visibility data. This data is often the
result of data calibration, interpretation,
processing and analysis. We call this data derived
attribute data (DAD). DAD is per se
redundant because it is just another form of the
information present in the source data. However,
DAD is usually very difficult and/or time consuming to
derive. In a GIS, it is therefore desirable to store DAD
in the database, because DAD is high-level information
and can be used to answer most GIS queries. In
addition, we prefer to integrate DAD into GIS
databases because the DBMS can then be used to
manage DAD.
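The three components described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and field names (DRAD, LargeSpatialObject, derive_histogram, count_pixels_with_value) are hypothetical, the image is assumed to be single-band, 8-bit and row-major, and the histogram stands in for any kind of DAD. The sketch is not the object-oriented model developed in this paper.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class DRAD:
    """Mandatory items needed to interpret the raw byte string."""
    name: str
    width: int
    height: int

@dataclass
class LargeSpatialObject:
    drad: DRAD                                # direct related attribute data
    raw: bytes                                # long unformatted data string
    dad: dict = field(default_factory=dict)   # optional derived attribute data

    def pixel(self, row, col):
        # Without the width stored in DRAD the raw string cannot be indexed.
        return self.raw[row * self.drad.width + col]

    def derive_histogram(self):
        # DAD is redundant: it is computed entirely from the raw data
        # and then stored so that it need not be derived again.
        self.dad["histogram"] = dict(Counter(self.raw))
        return self.dad["histogram"]

def count_pixels_with_value(obj, value):
    """Answer a query from stored DAD without reading the raw data."""
    return obj.dad["histogram"].get(value, 0)

# Hypothetical 4 x 2 image used only for demonstration.
img = LargeSpatialObject(DRAD("demo", 4, 2), bytes([0, 0, 1, 2, 2, 2, 3, 0]))
img.derive_histogram()
print(count_pixels_with_value(img, 2))
```

Once the histogram has been derived and stored, the query function never touches the raw byte string, which is the reason given above for keeping DAD in the database.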
Because DAD and DRAD are much simpler than the
raw data and can sometimes be modeled well
by the relational data model, many researchers use
the relational model to handle large objects. They store
DAD and DRAD in a relational DBMS, while manipulating
the long string data through a link between the
relational table and operating system files [Zhou,
1991]. This approach may work for simple
applications, but it has several serious drawbacks.
First, DAD and DRAD can be semantically rich and
complicated (for example, geometry and topology
data) because of the amount of information
embodied in a large object. Second, because the two
databases used are independent, it is very difficult
to maintain data integrity and perform transaction
management. Third, the relational data model
lacks the power to define the semantics inherent
in large objects and the methods needed to
process them. It is indeed not too much