The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B4. Beijing 2008
201
CREATE TYPE sdogeoraster AS OBJECT (
rasterType NUMBER,
spatialExtent SDOGEOMETRY,
rasterDataTable VARCHAR2(32),
raster ID NUMBER,
metadata XMLType);
The other data type is called SDO RASTER, which is used to
define the Raster Data Tables (RDT). The actual raster cell data
of a large GeoRaster object are blocked into small blocks and
each block is stored as one row in its RDT. The relationship
between a GeoRaster object and its RDT and the raster cell data
inside the RDT table are maintained and managed internally
and automatically by enhancing the database server. Users only
need to understand and deal with the SDO GEORASTER
object type (Figure 1).
Figure 1. GeoRaster objects in an Oracle database (Oracle, 2004;
Xie, 2008)
Using standard SQL language, a user can define any table and
create one or more columns inside that table using the
SDO GEORASTER type. In Oracle database, a GeoRaster data
table is any user-defined table, which has at least one data
column of type SDO GEORASTER. From a user perspective, a
GeoRaster database is thus a list of GeoRaster tables, in which
each image or raster grid is stored as a GeoRaster object in one
row as shown in Figure 1. Users can build appropriate indexes
on various columns of the GeoRaster tables e.g., a spatial R-tree
index on the raster extent and B-tree indexes on other columns
so that queries and other operations on the tables can be
supported efficiently.
To build a GeoRaster database users simply create one or a list
of GeoRaster tables using standard SQL statement. One
example follows:
CREATE TABLE my table
(id NUMBER PRIMARY KEY,
name VARCHAR2(50),
my image SDO GEORASTER);
Users are required to create RDT tables. The reason is purely to
give users full control of the storage of the raster cell data so
that appropriate tuning and partitioning can be applied to
improve scalability and performance by leveraging the power of
Oracle database server.
As described earlier, the first challenge is to make sure the
image database management systems truly scalable. GeoRaster
is completely built inside Oracle database server and the
GeoRaster type is a native Oracle data type. Any table could
contain one or more columns of the GeoRaster type. There is no
limit on how many rows of an Oracle table could hold. A
GeoRaster table is just a regular oracle table and so it could
have unlimited number of rows. In each row a GeoRaster
column can store one image. So, there is virtually no limit on
the number of images you can store inside Oracle databases and
the total size of such image database could be in petabytes.
Another key question is how big a single GeoRaster object (a
single image inside the database) could be. For this we
specifically did some tests and described them in (Xie, Li and
Xu, 2006). We loaded the 50 DOQ images into the database and
stored each of them as one GeoRaster object in the Oracle
database. Then we mosaicked them into a single GeoRaster
object with a size of 9.6GB. We enlarged the mosaicked image
by using the GeoRaster procedure scaleCopy with
“scaleFactor=l 1” along both the row and column dimensions.
The size of the resulting image is 1.1616TB. Finally we
generated the pyramids for the result image using the GeoRaster
procedure generatePyramid. So we successfully ended up with a
huge GeoRaster object of about 1.5 terabytes in size. All
GeoRaster functions (the SQL API) passed tests on this huge
image.
This test shows that with a proper database configuration, this
approach allows creating, storing, and processing large raster
datasets. Single GeoRaster objects can be in the scale of
terabytes while the whole database can contain virtually
unlimited number of images with a total size in the scale of
petabytes.
4. THE SERVER-SIDE PROCESSING ENGINE AND
THE PERFORMANCE
A robust data manipulation engine is another essential part of
this approach and of any large-scale image database
management system. This engine should cover data loading,
exporting, insertions, updating, queries, deletions, analysis and
processing. Besides leveraging the standard enterprise database
features, GeoRaster provides over 100 raster data and metadata
operations through the SQL API to optimally manage and
manipulate GeoRaster objects in support of various application
requirements (Oracle, 2004; Xie, Sharma and Ihm, 2007; Xie,
2008). The data processing includes internal blocking and
interleaving format change, pyramiding, compressing,
mosaicking, enlarging and shrinking, subsetting, band copying
and merging, partial raster updates, and generating statistics.
The success of such a data processing engine relies on its
security, scalability, and performance. The database-centric
approach emphasizes a server-side image processing and raster
operation engine that means all the processing and
manipulations are implemented inside the Oracle database
server and use the protected database memory system. By doing
that the data no longer needs to be retrieved and loaded into a
middleware or client through an insecure network and
processed in an unmanaged computer memory. In other words,
better performance and true security can be achieved. The
-1.5TB GeoRaster object was generated through the engine, by
calling functions such as the loader, mosaic, enlarging and
pyramiding. All existing functions can run on this big image
and so demonstrated great scalability of this processing engine.
It’s achieved by adding to the database server a robust and
scalable caching and memory management system specifically
for GeoRaster object manipulation, no matter how big the
physical image is.