International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B4, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
identifier:
    {ID,band}
  | {band}
unaryArithmeticExpr:
    (arithmeticUnaryOp arithmeticExpr)
binaryArithmeticExpr:
    arithmeticExpr arithmeticBinaryOp arithmeticExpr
functionArithmeticExpr:
    numericFunction (arithmeticExpr)
arithmeticExpr:
    unaryArithmeticExpr
  | binaryArithmeticExpr
  | functionArithmeticExpr
  | (arithmeticExpr)
  | constantNumber
  | castingExpr
  | identifier
castingExpr:
    rangeType(arithmeticExpr)
unaryBooleanExpr:
    booleanUnaryOp booleanExpr
binaryBooleanExpr:
    booleanExpr booleanBinaryOp booleanExpr
booleanExpr:
    unaryBooleanExpr
  | binaryBooleanExpr
  | (booleanExpr)
  | arithmeticExpr comparisonOp arithmeticExpr
The "identifier" in the expression refers to a raster layer of a
GeoRaster object. It is either a single band number if there is
only one GeoRaster object involved, or a pair of (ID, band)
where ID refers to one of the GeoRaster objects in the expression
and band refers to a specific band of that GeoRaster object.
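The identifier rule can be illustrated with a minimal Python sketch. It is not the GeoRaster implementation: the function names `resolve` and `evaluate`, the dictionary layout of `cell`, 0-based band numbers, and the choice of raster ID 0 for the bare {band} form are all assumptions made for illustration. Only a small subset of the grammar is covered (constants, identifiers, +, -, *, /, comparisons, and & and | between parenthesized comparisons).

```python
import ast
import operator
import re

# Matches {band} or {ID,band}, the two identifier forms in the grammar.
IDENT = re.compile(r"\{\s*(?:(\d+)\s*,\s*)?(\d+)\s*\}")

_BIN = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.BitAnd: operator.and_, ast.BitOr: operator.or_}
_CMP = {ast.Gt: operator.gt, ast.GtE: operator.ge, ast.Lt: operator.lt,
        ast.LtE: operator.le, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def resolve(expr, cell):
    """Replace every identifier with the cell value it refers to.

    `cell` maps a raster ID to the list of band values at one cell.
    Assumption: a bare {band} refers to raster ID 0.
    """
    def substitute(match):
        raster_id = int(match.group(1)) if match.group(1) is not None else 0
        return repr(cell[raster_id][int(match.group(2))])
    return IDENT.sub(substitute, expr)


def evaluate(expr, cell):
    """Evaluate an arithmetic or boolean expression for a single cell.

    Each comparison must be parenthesized, e.g. "({0}>10)&({0}<50)";
    anything outside the supported subset raises ValueError.
    """
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN:
            return _BIN[type(node.op)](walk(node.left), walk(node.right))
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in _CMP):
            return _CMP[type(node.ops[0])](walk(node.left),
                                           walk(node.comparators[0]))
        raise ValueError("unsupported construct in expression")
    return walk(ast.parse(resolve(expr, cell), mode="eval"))
```

With a single raster whose cell holds the band values 12, 120 and 210, `evaluate("({0}>10)&({0}<50)", {0: [12, 120, 210]})` yields `True`. With two rasters, `evaluate("({1,0}+{2,1})/2", {1: [10], 2: [0, 30]})` averages band 0 of raster 1 and band 1 of raster 2.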
We also developed four major procedures: arithmetic
operation, conditional query, classification, and cell value-based
update, as follows.
sdo_geor_ra.rastermathop: runs arithmeticExpr operations.
sdo_geor_ra.findcells: searches cells based on booleanExpr.
sdo_geor_ra.classify: applies arithmeticExpr to cells and then
segments the raster.
sdo_geor_ra.rasterupdate: updates cells of a raster based on
booleanExpr.
Each of these procedures takes many layers from one or many
GeoRaster objects, applies booleanExpr and/or arithmeticExpr
expressions over those layers, does the specific algebraic
computation or modeling, and outputs a new GeoRaster object.
The expressions can be defined in any way based on the syntax of
the expression language above.
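To make the roles of the four procedures concrete, the following Python sketch mimics them on a tiny in-memory raster. It is only an illustration under stated assumptions, not the GeoRaster code: a raster is modeled as a list of bands (each a list of rows), expressions are passed as Python callables rather than expression strings, and the function names, the `background` fill value, and the `breaks`/`labels` parameters of `classify` are hypothetical.

```python
from bisect import bisect_right


def _shape(raster):
    return len(raster[0]), len(raster[0][0])


def _cells(raster):
    """Yield (row, col, values), where values has one number per band."""
    rows, cols = _shape(raster)
    for r in range(rows):
        for c in range(cols):
            yield r, c, [band[r][c] for band in raster]


def _blank(n_bands, rows, cols, fill=0):
    return [[[fill] * cols for _ in range(rows)] for _ in range(n_bands)]


def raster_math_op(raster, arithmetic_exprs):
    """Arithmetic operation: one output band per arithmetic expression."""
    rows, cols = _shape(raster)
    out = _blank(len(arithmetic_exprs), rows, cols)
    for r, c, v in _cells(raster):
        for i, expr in enumerate(arithmetic_exprs):
            out[i][r][c] = expr(v)
    return out


def find_cells(raster, boolean_expr, background=0):
    """Conditional query: keep matching cells, fill the rest (assumed)."""
    rows, cols = _shape(raster)
    out = _blank(len(raster), rows, cols, background)
    for r, c, v in _cells(raster):
        if boolean_expr(v):
            for b, value in enumerate(v):
                out[b][r][c] = value
    return out


def classify(raster, arithmetic_expr, breaks, labels):
    """Classify: evaluate the expression, then bin by ascending breaks."""
    if len(labels) != len(breaks) + 1:
        raise ValueError("need exactly one more label than break values")
    rows, cols = _shape(raster)
    out = _blank(1, rows, cols)
    for r, c, v in _cells(raster):
        out[0][r][c] = labels[bisect_right(breaks, arithmetic_expr(v))]
    return out


def raster_update(raster, boolean_expr, new_values):
    """Cell value-based update: matching cells take new per-band values."""
    out = [[row[:] for row in band] for band in raster]
    for r, c, v in _cells(raster):
        if boolean_expr(v):
            for b, value in enumerate(new_values(v)):
                out[b][r][c] = value
    return out


raster = [[[1, 20], [30, 4]],   # band 0
          [[5, 6], [7, 8]]]     # band 1
summed = raster_math_op(raster, [lambda v: v[0] + v[1]])
bright = find_cells(raster, lambda v: v[0] > 10)
classes = classify(raster, lambda v: v[0], breaks=[10, 25], labels=[1, 2, 3])
```

Every function returns a new raster and leaves its input untouched, mirroring the statement that each procedure outputs a new GeoRaster object.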
4. PARALLEL PROCESSING
As we mentioned in the introduction, scalability and
performance of such systems are also keys to success. The
scalability of GeoRaster in the database has been mainly solved
by the design of the GeoRaster data model, the control of
memory usage in the GeoRaster engine, and the application of
Oracle GRID Computing technologies (Xie, 2006; Xie, 2008a;
Xie, 2008b). This scalability applies to this in-database map
algebra as well, so our focus here is mainly on the performance
of the processing engine.
Performance depends on, among other factors, the design and
implementation of the in-database processing strategy, the
processing algorithms, I/O speed, and flexible memory utilization. Given
that modern computers are mostly multicore or have multiple
CPUs, parallel processing becomes a very important solution
for speedup. Parallel processing divides a large task into many
smaller tasks, and executes the smaller tasks concurrently on
different CPUs or on several computing nodes. As a result, the
larger task completes more quickly. The major benefits of
parallelism are speedup (faster) and scaleup (more users) for
massive data processing operations. So it should be an essential
factor in our software implementation. Note that concurrency is
already part of the GeoRaster database, which can help improve
the speed of massive data processing too (Xie, 2006).
The Oracle database provides a powerful SQL parallel
execution engine that can run almost any SQL-based operation
(DDL, DML and queries) in parallel.
When you execute a SQL statement in the Oracle Database it is
decomposed into individual steps or row-sources, which are
identified as separate lines in an execution plan (Dijcks, 2010).
With this parallel execution framework, however, the
individual raster processing functions, such as mosaic and
raster algebra operations, cannot be directly parallelized
without some special implementation. This is because each of
the heavy image processing and raster manipulation operations
is not purely row-based and has its own logic in how the raster
data (or raster blocks) are internally processed.
There are several ways to leverage the Oracle parallel execution
engine, among which pipelined and parallel table functions are
an important one. Table functions can be used
and controlled by any user. The goal of a set of table functions
is to build a parallel processing pipeline leveraging the parallel
processing framework in the database (Oracle, 2008; Dijcks,
2010). We leverage table functions to encapsulate complex
logic in a PL/SQL construct so that we can process different
subsets of the data of a GeoRaster object in parallel. To
parallelize those operations we have to begin by explicitly
controlling the degree of parallelism and deciding which
subsets of the data are handled in each subprocess. We used
the output raster to split the whole region into subsets and the
total number of subsets is decided by the degree of parallelism
(DOP), which can be controlled by user input. Then the Oracle
parallel execution framework will split the whole task into
different subprocesses based on the total number of subsets and
each subprocess will process one of the subsets independently.
When all subsets are finished, the whole process is done.
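The partitioning idea can be sketched outside the database as follows. This is a simplified Python illustration, not the table-function implementation: splitting the output raster by contiguous row ranges, the names `split_rows`, `process_subset` and `parallel_find_cells`, and the `background` fill value are assumptions, and a thread pool merely stands in for the Oracle parallel execution framework (Python threads show the control flow but do not by themselves give CPU-bound speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(n_rows, dop):
    """Split rows [0, n_rows) into at most `dop` contiguous subsets."""
    dop = max(1, min(dop, n_rows))
    base, extra = divmod(n_rows, dop)
    bounds, start = [], 0
    for i in range(dop):
        # The first `extra` subsets take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def process_subset(band, bounds, condition, background=0):
    """Process one subset independently of all the others."""
    start, stop = bounds
    return [[v if condition(v) else background for v in row]
            for row in band[start:stop]]


def parallel_find_cells(band, condition, dop):
    """Run one worker per subset, then stitch results back in row order."""
    subsets = split_rows(len(band), dop)
    with ThreadPoolExecutor(max_workers=len(subsets)) as pool:
        parts = list(pool.map(
            lambda bounds: process_subset(band, bounds, condition), subsets))
    return [row for part in parts for row in part]


# A 10-row, 4-column single-band raster processed with a DOP of 4.
band = [[r * 10 + c for c in range(4)] for r in range(10)]
result = parallel_find_cells(band, lambda v: 10 < v < 50, dop=4)
```

Because the subsets are derived from the output raster and do not overlap, each worker writes to its own region and no coordination is needed until the final merge. With 10 rows and a DOP of 4, `split_rows` assigns 3, 3, 2 and 2 rows to the four subsets.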
As an example, the following conditional query finds all pixels
in a three-band image where the cell value of the first band is
greater than 10 and less than 50, the cell value of the second
band is greater than or equal to 100 and less than 150, and the
cell value of the third band is greater than 200 and less than
245. The result is a new image of all pixels meeting the query
condition. The parameter 'parallel=4' means the process will be
parallelized into 4 processes, each of which will process a
quarter of the original image simultaneously, thus the overall
performance will be improved significantly.
declare
  geor  SDO_GEORASTER; -- source image
  geor1 SDO_GEORASTER; -- result image
begin
  select georaster into geor from georaster_table where georid = 1;
  select georaster into geor1 from georaster_table where georid = 2 for update;
  sdo_geor_ra.findcells(
    geor,
    '({0}>10)&({0}<50)&({1}>=100)&({1}<150)&({2}>200)&({2}<245)',
    null, geor1, null, 'false',
    'parallel=4');
Ta