International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B4, 2012
XXII ISPRS Congress, 25 August — 01 September 2012, Melbourne, Australia
An in-database analytics approach is much faster, more
efficient, and more secure than traditional analytics approaches.
In-database analytics delivers immediate performance,
scalability and security improvements because data never
leaves the database until results are filtered and processed (Das,
2010).
In-database processing is performed and promoted as a feature
by many of the major database and data warehousing vendors,
including Oracle, IBM, Teradata, Netezza, Greenplum and
Aster Data Systems (Grimes, 2008; Berger, 2009). For
example, Oracle Data Mining and Oracle R Enterprise are in-
database data analysis engines. Coupled with the power of
SQL, they eliminate data movement and duplication, maintain
security, and minimize the latency from raw data to valuable
information.
In-database processing has been successfully used in many
high-throughput and mission-critical applications, including
fraud detection, pricing and margin analysis. The success of
this approach and its applications inspired us to consider the
same strategy for image and raster processing and analytics
inside Oracle Spatial GeoRaster.
As we mentioned in the introduction, geospatial imagery and
raster data are big data. A typical geoimage database holds tens or
hundreds of terabytes of data, and petabyte-scale holdings are not unusual.
Data has “weight” and geospatial image and raster data sets are
particularly “heavy”. Given that the processing and analysis are
data intensive, data locality should always be an important
factor in our design and implementation strategy. We therefore
conclude that building an in-database analytics engine is a
sound strategy. It moves the data processing closer to the
data instead of moving the data to the processing, which helps
achieve greater performance by overcoming the bottleneck of
computer networks.
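The network argument above can be made concrete with a back-of-the-envelope calculation. The following is a minimal sketch, not a measurement: the data volume (100 TB) and the link speeds (1 and 10 Gbit/s) are illustrative assumptions, the function name `transfer_days` is hypothetical, and the estimate ignores protocol overhead and disk throughput.

```python
def transfer_days(terabytes, gigabits_per_second):
    """Idealized time, in days, to move a data set over a network link."""
    bits = terabytes * 1e12 * 8                    # decimal terabytes to bits
    seconds = bits / (gigabits_per_second * 1e9)   # link speed in bits per second
    return seconds / 86400                         # seconds per day


# Assumed scenario: a 100 TB image archive moved to an external processing server.
print(round(transfer_days(100, 1), 2))    # 9.26 days on a 1 Gbit/s link
print(round(transfer_days(100, 10), 2))   # 0.93 days on a 10 Gbit/s link
```

Even under these idealized assumptions, moving the data dominates the cost of a single pass over the archive, which is the motivation for running the processing where the data already resides.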
3. THE MAP ALGEBRA LANGUAGE
Image and raster data processing and analysis involve a large
set of operations, including image geometric correction, image
enhancement and classification, map algebraic operations,
terrain analysis and geostatistics, to name a few. Since map
algebra is the basic and most commonly used technique in raster
data analysis and GIS modeling, we mainly discuss its
implementation in this paper.
Developed through the 1980s by Professor C. Dana Tomlin as
part of his PhD thesis work, Map Algebra is a high-level
language providing a framework for performing raster data
analysis and cartographic modeling. Map Algebra includes a set
of operators, such as arithmetic, boolean, logical, relational, and
combinatorial operations. It also includes a set of functions,
which are generally classified into four categories: local, focal,
zonal and global (Tomlin, 1990).
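To illustrate the four function categories, here is a minimal sketch in plain Python on small nested lists. It is not the GeoRaster implementation: the function names, the 3x3 focal window, and the clipping of the window at raster edges are all assumptions chosen for brevity.

```python
def local_add(a, b):
    """Local: each output cell depends only on the same cell of the inputs."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def focal_mean(grid):
    """Focal: each output cell is the mean of its 3x3 neighbourhood (clipped at edges)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_sum(values, zones):
    """Zonal: aggregate the cells of `values` that share a zone id in `zones`."""
    totals = {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            totals[zone] = totals.get(zone, 0) + value
    return totals


def global_max(grid):
    """Global: a single result computed from every cell of the raster."""
    return max(max(row) for row in grid)


elevation = [[1, 2], [3, 4]]
zones = [[1, 1], [2, 2]]
print(local_add(elevation, elevation))  # [[2, 4], [6, 8]]
print(focal_mean(elevation))            # [[2.5, 2.5], [2.5, 2.5]]
print(zonal_sum(elevation, zones))      # {1: 3, 2: 7}
print(global_max(elevation))            # 4
```

The categories differ only in which cells feed each output value: the same cell, a neighbourhood, a zone, or the whole raster.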
There are many implementations of Map Algebra. The concepts
and functionality remain the same across them, but the exact
syntax and workflow of the expressions and functions can differ
considerably. Generally, a
computing language should include declaration of variables and
constants, data processing operations (expressions) and
procedures (functions), statements and programs. We think the
same should be true for a good Map Algebra implementation.
PL/SQL, the Oracle procedural extension of SQL, is a portable,
high-performance transaction-processing language. PL/SQL
combines the data-manipulating power of SQL with the
processing power of procedural languages. You can control
program flow with statements like IF and LOOP. As with other
procedural programming languages, you can declare variables,
define procedures and functions, and trap runtime errors.
PL/SQL lets you break complex problems down into easily
understandable procedural code, and reuse this code across
multiple applications. When a problem can be solved through
plain SQL, you can issue SQL commands directly inside your
PL/SQL programs, without learning new APIs. PL/SQL's data
types correspond with SQL's column types, making it easy to
interchange PL/SQL variables with data inside a table (Oracle,
2012).
Oracle Spatial GeoRaster is built completely inside the
enterprise Oracle database server. PL/SQL is already available
to GeoRaster, and users mainly rely on it to manage, query and
manipulate GeoRaster objects, so we can further leverage the
power of the language. For our geospatial analysis purposes,
what PL/SQL lacks are specific map algebra expressions and
functions.
To fill this gap, we designed a new raster algebra expression
language covering general arithmetic, casting, logical and
relational operators, as shown below.
arithmeticBinaryOp:
      +
    | -
    | *
    | /
comparisonOp:
      =
    | <
    | >
    | >=
    | <=
    | !=
arithmeticUnaryOp:
      +
    | -
booleanBinaryOp:
      &
    | |
booleanUnaryOp:
      !
rangeType:
      castint
    | castonebit
    | casttwobit
    | castfourbit
    | casteightbit
numericFunction:
      abs
    | sqrt
    | exp
    | log
    | ln
    | sin
    | cos
    | tan
    | sinh
    | cosh
    | tanh
    | arcsin
    | arccos
    | arctan
    | ceil
    | floor
ID:
      integer number
constantNumber:
      double number
band:
      integer number
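As an illustration of how expressions built from the operators and functions listed above might be evaluated cell by cell, the following is a minimal Python sketch. It is not the GeoRaster engine. Several points are assumptions made for the example: cell values are referenced as `{ID,band}` with zero-based band numbers, operator precedence follows ordinary arithmetic, comparisons yield 1.0 or 0.0, and only a subset of the grammar is covered (no boolean or casting operators). The names `tokenize`, `Evaluator` and `raster_math` are hypothetical.

```python
import math
import operator
import re

# Subset of the numeric functions named in the grammar.
FUNCS = {"abs": abs, "sqrt": math.sqrt, "exp": math.exp, "ln": math.log,
         "sin": math.sin, "cos": math.cos, "tan": math.tan,
         "ceil": math.ceil, "floor": math.floor}
COMPARE = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
           ">=": operator.ge, "<=": operator.le, "!=": operator.ne}
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[a-z]+|>=|<=|!=|[-+*/(){},<>=])")


def tokenize(text):
    """Split an expression into tokens; reject characters no token covers."""
    tokens = TOKEN.findall(text)
    if "".join(tokens) != "".join(text.split()):
        raise ValueError("unrecognised characters in expression")
    return tokens


class Evaluator:
    """Recursive-descent evaluator for one cell; cell(rid, band) -> value."""

    def __init__(self, tokens, cell):
        self.tokens, self.pos, self.cell = tokens, 0, cell

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or expected not in (None, token):
            raise ValueError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def comparison(self):  # lowest precedence: comparisonOp
        left = self.add_sub()
        if self.peek() in COMPARE:
            compare = COMPARE[self.take()]
            return 1.0 if compare(left, self.add_sub()) else 0.0
        return left

    def add_sub(self):  # binary + and -
        value = self.mul_div()
        while self.peek() in ("+", "-"):
            sign = 1.0 if self.take() == "+" else -1.0
            value += sign * self.mul_div()
        return value

    def mul_div(self):  # binary * and /
        value = self.unary()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.unary()
            else:
                value /= self.unary()
        return value

    def unary(self):  # arithmeticUnaryOp
        if self.peek() in ("+", "-"):
            return (-1.0 if self.take() == "-" else 1.0) * self.unary()
        return self.atom()

    def atom(self):
        token = self.take()
        if token == "(":
            value = self.comparison()
        elif token in FUNCS:
            self.take("(")
            value = float(FUNCS[token](self.comparison()))
        elif token == "{":  # assumed cell reference syntax: {ID,band}
            rid = int(self.take())
            self.take(",")
            value = float(self.cell(rid, int(self.take())))
            self.take("}")
            return value
        else:
            return float(token)  # constantNumber
        self.take(")")
        return value


def raster_math(expr, rasters):
    """Evaluate expr once per cell; rasters maps ID -> list of 2-D bands."""
    tokens = tokenize(expr)
    first_band = next(iter(rasters.values()))[0]
    result = []
    for r in range(len(first_band)):
        row = []
        for c in range(len(first_band[0])):
            ev = Evaluator(tokens, lambda rid, band: rasters[rid][band][r][c])
            row.append(ev.comparison())
            if ev.peek() is not None:
                raise ValueError(f"unexpected token {ev.peek()!r}")
        result.append(row)
    return result


red, nir = [[2, 4]], [[6, 4]]  # toy 1x2 bands of raster 0
print(raster_math("({0,1}-{0,0})/({0,1}+{0,0})", {0: [red, nir]}))  # [[0.5, 0.0]]
```

The usage line computes a normalized difference of two bands, a typical local map algebra operation. A production engine would parse the expression once and stream raster blocks through it rather than re-walking the token list for every cell.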