HANDLING LARGE TERRAIN DATA IN GIS
Wanning Peng, Dragan Petrovic, Clayton Crawford
ESRI
380 New York St., Redlands, CA 92373, USA
wpeng@esri.com, dpetrovic@esri.com, ccrawford@esri.com
KEY WORDS: Database, Large, DEM/DTM, Multiresolution, Generalization, LIDAR, GIS
ABSTRACT
This paper presents a research and development project that will provide an extension to 2D geo-databases for handling large terrain
data. It first discusses application requirements and system design, and then elaborates system architecture for optimal data
organization and updating, efficient multi-resolution queries, and dynamic DTM generation. It then addresses technical issues related
to data storage, seamless tiling, vertical indexing, and DTM generalization. Finally, it discusses the limitations and shortcomings of
the current approach, and identifies future research and development tasks.
1. INTRODUCTION
Many GIS projects, especially statewide and nationwide ones,
often need to store and manage large terrain data. Even small-
scale projects may have to deal with a large amount of terrain
data, due to newly available data acquisition techniques such as
LiDAR. Such data can be several terabytes in size, or may
contain billions of measurement points.
While most of today's enterprise geo-databases (such as SDE)
are capable of handling large 2D data, terrain data have brought
new requirements and challenges. These include 1) how to
integrate terrain data with 2D data, 2) what data structure to use,
and 3) how to support high performance multi-resolution spatial
queries and updates.
Given the fact that TIN and GRID are the most popular data
formats in digital terrain modeling, it is necessary to examine if
they are the best choices for storing terrain data. Because
different applications may require data of different spatial
resolutions depending on underlying conceptual models (Peng,
2000, 1997), multi-resolution queries are becoming an increasingly
important subject in GIS. Some applications may even
require a so-called “horizontal” multi-resolution query that
specifies different levels of vertical resolutions for different
parts of a study area (Kinder et al., 2000). Typical examples
include landscape planning and 3D flight simulation, where the
center of interest often requires higher resolution data, while the
rest of the area only requires data of coarser resolutions.
To address all these issues, and others, a new research and
development project has been implemented at ESRI to provide
an extension to current 2D geo-databases for handling large
terrain data. The rest of the paper elaborates the design concept
and system architecture, and addresses related technical issues.
Finally, it provides an outline for further research and
development.
2. DESIGN CONSIDERATIONS AND SYSTEM
ARCHITECTURE
The design can be boiled down to three aspects: 1) what to
store; 2) where to store it; and 3) how to store it.
2.1 What to Store
Typically, source terrain data include 1) measurement points
(e.g., spot height points such as LiDAR data), 2) contours, and
3) structure lines (or break lines) that capture the discontinuity
of terrain and other important geomorphologic and geographic
features. Because a collection of individual points, contours,
and break lines does not constitute a good (continuous) terrain
representation in a digital environment (Peng et al., 1996), they
are not usually directly used for surface visualization and
analysis in GIS. Instead, a typical GIS would build a digital
terrain model (DTM) using these data, and carry out analysis
based on the DTM. Because of this, people often store and
manipulate their terrain information directly as a DTM,
disregarding the source data.
A DTM may take the form of a GRID or TIN (triangulated
irregular network). Spatial resolution of a GRID DTM is
inherently constrained to cell size — the smaller the cell size, the
higher the resolution — apart from the quality of the original
data. However, once a GRID DTM has been generated, the source
data are lost and no improvement is possible. One can only
down-sample a GRID
DTM (i.e., go to a larger cell size and, thus, lower resolution).
Creating a new DTM of a smaller cell size out of an existing
GRID DTM will not increase its spatial resolution. A TIN
DTM, on the other hand, does not suffer from this constraint
due to its adaptive nature, although a small elevation tolerance
may be employed to reduce data quantity in constructing a TIN.
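
The GRID constraint described above can be illustrated with a minimal sketch. It is not the system's implementation: the function names downsample and resample_finer, the block-averaging rule, and the sample elevations are all assumptions made for illustration. The sketch aggregates a 4 x 4 GRID into a 2 x 2 GRID, then splits each coarse cell back into finer cells, and shows that the finer cell size does not bring back the lost detail.

```python
def downsample(grid, factor):
    """Aggregate factor x factor blocks of cells into one cell by averaging.

    Block averaging is an assumed aggregation rule; other rules
    (e.g., picking one sample per block) are equally possible.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def resample_finer(grid, factor):
    """Split each cell into factor x factor cells by repeating its value."""
    out = []
    for row in grid:
        fine_row = [z for z in row for _ in range(factor)]
        out.extend([list(fine_row) for _ in range(factor)])
    return out


# Hypothetical elevations (in metres) on a 4 x 4 GRID.
original = [
    [10.0, 12.0, 20.0, 22.0],
    [14.0, 16.0, 24.0, 26.0],
    [30.0, 32.0, 40.0, 42.0],
    [34.0, 36.0, 44.0, 46.0],
]

coarse = downsample(original, 2)      # [[13.0, 23.0], [33.0, 43.0]]
refined = resample_finer(coarse, 2)   # 4 x 4 again, but only 4 distinct values

distinct_original = len({z for row in original for z in row})
distinct_refined = len({z for row in refined for z in row})
print(distinct_original, distinct_refined)
```

The refined GRID has the same cell size as the original but carries only the four values of the coarse GRID, so its spatial resolution is still that of the coarse one. A smoother interpolation would change the values but could not restore the original measurements either.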
Many large data providers (USGS, for instance) choose GRID
for their terrain data, due to its simplicity and relatively small
storage size. TIN is typically used in places where engineering
precision is required. Because of its sophisticated structure and
heavy overhead in storage (in order to keep topology), TIN is
rarely used to provide and maintain a large amount of terrain
data.
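
A rough, back-of-the-envelope sketch of this storage difference follows. The byte sizes and the TIN layout are assumptions chosen for illustration, not the format of any particular product: the GRID stores one 4-byte elevation per cell, while the TIN stores three 8-byte coordinates per node plus, for each triangle, three node indices and three neighbor indices of 4 bytes each. The triangle count uses the standard relation for a planar triangulation, t = 2n - 2 - h, where n is the number of nodes and h is the number of nodes on the convex hull.

```python
def grid_bytes(n_cells, bytes_per_z=4):
    """Storage for a GRID: one elevation value per cell (assumed 4 bytes)."""
    return n_cells * bytes_per_z


def tin_bytes(n_nodes, n_hull, coord_bytes=8, index_bytes=4):
    """Storage for a TIN under an assumed node + triangle-topology layout.

    Each node stores x, y, z. Each triangle stores 3 node indices and
    3 neighbor-triangle indices so that topology can be traversed.
    """
    n_triangles = 2 * n_nodes - 2 - n_hull   # planar triangulation count
    node_storage = n_nodes * 3 * coord_bytes
    topology_storage = n_triangles * 6 * index_bytes
    return node_storage + topology_storage


# Hypothetical data set: one million points, 1,000 of them on the hull.
n_points = 1_000_000
grid_size = grid_bytes(n_points)
tin_size = tin_bytes(n_points, n_hull=1_000)
print(grid_size, tin_size, round(tin_size / grid_size, 1))
```

Under these assumptions the TIN needs roughly 18 times the storage of a GRID with the same number of points. In practice a TIN usually holds fewer nodes than an equivalent GRID holds cells because of its adaptive nature, so the figures indicate per-point overhead rather than a like-for-like comparison of two models of the same terrain.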
Obviously, GRID is preferred if format simplicity and storage
space are the concerns. However, TIN might be a better choice
if high precision is desirable, especially when terrain skeleton
information (such as break lines and local extreme points), and
other structure lines are important to preserve. A hybrid system
that uses both GRID and TIN may sound like a good solution, if
only it did not increase the complexity and difficulty in data
management and updating, as well as in determining when to
use GRID and when to use TIN.