463
A database approach to very large LiDAR data management
LIU Hua * a , HUANG Zhengdong a , ZHAN Qingming 3 , LIN Peng a
a School of Urban Design / Research Center for Digital City, Wuhan University
Donghu Nanlu 8, Wuhan, China 430072
KEYWORDS: LiDAR data, Octree, Quadtree, Local KD tree, Display precision
ABSTRACT:
This paper presents an approach to realizing LiDAR data management in DBMS. We use octree to partition data space, and build a
local KD tree at each octree’s node. In data organization, we take precise control on the size of the KD tree’s nodes. To effectively
visualize point cloud data, the display precision is defined. The basic concept is to judge whether a node is displayed or not, by
computing the size of a node’s data range after projection. We also discusses the method of screen-buffer and makes KD traversing
from front to back, which can reduce the number of points for displaying and accelerate display speed. This method is particularly
suitable for very dense data or data far away from the viewpoint.
1. INTRODUCTION
Light Detection and Ranging (LiDAR) is a data acquisition
technique based on laser technology, which has been used
widely in recent years. Advantages of using LiDAR include the
following: LiDAR allows rapid generation of a large-scale
DTM (digital terrain model); LiDAR is daylight independent, is
relatively weather independent, and is extremely precise. In
addition, because LiDAR operates at much shorter wavelengths,
it has higher accuracy and resolution than microwave radar [1] .
Depending on specific conditions, the level precision of LiDAR
system ranges less than lm, and height precision ranges
between 15cm and 20cm [2] * * . After being processed, LiDAR data
may generate highly accurate digital Elevation model (DEM),
contour map and orthophoto map.
Because of the huge amount of data and also the limitation of
computer hardware, there has been no effective approach to
organize and manage the LiDAR data. The visualization of the
data has also not been satisfactorily resolved. There are some
algorithms for the display of large volumes of data, which
usually are based on the LOD of triangular net^. Common
method to solve the problem is writing the data into a compact
file, by means of which to speed up frustum clipping and
processing.
However, for discrete LiDAR points, traditional methods of
LOD and hidden surface removal [41 are no longer applicable.
Furthermore, it only considers view frustum clipping, but pays
no attention to the relationship between the data points, so LOD
and hidden surface removal are not appropriate. As a result, it
surely can’t gain a satisfactory result. In this paper, we use
database to organize data and establish the relationship between
them, and build local KD tree for data index. The algorithms of
display precision control are presented.
2. KD TREE AND IMPROVEMENT
In order to process and manage large volumes of LiDAR data,
an efficient data structure is very important. The KD tree may
provide suitable solution.
KD tree is a binary tree in a K-dimensional space. In the
traditional binary search tree, the data classification standard is
a key word which is usually a number which have a certain
attribute, such as the coordinate on the X axis. For the K-
dimensional data, only one key word is not enough to
effectively partition the multidimensional data. KD tree makes
the key word alterable, which defines key word based on each
node and the coordinate on each axis will play the role of key
word in tum [5] . A usually mode is to make the (N%K+l)th-
dimensional coordinate value as key word if the node is in the
N-level, that is:
node.split = node.lefelVoK +1
There are two ways to build a KD tree: one is direct insertion,
but as KD tree hasn’t the ability to balance, the form of tree
totally depends on the order of input; the other way is to
calculate the form of KD tree in good balance and get the order
to establish the balance tree. In this way, the tree would get
balanced at the cost of pre-calculation.
Compared with the quadtree and octree, the KD tree has several
advantages, such as balance, constructing based on data
partition, and no empty nodes. The disadvantage is that, for
some uniform distribution of data, the depth of the tree will be
deeper. That’s because each node of KD tree only has two sub
trees, while quadtree and octree have four and eight
respectively.
In traditional KD tree, each node has only one data point, which
means node must be split into two sub-nodes when it has more
than one point. This method is very wasteful both for query and
representation. So, a better standard which determines whether
the node should be split or not is necessary.
There are two basic methods to judge whether the node will
split or not: the method based on the number of points, and
based on the space.
The method based on the number of points is given a fixed
number N, which is defined as the largest number of points in
each node. If the number of point in a particular node is less
than or equal to N, the node needn’t to be split, otherwise