proposed data-aware scheduling algorithm is much more
efficient than the traditional FIFO method when a neighbor
requirement is present in the user’s processing algorithm.
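The effect of a neighbor requirement on scheduling can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the scheduling algorithm of this paper: tiles form a 1-D strip, each task reads its own tile plus the adjacent tiles, a worker keeps every block it has loaded, and each worker has a fixed task quota. The names `neighbours`, `simulate`, `quota` and `held` are hypothetical and chosen for illustration only.

```python
def neighbours(tile, n_tiles):
    """Blocks a task reads: its own tile plus adjacent tiles in a 1-D strip."""
    return {b for b in (tile - 1, tile, tile + 1) if 0 <= b < n_tiles}


def simulate(n_tiles, n_workers, data_aware):
    """Assign one task per tile and return the total number of block loads."""
    quota = -(-n_tiles // n_workers)            # ceiling: max tasks per worker
    held = [set() for _ in range(n_workers)]    # blocks already on each worker
    count = [0] * n_workers                     # tasks assigned so far
    loads = 0
    for task in range(n_tiles):                 # tasks arrive in tile order
        need = neighbours(task, n_tiles)
        if data_aware:
            # Among workers with spare quota, prefer the one that already
            # holds most of this task's inputs (ties: lowest worker index).
            free = [w for w in range(n_workers) if count[w] < quota]
            worker = max(free, key=lambda w: len(need & held[w]))
        else:
            worker = task % n_workers           # FIFO: next worker in turn
        loads += len(need - held[worker])       # only missing blocks are moved
        held[worker] |= need
        count[worker] += 1
    return loads


fifo_loads = simulate(8, 2, data_aware=False)
aware_loads = simulate(8, 2, data_aware=True)
```

With 8 tiles and 2 workers, this toy model loads 16 blocks under FIFO and 10 under the data-aware rule, because neighboring tasks placed on the same worker reuse the blocks they share. The numbers describe the toy model only and are not measurements from the experiments reported here.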
6. CONCLUSION
Parallel computing is increasingly used to solve data-
intensive problems in geospatial science. Motivated by these
problems, this paper proposed a universal parallel framework
for processing massive LiDAR point clouds in an HPC
environment. The framework provides users with a predefined
Split-and-Merge programming paradigm. Under this paradigm,
programmers express their specific algorithm as two distinct
programs, Split and Merge, and leave parallelization and
scheduling to the runtime system, which automatically handles
the key scheduling decisions for tasks and data. To account for
data shared between task inputs, a data-aware scheduling
algorithm is proposed that reduces data communication time.
One common LiDAR algorithm, Delaunay triangulation (DT), was
evaluated to demonstrate the efficiency and suitability of the
proposed framework.
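The division of labor in the Split-and-Merge paradigm can be sketched as follows. This is an illustrative sketch only, assuming that Split runs independently on each spatial tile and that Merge combines the per-tile results; `run_split_merge`, `tile_size` and the elevation-range example are hypothetical and are not the framework's actual interface. A thread pool stands in for the HPC runtime.

```python
from concurrent.futures import ThreadPoolExecutor


def run_split_merge(points, tile_size, split, merge, workers=4):
    """Hypothetical runtime: tile the input, run `split` per tile, then `merge`."""
    tiles = {}
    for x, y, z in points:
        # Group points into square tiles by their integer grid coordinates.
        key = (int(x // tile_size), int(y // tile_size))
        tiles.setdefault(key, []).append((x, y, z))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(split, tiles.values()))
    return merge(partials)


# User-supplied programs: per-tile elevation range, then the global range.
def split(tile_points):
    zs = [z for _, _, z in tile_points]
    return min(zs), max(zs)


def merge(partials):
    return min(lo for lo, _ in partials), max(hi for _, hi in partials)


points = [(0.5, 0.5, 10.0), (1.5, 0.2, 12.5), (0.1, 1.9, 9.0), (1.8, 1.7, 15.0)]
result = run_split_merge(points, tile_size=1.0, split=split, merge=merge)
```

Here the user writes only `split` and `merge`; tiling and parallel execution stay inside the runtime function, which mirrors the separation of concerns described above.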
ACKNOWLEDGEMENTS
This work is supported by the Natural Science Foundation of
China (Grants 40971211 and 40721001).