with GPGPU and photogrammetric processing using GPU parallel processing.
General purpose parallel programming can use GPUs not only for graphics but also to relieve the CPU of the non-graphics computational workload it traditionally handles. Significant computational speedups have been achieved by researchers from many different disciplines using general purpose parallel programming. Although GPU-based non-graphics computation is best suited to data-parallel tasks such as image processing kernels and matrix operations, many other applications can also be accelerated by adapting existing algorithms to general purpose parallel programming (Yilmaz, 2010). It therefore seems reasonable to exploit the tremendous computing power of GPUs for orthorectification, since computational cost is an important concern.
In this study we used the GPGPU approach for the orthorectification procedure.
2. GPGPU AND STREAM PROCESSING
The main reason GPUs have come onto the agenda is that they are very powerful and, at the same time, cheap and widely available hardware. Until recently these chips were standard graphics equipment, but today they have evolved into powerful, programmable processors that can meet general-purpose needs. Especially in recent years, the fact that GPUs can be used for general purpose calculations has attracted the attention of researchers dealing with complex problems that require heavy computation. The biggest difficulty here is that GPUs require a different programming approach. For this reason, effective GPU programming requires rewriting existing algorithms in terms suited to the graphics hardware, taking its structure and limitations into account. Such multicore processors cannot be programmed with traditional programming methods, so the typical sequential, event-driven programming procedure is not suitable for them.
The programming model has therefore shifted to stream computing. In this model all input and output data are treated as streams, and kernel functions are defined that apply intensive calculations to each element of the stream. The GPU contains a large number of processors that work on these streams in parallel. For example, an Nvidia GTX 580 card has 512 stream processors (CUDA cores), which can be thought of as 512 small computers placed side by side. With these stream processors the graphics card can carry out many intensive operations at the same time.
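As a minimal illustration of this model (a sketch for this paper's context, not the implementation described later; the kernel name and values are hypothetical), the following CUDA code treats the input and output arrays as streams and applies the same operation to every element, letting the hardware schedule the threads onto the available stream processors:

#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each thread processes exactly one element of the input stream.
__global__ void scaleStream(const float *in, float *out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global element index
    if (i < n)                                      // guard against overrun
        out[i] = in[i] * factor;
}

int main()
{
    const int n = 1 << 20;                // about one million stream elements
    size_t bytes = n * sizeof(float);

    float *h_in = new float[n], *h_out = new float[n];
    for (int i = 0; i < n; ++i) h_in[i] = float(i);

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    // Launch enough threads so that every stream element has its own thread.
    int block = 256;
    int grid  = (n + block - 1) / block;
    scaleStream<<<grid, block>>>(d_in, d_out, 2.0f, n);
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    printf("out[1000] = %f\n", h_out[1000]);  // expected: 2000.0

    cudaFree(d_in); cudaFree(d_out);
    delete[] h_in; delete[] h_out;
    return 0;
}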
GPUs can perform far more calculations in parallel than CPUs, as shown in Figure 1. The term "FLOPS" describes processor speed; it stands for the number of floating-point operations performed per second. The chart clearly shows that in 2010 Nvidia graphics processors were about ten times faster than Intel processors.
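As a rough worked example (using the publicly listed specifications of the GTX 580 mentioned above, i.e. 512 CUDA cores, a shader clock of about 1.544 GHz and two floating-point operations per core per clock; these figures are not taken from this study), the theoretical single-precision peak is approximately

\[
512 \times 1.544\ \text{GHz} \times 2\ \frac{\text{FLOP}}{\text{core}\cdot\text{cycle}} \approx 1.58\ \text{TFLOP/s}.
\]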
The term memory bandwidth means the amount of data transferred per second between the GPU and the graphics card memory. The theoretical maximum memory bandwidth is typically computed by multiplying the width of the memory interface by the frequency at which it transfers data. Memory bandwidth is one of the factors that determine graphics card performance. In line with the increase in floating-point capacity over the years, memory bandwidth has also improved. Figure 2 shows that GPU bandwidth has reached a rate about six times higher than CPU bandwidth.
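For example (again using the published specifications of a GTX 580-class card, namely a 384-bit memory interface and an effective memory data rate of about 4.008 GT/s; an illustrative calculation, not a result of this study), the theoretical maximum bandwidth is roughly

\[
\frac{384\ \text{bit}}{8\ \text{bit/byte}} \times 4.008\ \text{GT/s} \approx 192\ \text{GB/s}.
\]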
[Chart: theoretical GFLOP/s from 2001 to 2010 for NVIDIA GPUs (single and double precision) and Intel CPUs (single and double precision).]
Figure 1. Development of floating-point operations per second
for the CPU and GPU (Nvidia, 2011).
[Chart: theoretical memory bandwidth in GB/s from 2003 to 2010 for GPUs and Intel CPUs.]
Figure 2. Development of memory bandwidth for the CPU and GPU (Nvidia, 2011a).
The reason behind the discrepancy in floating-point capability
between the CPU and the GPU is that the GPU is specialized
for compute-intensive, highly parallel computation — exactly
what graphics rendering is about — and therefore designed such
that more transistors are devoted to data processing rather than
data caching and flow control, as schematically illustrated by
Figure 3.
[Diagram: schematic CPU layout (control logic and a few ALUs) versus GPU layout (many ALUs with little control logic).]
Figure 3. The general structure of the CPU and GPU and the difference in the number of transistors they have (Nvidia, 2011b).
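To make this difference concrete, the sketch below (an illustrative example with hypothetical function names, not code from this study) contrasts a sequential CPU formulation of a simple per-pixel image operation with its data-parallel GPU formulation, where control logic is minimal and each thread simply processes one pixel:

// CPU version: one core walks over the whole image, relying on caches and
// branch prediction to keep a single instruction stream fast.
void brightenCPU(const unsigned char *in, unsigned char *out,
                 int width, int height, int offset)
{
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            int v = in[y * width + x] + offset;
            out[y * width + x] = v > 255 ? 255 : v;
        }
}

// GPU version: thousands of lightweight threads, one per pixel; almost all
// transistors do arithmetic, with only a bounds check as flow control.
__global__ void brightenGPU(const unsigned char *in, unsigned char *out,
                            int width, int height, int offset)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        int v = in[y * width + x] + offset;
        out[y * width + x] = v > 255 ? 255 : v;
    }
}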