Figure 1. Distribution Model
3. SETUP
3.1 Requirements
The installation should be done on a network domain that has a
valid domain controller to avoid authentication problems when
mapping network drives. The different machines involved in the Condor HPC solution are:
Pool Manager: acts as the central Condor pool manager that
matches the jobs with the available machines. This is the most
important machine and shouldn't run jobs in a production
environment since it needs to be running all the time.
Submitting Machines: these are the machines that could
submit jobs to the cluster. Another option is to use one machine
as the gateway to the cluster and submit jobs only from that
machine. The latter has the advantage of localizing the different run-time options used to optimize the distribution.
Computation Nodes: these are the workers that do the
computation. These nodes could be dedicated rack mounted
machines or any machine in the office that is connected to the
network. Workstations could also be configured to be both
submitting and computation nodes.
File Server(s): these are the machines that provide the disk storage. The shared data files need to be on a Windows Server family OS or on Samba from Unix machines, since Windows 2000/XP will only allow up to three clients to map its shared drives (a drive-mapping example follows this list).
License Manager(s): machines used to provide the LGGM
floating licenses to be used by the submitting and
computational nodes. The opportunistic nature of GPro may
sometimes result in more machines being available at one time
than there are licenses. The Condor proxies of GPro correctly
handle this by resubmitting the job until it gets a license and
runs to completion.
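As a rough illustration of the shared-storage requirement, each node can map the file server to a common drive letter before jobs are run; the drive letter, server and share names below are hypothetical:

    rem Map the shared GPro data store on every node (names are examples only)
    net use G: \\fileserver\gpro_data /persistent:yes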
Accordingly, before installing a GPro Condor pool a user has to select the pool manager machine, the machines that will submit GPro jobs (most of the time these are the operators' workstations) and the computation nodes; a sketch of the corresponding Condor role configuration is given below. It is possible to combine the Pool, File and License Manager roles on the same machine if it has the capability.
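A minimal sketch of how these roles map onto Condor's daemon configuration on each machine; the host name and exact settings are illustrative assumptions, the real values are written by the Condor installer:

    # condor_config.local, common to all machines: point at the pool manager
    # ("poolmgr.example.com" is a hypothetical host name)
    CONDOR_HOST = poolmgr.example.com

    # Pool Manager (central manager only, runs no jobs)
    DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR

    # Submitting machine (gateway workstation)
    DAEMON_LIST = MASTER, SCHEDD

    # Computation node (execute only)
    DAEMON_LIST = MASTER, STARTD

    # Workstation acting as both submitting and computation node
    DAEMON_LIST = MASTER, SCHEDD, STARTD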
and Spatial Information Sciences, Vol XXXV, Part B3. Istanbul 2004
3.2 Setup
From a user's point of view, the work required to set up the HPC environment is minimal: simply install Condor on the workstations and change the program names in the GPro preferences to their respective proxy equivalents. For example, the rectifier "ADSRectifier.exe" is changed to "ADSRectifyCondorProxy.exe". Depending on their needs, users can mix and match which processes to distribute and which to run locally. The following steps explain how to install an HPC cluster for GPro using Condor.
Condor Setup:
1. Download Condor from "http://www.cs.wisc.edu/condor/".
2. First install Condor on a server that is going to be the
Condor Pool Central Manager.
3. Install Condor on all the computation nodes.
* Dedicated computational nodes should be configured so that their 'Job Start Policy' is "Always Run Condor jobs". This defines a very responsive pool.
* All other workstations, for example those that operators work on, should be configured with the default policy of starting jobs after 15 minutes of low CPU activity (see the configuration sketch after this list).
4. Test the Condor installation using the examples distributed with Condor.
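As a rough sketch, and assuming the installer's 'Job Start Policy' choices map onto Condor's standard start expressions (the exact defaults may differ), the two policies correspond approximately to:

    # condor_config.local on a dedicated computation node:
    # "Always Run Condor jobs": start unconditionally, never suspend or preempt
    START   = TRUE
    SUSPEND = FALSE
    PREEMPT = FALSE

    # condor_config.local on an operator workstation (approximation of the
    # default "start after 15 minutes of low CPU activity" policy)
    START   = (KeyboardIdle > 15 * $(MINUTE)) && (LoadAvg < 0.3)
    SUSPEND = (KeyboardIdle < $(MINUTE))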
GPro Setup:
There is nothing special about the Condor pool used by GPro.
The only assumption made by the installation is that all the
nodes have access to shared file servers.
1. Select a shared file system to be mapped by all the computation nodes.
2. Install and configure the Leica licensing client on all workstations that run and submit jobs to the pool.
3. Install the distributed version of GPro that sets up the
required proxy executables and configuration files to
submit jobs.
4. Modify the default Condor job submission files to reflect the selected shared file system (a submit file sketch follows this list).
5. Change the program names to the proxies in the general preferences of GPro on all the submitting workstations.
6. Start submitting jobs to the pool.
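A minimal sketch of what such a submit description file might look like; the drive letter, directory layout and job names are illustrative assumptions, not the files shipped with GPro:

    # rectify.sub: hypothetical Condor submit description for a rectification job
    universe    = vanilla
    executable  = N:\GPro\bin\ADSRectifier.exe
    # read input and write output on the shared file system mapped by every node
    initialdir  = N:\GPro\projects\block_01
    log         = rectify_$(Cluster).log
    output      = rectify_$(Cluster).out
    error       = rectify_$(Cluster).err
    queue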
For easy upgrades, GPro should be installed only on the server and not on each of the computational nodes. A typical distributed run is shown in Figure 2 below.