and hierarchically defined. Transparent access to heterogeneous
hardware and operating systems is also guaranteed.
2. VIRTUAL ENVIRONMENTS
2.1 Emulation vs. Simulation
There are currently several projects aiming to provide users
and developers with virtual computing environments. Two
complementary, dual approaches exist, depending on the way
existing computing resources are used: one is to emulate
complex computing infrastructures on top of simpler ones; the
other is to simulate simple environments running on complex
infrastructures.
The first approach virtualizes complex environments running
on simpler infrastructures: XEN [Barham, 2003] and User-Mode
Linux (UML) [UML] are examples of such projects. VMware, a
"virtual infrastructure software", is a commercial product in
this class [VMware].
The XEN virtual machine monitor, for example, "uses
virtualization to present the illusion of (running) many
smaller virtual machines, each running a separate operating
system instance" (Figure 1). This is referred to as "emulated
virtualization" in [XEN White Paper] and dubbed "para-
virtualization" in [Barham, 2003]. To some extent, the Linux-
VServer project [Linvserv], whose private virtual servers focus
on isolation and security for private user spaces, follows a
similar approach. This is what we simply call here the
emulation approach.
Figure 1. The virtualization approach.
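
To give a flavour of this emulation approach, the following is a
minimal sketch of a classic Xen guest configuration; the xm
toolstack parses such files as Python. The file paths, sizes and
device names are illustrative only, not taken from any of the
cited projects:

kernel = "/boot/vmlinuz-2.6-xenU"   # para-virtualized guest kernel
name   = "guest1"                   # name of this virtual machine instance
memory = 256                        # RAM (in MB) allocated to the instance
disk   = ["file:/srv/xen/guest1.img,xvda1,w"]  # virtual block device
root   = "/dev/xvda1 ro"            # root file system as seen by the guest
vif    = [""]                       # one default virtual network interface

Each such file describes one isolated guest; the monitor
schedules all guests concurrently on the same physical host.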
The second approach tends to simplify the user's view of
complex environments: OpenSSI [Walker, 1999], Vgrid [Vgrid,
2003] and Kerrighed [Kerri] are such examples. They provide
single system images (SSI) to simulate a single computing
environment running on a set of underlying, interconnected
systems (Figure 2). This is what we call here the simulation
approach.
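
As an illustration, an ordinary multi-process program such as
the following Python sketch runs unmodified on an SSI cluster;
where each forked worker actually executes is decided by the SSI
layer (e.g., OpenSSI or Kerrighed) and remains invisible to the
code. The /shared path stands for a hypothetical cluster-wide
file system:

import os

def worker(task_id):
    # Each worker sees the same cluster-wide file system and
    # process space, regardless of the node it runs on.
    with open(f"/shared/results/{task_id}.out", "w") as f:
        f.write(f"task {task_id} ran as PID {os.getpid()}\n")

for task_id in range(4):
    if os.fork() == 0:      # child: possibly placed on another node
        worker(task_id)
        os._exit(0)

for _ in range(4):          # parent: wait for all children,
    os.wait()               # wherever the SSI scheduled them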
Emulation lends itself nicely to running secure, multiple
isolated instances of (possibly heterogeneous) systems
concurrently on the same underlying infrastructure. It provides
complex environments suited to the application needs, at the
price of possibly lower performance; in theory, however, any
complex system can be designed using this emulation approach.
Simulation, in contrast, does not provide functionality beyond
that of the underlying infrastructure. Its main goal is to mask
the complexity of the underlying environments: it is basically
made of multiple instances of (Linux) operating systems and
computing resources (files, servers, etc.) behind a single
interface. It thus simplifies access, login, execution
automation, load balancing, failure recovery (by component
substitution) and so forth for the end user. In this sense,
simulation provides the users with higher-level functionality
and simpler interfaces.
Figure 2. The Single System Image.
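
The failure recovery by component substitution mentioned above
can be sketched as follows; run_on, the node names and the
simulated failures are hypothetical stand-ins for the dispatch
machinery that an SSI would hide behind its single interface:

import random

SERVERS = ["node01", "node02", "node03"]   # interchangeable components

class NodeFailure(Exception):
    pass

def run_on(server, task):
    # Stand-in for dispatching `task` to a node; failures are
    # simulated at random for the purpose of the sketch.
    if random.random() < 0.3:
        raise NodeFailure(server)
    return f"{task} done on {server}"

def run(task):
    # The caller never learns which component actually did the work.
    for server in SERVERS:
        try:
            return run_on(server, task)
        except NodeFailure:
            continue        # substitute an equivalent component
    raise RuntimeError("all components failed")

print(run("tile-42"))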
Both approaches can be considered as virtualization approaches,
although they are very different in their goals and deployment,
because in both cases, the users are ultimately made unaware of
the underlying computing infrastructure.
A direct benefit of both approaches is that various tasks can be
automated (load balancing, task relocation) or made transparent
(remote access to files). Further, virtualization improves
interoperability (through dedicated application environments),
usability, security (task isolation, file protection) and
performance (dynamic allocation of processors to threads).
Another side-effect is that the underlying hardware and software
environments are hidden from the users. Consequently, various
(heterogeneous) computing resources can be used, and their
location is ultimately unknown. This clearly improves
extensibility and scalability, by masking the underlying
infrastructures, as well as adaptability (infrastructure changes
are made transparent to the applications).
Access to resources connected to a local high-speed network is
a de facto goal for simulation environments, which clearly aim
at the cluster-computing arena (using, for example, cluster-wide
file access, TCP/IP, single cluster-wide naming, etc.).
Access to local or wide-area grids can be seamlessly hooked to
the simulated environments, because dedicated computers in
charge of the communication with those networks can be
connected.
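
A minimal sketch of this gateway pattern follows; GRID_ENDPOINT
and submit_to_grid are hypothetical placeholders for the actual
grid middleware, not part of any cited system:

import queue
import threading

GRID_ENDPOINT = "grid.example.org"   # hypothetical grid entry point
jobs = queue.Queue()                 # cluster-local job queue

def submit_to_grid(job):
    # Placeholder for the real grid middleware submission call.
    print(f"forwarding {job!r} to {GRID_ENDPOINT}")

def gateway():
    # The dedicated gateway node: drains local jobs and pushes
    # them over the wide-area network; applications inside the
    # simulated environment never talk to the grid directly.
    while True:
        job = jobs.get()
        if job is None:
            break
        submit_to_grid(job)

t = threading.Thread(target=gateway)
t.start()
for j in ["task-1", "task-2"]:
    jobs.put(j)          # applications enqueue locally
jobs.put(None)           # sentinel: shut the gateway down
t.join()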
This is where computing grids step in. Note that they are not
strictly required in our approach; however, the fact that they
have long been advertised as providers of huge raw computing
power cannot be ignored. They provide here the power to emulate
the infrastructures required by the complex environments
supporting the applications being designed: spatial data
infrastructures.