International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B4. Istanbul 2004
acquisition, date of acquisition, etc., of corresponding
features
3.2.2 Deriving similarity measures: The more clearly and
reliably the similarity of the features can be assessed, the more
significant and useful the result of integrating corresponding
objects becomes. If good similarity measures are available, the
applications that use the results of an integration process,
namely the conflation, analysis and update of corresponding
instances, can also be optimized. In our application, we need
similarity measures in order to introduce thresholds. These
thresholds determine which degree of similarity between
instances is actually required if we want to deduce information
about correspondences between schemas.
Many attributes within a MultirepresentationalRelation object
can also be interpreted as indicators of the similarity of
related representations, e.g. the geometric distance, the number
of adjacent features or the number of corresponding attributes.
The task is now to determine how one global similarity
measure (GSM) can be calculated from all the individual
similarity measures (ISM). In a first basic approach, we use
a weighted sum:
GSM = \sum_{i=0}^{n} ISM_i \cdot weight_i
As proposed in (Walter and Fritsch 1999), a statistical
approach that exploits combinations of measures could be
applied as well.
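The weighted sum can be illustrated with a minimal sketch. It assumes that every individual measure has already been scaled to the range [0, 1] with higher values meaning "more similar" (a raw geometric distance would have to be converted first) and that the weights sum to 1. The function name, the example values and the threshold are hypothetical and are not taken from the implementation described in this paper.

```python
def global_similarity(isms, weights):
    """Weighted sum of individual similarity measures (ISM)."""
    if len(isms) != len(weights):
        raise ValueError("need exactly one weight per individual measure")
    return sum(ism * w for ism, w in zip(isms, weights))


# Hypothetical values: each ISM is assumed to be scaled to [0, 1],
# and the weights are assumed to sum to 1 so the GSM also lies in [0, 1].
isms = [0.5, 1.0, 0.25]       # e.g. distance-based, adjacency, attributes
weights = [0.5, 0.25, 0.25]
gsm = global_similarity(isms, weights)

# Illustrative threshold only; the paper does not state a value.
THRESHOLD = 0.5
use_for_schema_analysis = gsm >= THRESHOLD
```

Under these assumptions, a relation would only contribute to the schema analysis if its GSM reaches the chosen threshold.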
3.2.3 Difficulties in instance matching: When a matching of
corresponding instances is performed, we can have simple,
unambiguous cases of cardinality 1:1, 1:n or n:m. But the process
can also involve severe difficulties: cases can occur in which
features of different object classes or with different attributes or
attribute values take part in a 1:n or an n:m relation. Thus,
we have “pure” relations, but we can also have “impure”
relations (see figure 7).
[Figure 7: a 1:n match between one instance of Class a and three instances of Class n (“pure”), and an n:m match between instances of Class b and Class c and instances of Class o, Class p and Class p (“impure”).]
Figure 7. Impure and pure relations between instances.
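The distinction shown in figure 7 can be sketched in a few lines. The sketch assumes that each side of a relation is given as a list of object class names and treats a relation as pure when each side contains only one object class. Differences in attributes or attribute values, which the text also counts as sources of impurity, are not covered here. Both function names are hypothetical.

```python
def cardinality(left_classes, right_classes):
    """Return '1:1', '1:n' or 'n:m' for a relation between two instance groups."""
    n_left, n_right = len(left_classes), len(right_classes)
    if n_left == 0 or n_right == 0:
        raise ValueError("a relation needs at least one instance on each side")
    if n_left == 1 and n_right == 1:
        return "1:1"
    if n_left == 1 or n_right == 1:
        return "1:n"
    return "n:m"


def is_pure(left_classes, right_classes):
    """Assumption: a relation is pure if each side holds a single object class."""
    return len(set(left_classes)) == 1 and len(set(right_classes)) == 1


# The two cases of figure 7, encoded with their class names.
pure_case = (["a"], ["n", "n", "n"])
impure_case = (["b", "c"], ["o", "p", "p"])
```

Applied to the two cases of figure 7, the first is classified as a pure 1:n relation and the second as an impure n:m relation.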
Impure relations between corresponding representations can
limit the usefulness of our approach since they introduce
ambiguities. For this reason, they have to be dealt with
appropriately when we infer the correlation between object
classes or attributes. Pure relations must have more influence
than impure ones. Furthermore, measures to assess the degree of
impurity have to be found. This is part of our future work.
4. BUILDING AND ANALYZING RELATIONS
BETWEEN MULTIPLE REPRESENTATIONS
In the first phase of this research, a tool has been developed that
allows building up relations between multiple representations in
a semiautomatic way. Once the relations are created, they can be
used to automatically derive similarity measures for the schemas
of the source data sets. This second step is still work in
progress, and only some first results are available.
The software that has been implemented is integrated
into an open, Java-based software environment, which has been
developed by the Jump project (JUMP 2004). It consists of
three modules (called plug-ins in the Jump terminology):
the Relation Builder module is used to build up relations
between corresponding instances (see figure 9), the Relation
Viewer module displays these relations and the
Relation Analyzer module is used to interpret the relations.
4.1 Building relations
The first step of our approach consists of generating the
relations between multiple representations stemming from
heterogeneous sources. Ideally, this would be done
automatically. However, we are not focusing on the
automatic creation of relations; rather, we want to exploit the
relations in order to deduce information about schema
correspondences. Therefore, we have realized a semiautomatic
approach, where an operator selects corresponding instances in
the map view. Involving a human operator can cause
inconsistencies, since two operators might interpret a spatial
scene differently. Thus, a catalogue of instructions on how to
deal with certain situations had to be set up in order to achieve
at least similar and comparable results. For example, a rule has
to be provided for the matching of street network data that states
whether topologically separated objects can take part in a 1:n or an
n:m relation. In our case, this is possible (see figure 8).
[Figure 8: matched road segments of data set a and data set b; cardinality of the relation: 2:4.]
Figure 8. In our case, n:m relations can also be set up between
separated road segments.
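A rule of this kind can be sketched as a simple check. The sketch assumes that each road segment is represented as a pair of node identifiers and that "topologically separated" means the segments on one side of a relation do not form a single connected group. The function names and the `allow_separated` flag are hypothetical and only illustrate how such a rule from the catalogue of instructions might be encoded.

```python
def is_connected(segments):
    """Check whether road segments, given as (node, node) pairs, form one connected group."""
    if not segments:
        return True
    adjacency = {}
    for a, b in segments:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # Depth-first traversal starting from the first node.
    start = segments[0][0]
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return len(seen) == len(adjacency)


def relation_allowed(side_a, side_b, allow_separated=True):
    """Apply the matching rule to both sides of a candidate relation."""
    if allow_separated:
        return True
    return is_connected(side_a) and is_connected(side_b)


# A 2:4 relation similar to figure 8, with invented node identifiers:
# both sides consist of two separated parts.
side_a = [(1, 2), (5, 6)]
side_b = [(10, 11), (11, 12), (20, 21), (21, 22)]
```

With the rule used in our case (`allow_separated=True`), this 2:4 relation is accepted. A stricter catalogue that forbids separated segments would reject it.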
Up to now, relations have been set up for a test area in the inner
city of Stuttgart, covering an area of approximately one square
kilometre. It contains street data of Geographic Data Files
(GDF) and the Authoritative Topographic Cartographic
Information System (ATKIS). GDF is mainly used for car
navigation purposes, whereas ATKIS is a topographic database
that was set up with the intention to provide spatial data for
different kinds of applications. Figure 9 shows an extract of the
test scenario.