Full text: Proceedings, XXth congress (Part 4)

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B4. Istanbul 2004 
  
The keywords already give a first indication of the important 
information in a data set. However, they say nothing about the 
distribution and location of the geometrical elements, their 
connections to each other, their accumulation in particular places 
and so on. These characteristics complete the information of a data 
set and allow humans to interpret the data. The next step therefore 
aims to extract implicit information from data sets and to make it 
visible on the internet, especially for search engines. 
4. EXTRACTION OF IMPLICIT KNOWLEDGE WITH 
DATA MINING 
As mentioned in the previous chapter, keywords are a first 
approach to obtaining some semantic information. However, they 
have a big drawback: they can still be interpreted only by human 
beings. Expressions like "Autobahn", "Aérogare" or "Hospital" 
remain meaningless to the computer. We would need a translation 
in two respects: first a language translation, but moreover a 
semantic translation. Catalogues which describe the meaning of a 
word and determine its sense depending on the context are called 
ontologies. 
To enrich the ontology, our aim is to teach the computer to learn 
spatial concepts and to combine knowledge into higher-level 
concepts automatically. These concepts are hidden in the spatial 
data; they are found less on the level of pure geometry than in 
the combination and interaction of the spatial elements. Spatial 
data mining is the approach used to extract this implicit 
information. 
Needless to say, even after finding those implicit spatial 
structures the computer still does not know the meaning of 
"Autobahn". However, it has learnt the concept that an "Autobahn" 
is a major road (which has concepts of its own as well), has 
fewer junction points and is rarely situated inside settlement 
areas, but rather in peripheral areas. 
Next we introduce those implicit structures and concepts that 
could be useful for a search engine. Afterwards we describe 
procedures and algorithms to discover inherent information with 
data mining and document first approaches and results. 
4.1 Implicit Data 
As Aristotle put it, the whole is more than the sum of its 
parts: the content of a spatial data set is more than just the 
pure geometry. The cognitive structures of human beings fit the 
world because they were formed by adaptation to the world. So 
far, computers do not have this semantic knowledge of the 
world. The challenge is to reproduce such an adaptation process 
through automatic learning. 
Considering typical queries to a search engine and user 
scenarios with a spatial background, a lot of helpful 
information is stored in data sets. For example, if a user would 
like to search for a hotel in the centre of the city, the search 
engine at least has to know where the city centre is located. 
This knowledge can be discovered in vector data, but it is 
usually not explicitly stored in an item. 
Figure 3 shows topographic elements of a small village, such as 
roads and houses. However, this is already an interpretation 
by humans. Strictly speaking, one can only spot some lines and 
polygons which are differently coloured. That is the prior 
information the computer is able to get out of the data. 
  
Figure 3. Where is the city centre located? 
We do indeed recognise streets and houses, and we are able to 
infer further facts. Humans can locate the church by the 
special shape of its building. The interaction of the streets and 
houses and their concentration suggest at least that this is a 
village. We can also identify larger buildings in the upper part 
and distinguish them from smaller ones in the south. 
A computer can calculate these facts too. The big challenge is 
the reasoning process that follows. Humans interpret the larger 
buildings as the inner part of the village, because they know 
about old farmyards and the typical layout of a village (in 
Germany). The smaller buildings represent a colony of 
one-family houses. We are also able to locate the main street 
leading through the village because of the structure of the 
settlement. Humans can therefore locate the city centre 
approximately without difficulty. 
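
The part a computer can calculate directly, namely building sizes 
and a rough location of where the large buildings concentrate, 
might be sketched as follows. This is only an illustration under 
stated assumptions: the names polygon_area_and_centroid and 
estimate_centre are hypothetical, buildings are assumed to be 
simple polygons given as vertex lists, and the area-weighted mean 
is a crude stand-in heuristic, not the reasoning process described 
above. 

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order, first vertex not repeated


def polygon_area_and_centroid(poly: Polygon) -> Tuple[float, Point]:
    """Return (area, centroid) of a simple polygon (shoelace formula)."""
    a2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if a2 == 0.0:
        raise ValueError("degenerate polygon")
    # Centroid = sum / (6 * signed area) = sum / (3 * a2)
    return abs(a2) / 2.0, (cx / (3.0 * a2), cy / (3.0 * a2))


def estimate_centre(buildings: List[Polygon]) -> Point:
    """Area-weighted mean of building centroids (assumed heuristic)."""
    total = sx = sy = 0.0
    for poly in buildings:
        area, (x, y) = polygon_area_and_centroid(poly)
        total += area
        sx += area * x
        sy += area * y
    return sx / total, sy / total
```

Because large footprints receive more weight, the estimate is 
pulled towards the larger buildings. It does not encode any 
knowledge about farmyards or village layout. 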
There are plenty of examples and ideas which would be useful 
in SPIRIT. For a start, we would like to concentrate on the 
concepts listed below: 
- classification of more or less important cities 
- sphere of influence of cities 
- detection of the centre of a city 
- determination of tourist areas and attractive 
destinations 
- possibilities of suburban or industrial settlement, 
urban development, quality of housing 
The information available in the data set which we intend to 
exploit for those concepts, together with the operations 
necessary to extract and combine it, is described in Heinzle 
et al. (2002). 
Some characteristics of the elements can be determined with 
simple GIS functionality, such as calculating an area or size or 
counting the occurrences of particular objects. The evaluation of 
other properties, like density, distribution or neighbourhood, is 
more complicated. The analysis of distances is an essential part 
of gaining knowledge about these aspects. However, threshold 
values or absolute numbers are of little help, because whether an 
attribute or characteristic is really specific and outstanding 
depends on the context. Most of the time, the values of interest 
are those which stand out from the rest of the data with respect 
to particular properties. Clustering algorithms can be used to 
identify groups of elements and their neighbourhoods. Among 
clustering algorithms, those that do not need threshold values 
are preferable (Anders, 2003). 
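
To illustrate why a limit derived from the data is more useful 
than an absolute distance, the following minimal sketch groups 
points whose mutual distance is small relative to the typical 
nearest-neighbour distance in the same data set. It is not the 
algorithm of Anders (2003) and it is not parameter-free: the 
function name cluster_points and the dimensionless parameter 
factor are assumptions made for this example. 

```python
import math
from statistics import median
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def cluster_points(points: List[Point], factor: float = 2.0) -> List[int]:
    """Label points; two points are linked if their distance is at most
    factor * (median nearest-neighbour distance of the data set)."""
    n = len(points)
    if n < 2:
        return [0] * n
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Nearest-neighbour distance of every point (ignoring itself)
    nn = [min(d for j, d in enumerate(row) if j != i)
          for i, row in enumerate(dist)]
    limit = factor * median(nn)  # relative limit, derived from the data

    parent = list(range(n))  # union-find forest

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= limit:
                parent[find(i)] = find(j)

    # Renumber the groups 0, 1, 2, ... in order of first appearance
    roots: Dict[int, int] = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

Since the limit scales with the data, the same grouping results 
whether the coordinates are given in metres or in kilometres, 
which an absolute distance threshold would not guarantee. 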
Moreover, combining the properties and their calculated values 
raises a problem. Logic operations have to be extended by 
weighting and quantifiers which depend on the importance, 
relevance and quality of the attribute values and on the 
significance of the elements. 
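
One possible reading of such an extension is sketched below: each 
property contributes a degree between 0 and 1 instead of a strict 
true or false, the degrees are combined by a weighted mean, and a 
soft quantifier reports the share of elements that satisfy a 
condition. The function names, property names and weights are all 
hypothetical and chosen only for illustration; the source does not 
specify a particular scheme. 

```python
from typing import Callable, Dict, Iterable


def weighted_score(degrees: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Weighted mean of property degrees, each expected in [0, 1]."""
    total = sum(weights[name] for name in degrees)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(degrees[name] * weights[name] for name in degrees) / total


def fraction_satisfying(values: Iterable[float],
                        predicate: Callable[[float], bool]) -> float:
    """Soft quantifier: share of elements for which predicate holds."""
    items = list(values)
    if not items:
        return 0.0
    return sum(1 for v in items if predicate(v)) / len(items)


# Hypothetical evidence for "this area is a village centre"
degrees = {"building_density": 0.8, "has_church": 1.0, "junctions": 0.5}
weights = {"building_density": 2.0, "has_church": 1.0, "junctions": 1.0}
score = weighted_score(degrees, weights)

# Share of building footprints (assumed to be in square metres) above 100
share_large = fraction_satisfying([120.0, 300.0, 80.0, 500.0],
                                  lambda a: a > 100.0)
```

A weighted mean is only one choice. Other aggregation operators 
would express different notions of importance and relevance. 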
  
 
	        