
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol XXXV, Part B4. Istanbul 2004 
  
These models are either given by hand or can also be acquired 
using machine learning approaches (Sester, 2000). The 
interpretation of vector data sets is a fairly new application. It 
has mainly been investigated in the context of spatial data 
mining (Koperski & Han, 1995). 
3. METADATA DESCRIPTIONS OF SPATIAL DATA SETS
3.1 Metadata in SPIRIT 
Metadata can store information about spatial data sets. 
Metadata are structured data to describe resources and to enable 
users or agents to select and assess the data. However, there are 
two major problems: 
First, the expressiveness of metadata depends heavily on the 
scheme used. Many existing schemes define the content more or 
less strictly. The ISO 19115 standard (ISO/TC-211, 2003) is 
designed especially for geographical data sets. The metadata 
used in SPIRIT conform closely to this international standard. 
However, we identified a set of metatags that are of essential 
importance for SPIRIT. 
Secondly, enriching resources with metadata is still a process 
that has to be done manually for the most part. Although some 
tools support data entry through interfaces and predefined 
lists of terms, the cost in manpower and time remains an almost 
insurmountable obstacle. As a result, only few web sites and 
information resources are enriched with metadata. For this 
reason, tools that generate metadata automatically would be 
preferable. We illustrate this aim using the example of ArcView 
projects and shape files. 
3.2 Automatic Extraction of Metadata 
For SPIRIT, we considered the following metatags to be of high 
importance: name, spatial extent, keywords, contact and 
resolution. In this section we illustrate the automatic 
extraction of metadata from ArcView shape files. Of special 
relevance here is the discovery of keywords describing the 
stored spatial elements. 
The following information can be extracted easily from the ESRI 
shape format: 
- minimum bounding box 
- number of geometrical elements 
- type of geometrical elements, such as point, line, polygon 
- information about the attributes and their structure, 
such as name and type 
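
As a minimal sketch of how the first three items might be read, the following Python fragment parses the fixed 100-byte header that the published ESRI shapefile format places at the start of every .shp and .shx file. The function names and the returned dictionary keys are our own choices for illustration and are not part of SPIRIT; the element count is derived from the length of the .shx index file, which holds one 8-byte entry per record.

```python
import struct

# Shape type codes for the basic 2D geometries of the shapefile format.
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def read_shp_header(header: bytes) -> dict:
    """Parse the 100-byte main header of a .shp or .shx file."""
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # Bytes 0-3: file code, big-endian, always 9994.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code")
    # Bytes 24-27: file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", header[24:28])
    # Bytes 28-35: version and shape type, little-endian.
    version, shape_type = struct.unpack("<2i", header[28:36])
    # Bytes 36-67: bounding box as four little-endian doubles.
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "file_bytes": length_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, f"code {shape_type}"),
        "bbox": (xmin, ymin, xmax, ymax),
    }


def count_records_from_shx(shx_file_bytes: int) -> int:
    """Number of geometries, from the byte length of the .shx index file."""
    return (shx_file_bytes - 100) // 8
```

In use, one would pass the first 100 bytes of the .shp file to `read_shp_header` and the size of the accompanying .shx file to `count_records_from_shx`. Attribute names and types live in the separate dbf file and are not covered by this sketch.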
This information is important for interpreting the geometrical 
aspect of a data set. It does not, however, tell us much about 
the semantics of the data. Particularly if the names of the 
predicates are coded by numbers or abbreviated, as in the 
example given in Table 1, the primary information of the shape 
files is insufficient. 
SHAPE     AOBJID   TEIL  OBJART  OART  ATYP 
PolyLine  N01CZ70  001   3102    3102 
PolyLine  NOICZIS  002   3105    3105  1301 
PolyLine  N20LHCN  001   3106    3106 
Table 1. ATKIS record, excerpt of the corresponding dbf file 
From this, it is not apparent that the data represent the road 
network displayed in Figure 1. 
At the very least, it is necessary to know which data are coded 
in the set in order to provide an internet user with the right 
information. So far we only know the type of elements, for 
example that there are lines, but we do not know whether the 
lines are streets, pipelines, administrative boundaries or 
contour lines. To detect this information, we analyse the shape 
files and, if a legend is available, extract further information 
from the ArcView project file to automatically derive adequate 
keywords. The following example documents the process. 
Figure 1. Road network data set 
Figure 2 shows the automatically extracted metadata. 
<Metadata> 
<Name> Straßenverkehr (104) - Objektteil-Linien </Name> 
<Path> u:/atkis/jade_weser_port/arcview/F104_It.shp </Path> 
<CreationDate> 15. April 2004 16:56:52 </CreationDate> 
<Keywords> 
220 Straße 
20 Bundesautobahn 
56 Landesstraße, Staatsstraße 
49 Forststraße 
331 Gemeindestraße 
sonst. Straße 
223 Weg 
51 Fahrbahn 
</Keywords> 
<Number of Entities> 1611 </Number of Entities> 
<ShapeType> Polylines </ShapeType> 
<Min X> 3437212.720 </Min X> 
<Min Y> 5934313.340 </Min Y> 
<Max X> 3445062.260 </Max X> 
<Max Y> 5944527.390 </Max Y> 
</Metadata> 
Figure 2. Metadata for the displayed road network data set, 
distinguishing different types of road (in German) 
  
  
  
All available data are analysed to acquire the keywords. Text 
files are checked to identify street names and designations of 
regions. Captions often give a glimpse of the character of the 
stored geographical elements, as well as the names of the 
attributes in the dbf files. 
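
Keyword lists with counts, like the one in Figure 2, could be derived by tallying the object-type codes of the dbf records and translating them through a legend. The following Python sketch assumes that the records are already available as dictionaries and that the legend is a plain code-to-label mapping. The function name `derive_keywords`, the default field name and the labels in the usage example are hypothetical; they are not taken from the SPIRIT implementation or from the ATKIS object catalogue.

```python
from collections import Counter


def derive_keywords(records, legend, code_field="OBJART"):
    """Count records per object-type code and label each code via a legend.

    records: iterable of dicts, one per dbf row (assumed already parsed).
    legend:  dict mapping a code to a human-readable label (assumed to
             come from a legend such as the one in an ArcView project).
    Returns a list of (count, label) pairs, most frequent first.
    """
    counts = Counter(rec[code_field] for rec in records)
    return [
        (n, legend.get(code, f"unknown ({code})"))
        for code, n in counts.most_common()
    ]


# Usage with made-up rows and a made-up legend (labels are illustrative only).
rows = [{"OBJART": "3102"}, {"OBJART": "3106"}, {"OBJART": "3106"}]
legend = {"3102": "Weg", "3106": "Fahrbahn"}
keywords = derive_keywords(rows, legend)
```

Codes without a legend entry are kept and marked as unknown rather than dropped, so that a missing legend does not silently hide part of the data set.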
The spatial extent of the data set is determined by the minimum 
bounding box. Moreover, there are also some indicators from 
which the scale or the level of detail of the data set can be 
inferred. Analysing only the geometry of features, a simple 
measure for the scale of a data set is the distance between the 
individual points of which a line or a polygon is composed. 
Furthermore, the existence and type of certain geographic 
elements also indicate a certain resolution: buildings are 
typically only present in large scales, and roads are typically 
represented as areal objects in large scales, whereas in small 
scales they are given in the form of polylines. 
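
The point-distance measure mentioned above can be sketched in a few lines of Python. We assume here that each line or polygon outline is given as a sequence of (x, y) pairs in a projected coordinate system, so that Euclidean distances are meaningful. The function name is our own, and the choice of the mean rather than, say, the median is an assumption made for illustration.

```python
import math


def mean_vertex_spacing(lines):
    """Average distance between consecutive vertices over all lines.

    lines: iterable of vertex sequences, each a list of (x, y) tuples
           in projected coordinates (e.g. metres).
    """
    total = 0.0
    segments = 0
    for line in lines:
        # Pair each vertex with its successor along the line.
        for (x0, y0), (x1, y1) in zip(line, line[1:]):
            total += math.hypot(x1 - x0, y1 - y0)
            segments += 1
    if segments == 0:
        raise ValueError("no line with at least two vertices")
    return total / segments
```

A small value suggests densely digitised, large-scale data, while a large value points to a generalised, small-scale data set. Any threshold between the two would have to be calibrated on known data sets.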