initial effort should produce descriptions of 6000 to 10,000 documents of this type, with annual
increments of 1000 to 2000.
Depending on the configuration chosen for the network (discussed below), we can envision a database
generated essentially by downloading and reformatting journal articles, with "local" input for other
types of documents and for articles from Europe and developing countries. The number of reformatted
articles is difficult to estimate, as it would depend on future agreements; for reference, RESORS has
60,000 records, PASCAL 16,000 and GEOBASE 8000. The cost of reformatting the CEGET database, merged
with IBISCUS in 1987-8, is estimated at FF 17.00 per record, excluding the cost of the magnetic media
and on the basis of 25,000 records. This figure can be used to guide later agreements.
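As an illustration of how the per-record figure scales, the following minimal sketch multiplies it out. The function name is hypothetical, and applying the CEGET rate to the full size of the other databases is an assumption made only to show an upper bound; the actual number of reformatted records would depend on the agreements reached.

```python
def reformatting_cost(records: int, cost_per_record_ff: float = 17.00) -> float:
    """Reformatting cost in French francs, excluding magnetic media."""
    return records * cost_per_record_ff


# The CEGET basis quoted in the text: 25,000 records at FF 17.00 each.
ceget_total = reformatting_cost(25_000)
print(f"CEGET basis: FF {ceget_total:,.2f}")

# Assumption: the same rate applied to every record of each source database.
# This is an upper bound for illustration, not an estimate from the report.
source_sizes = {"RESORS": 60_000, "PASCAL": 16_000, "GEOBASE": 8_000}
upper_bounds = {name: reformatting_cost(n) for name, n in source_sizes.items()}
for name, cost in upper_bounds.items():
    print(f"{name}: at most FF {cost:,.2f}")
```

On the CEGET basis this gives FF 425,000 for 25,000 records.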
3.1.2 - Analytical or non-analytical?
This fundamental issue deserves serious consideration. The respective arguments are well-known;
most databases, for example RESORS, are non-analytical (i.e. do not contain abstracts), while PASCAL
and GEOBASE are generally analytical. It is thus clear that reasonably detailed indexing is necessary.
We suggest that a basic indexing thesaurus be implemented fairly rapidly so that the first 500 records
entered in the database can be indexed. Such basic indexing with controlled terms will be accompanied,
if necessary, by additional indexing using candidate descriptors. After a careful evaluation of the
frequency of usage, these will enable a more comprehensive and sophisticated thesaurus to be
produced. This is now the conventional method by which thesauri evolve dynamically.
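The frequency evaluation of candidate descriptors could be sketched as follows. The record layout, the sample terms and the promotion threshold are all hypothetical; the text does not specify how frequency of usage would be judged.

```python
from collections import Counter


def frequent_candidates(records, controlled_terms, min_records=3):
    """Return candidate descriptors used in at least `min_records` records.

    `controlled_terms` is a set of lowercase terms already in the basic
    thesaurus; candidates that duplicate them are ignored.
    """
    counts = Counter()
    for record in records:
        # Normalize first so spelling variants within one record count once.
        normalized = {term.strip().lower() for term in record["candidates"]}
        for term in normalized:
            if term not in controlled_terms:
                counts[term] += 1
    return sorted(term for term, n in counts.items() if n >= min_records)


# Hypothetical sample records with free-text candidate descriptors.
records = [
    {"id": 1, "candidates": ["Thermal infrared", "land cover"]},
    {"id": 2, "candidates": ["thermal infrared"]},
    {"id": 3, "candidates": ["thermal infrared ", "soil moisture"]},
]
controlled = {"land cover"}
promoted = frequent_candidates(records, controlled, min_records=3)
print(promoted)
```

Counting each term once per record, rather than once per occurrence, keeps a single heavily indexed document from promoting a term on its own.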
Producing a basic indexing thesaurus appears feasible, since the job is partly facilitated by previous
work, including the RESORS dictionary, the GDTA classification scheme, the ITC remote sensing CDU
and the PASCAL-GEODE glossary (specific thesauri), and the IBISCUS-CEGET Thesaurus for generic
terms. We propose that a preliminary digest of these glossaries be prepared and translated into the three
ISPRS-IRS languages (English, French and German).
The final specifications will set out the potential and the cost of machine translation of the keywords.
The advantage is that indexing can be done in one language, with the keywords being translated at a
later stage. This will partly depend on the type of management office and on the host to be adopted.
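Indexing in one language and translating the keywords afterwards can be pictured as a lookup in a multilingual glossary. The sketch below is an assumption about how such a step might look, not a description of any host's software; the glossary content and function name are hypothetical.

```python
# Hypothetical trilingual glossary keyed by the English descriptor.
GLOSSARY = {
    "remote sensing": {"fr": "télédétection", "de": "Fernerkundung"},
    "photogrammetry": {"fr": "photogrammétrie", "de": "Photogrammetrie"},
}


def translate_keywords(keywords, target):
    """Translate English keywords; return (translated, missing).

    Keywords absent from the glossary are returned in `missing` so that
    they can be handled by hand instead of being silently dropped.
    """
    translated, missing = [], []
    for keyword in keywords:
        entry = GLOSSARY.get(keyword.strip().lower())
        if entry and target in entry:
            translated.append(entry[target])
        else:
            missing.append(keyword)
    return translated, missing


french, unresolved = translate_keywords(
    ["Remote sensing", "photogrammetry", "soil moisture"], "fr"
)
print(french, unresolved)
```

Because the lookup only works for controlled terms, candidate descriptors would still need manual translation until they enter the thesaurus.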
The physical description of the documents (cataloging) will follow the international standards currently
in force. A cataloging and indexing/scanning manual and a form will be produced by the working group
within the framework of the executive guideline (this work is also linked to the choice of alternatives
put forward in § 4).
Indexing a record, preparing the form and entering the data take on average 30-35 minutes of work.
Creating a record with an abstract requires 50-60 minutes. These figures should permit an evaluation
of the human resources that the members of the network will need to provide, and they bear on the
choice between analytical and non-analytical.
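The per-record times translate into workload as in the sketch below. The record counts reuse the 6000 to 10,000 range quoted earlier purely as an illustration; the function name is hypothetical and the result covers indexing and data entry only.

```python
def person_hours(records: int, minutes_low: float, minutes_high: float):
    """Return the (low, high) range of person-hours for `records` records."""
    return records * minutes_low / 60, records * minutes_high / 60


# Non-analytical record: 30-35 minutes; with an abstract: 50-60 minutes.
for n in (6_000, 10_000):
    plain = person_hours(n, 30, 35)
    with_abstract = person_hours(n, 50, 60)
    print(f"{n} records: {plain[0]:.0f}-{plain[1]:.0f} h without abstracts, "
          f"{with_abstract[0]:.0f}-{with_abstract[1]:.0f} h with abstracts")
```

For 6000 records this gives 3000 to 3500 hours without abstracts and 5000 to 6000 hours with them, which shows how strongly the analytical option weighs on staffing.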
3.1.3 - Official language
Alternative indexing language(s) were mentioned above, together with the possibility of machine
translation using suitable software and multilingual glossaries, as done by PASCAL. Decisions must
nevertheless be taken as to (a) whether records are to be presented both with the original title and with
the title translated into a to-be-determined language, (b) whether the bibliographic details are to be
recorded in English (or in French or German), and (c) in which language a possible abstract is to be
drafted, this depending on whether the database is analytical or non-analytical (the use of two languages
could also be considered for the drafting of abstracts). These points require further attention.
3.2 - Access to database, dissemination
3.2.1 - Types of products
The products most commonly generated by computerized systems are well known, as most producers
offer them: interactive interrogation, answer-providing services and Selective Dissemination of
Information (SDI), and printed bulletins with several indexes obtained by sorting on one or more
fields. Decisions on such matters are closely related to the type of management office to be set
up, since an SDI answer-providing service, though very useful, in particular for facilitating access from