(2a) Accuracy / Quality
Chairman: David A. Hastings
NGDC, USA
An overview of this discussion can be summarized by the following two points:
1. Do data actually represent what they claim to represent?
2. From their metadata/documentation, can we learn what we need to know about the data? Or is the
documentation merely written as a diary of the developer's efforts (like a secondary school chemistry
laboratory report to the teacher)?
Individual discussion topics:
1. In the past, documentation served mainly as a diary to remind the producer of the process. Little
soul searching of possible artifacts and the implications of these artifacts was included in
documentation. Indeed, production schedules and budgets for data rarely permitted adequate
resources for such soul searching. But in current times, when someone can spill hot coffee on her
lap and successfully sue for damages from the producer of the coffee, documentation for data may
require an overhaul. Disclaimers may need to be included with data - with such disclaimers backed
up with substantial documentation educating the user about possible consequences of misuse of the
data.
2. Documentation, metadata, visualizations of data, comparative "validation" discussions should all
be inseparable from a dataset. Version numbers for data (similar to version numbers for software),
dates, authorship, peer-review, history of development (including documentation of all stages of
revision) that is sufficiently detailed to permit repetition of the data processing, should all be bundled
with the data. Version control (ultimately at the grid-cell or object level) should be included. One
possibility would be for a complete suite of files to accompany a dataset, such as in a UNIX tar file,
or a DOS "zip"ped file. Associated metadata and documentation should evolve into an anticipated
format, so that users will know that they have not received the entire package if certain information is
missing. NOAA/NGDC's Global Change Data Base includes data, metadata, and documentation as a
beginning prototype of this approach. Another partial model of embedding metadata with data is the
Hierarchical Data Format (HDF).
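The bundling idea above can be sketched in a few lines of Python. This is a minimal illustration, not any standard format: the `bundle_dataset` helper and the metadata field names (version, authors, history, known artifacts) are hypothetical, chosen to mirror the elements the discussion calls for.

```python
import json
import tarfile
from pathlib import Path

def bundle_dataset(data_path, metadata, bundle_path):
    """Package a data file together with its metadata/documentation in one
    tar archive, so the two cannot become separated in distribution."""
    data_path = Path(data_path)
    # Write the metadata record alongside the data file before archiving.
    meta_file = data_path.with_suffix(".meta.json")
    meta_file.write_text(json.dumps(metadata, indent=2))
    with tarfile.open(bundle_path, "w:gz") as tar:
        tar.add(data_path, arcname=data_path.name)
        tar.add(meta_file, arcname=meta_file.name)
    return bundle_path

# Illustrative metadata record: version number, authorship, processing
# history detailed enough to repeat the processing, and known artifacts.
example_metadata = {
    "title": "example global grid",
    "version": "1.2",
    "authors": ["A. Producer"],
    "history": [
        {"step": "resample source to 0.5-degree grid", "date": "1995-03-01"},
    ],
    "known_artifacts": ["striping along scan lines"],
}
```

A user receiving such an archive can check for the expected companion files and refuse to proceed if the metadata is missing, which is the behavior an anticipated package format would make routine.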
3. Are current descriptions of map accuracy standards viable for analogue or digital data? There was
a nagging suspicion that such standards have received inadequate challenge, or adaptation to user
needs and producer realities.
4. Error propagation needs to be handled better in database documentation & metadata.
5. The generation of datasets and the assessment of the accuracy of such datasets are often entirely
different processes. Indeed, in many cases, they should be independent, and conducted by different
groups of people.
6. Describing accuracy/quality against the temporal/spatial character of data should be facilitated by metadata standards,
and should be automatically part of dataset documentation. Included should be descriptions of
artifacts, and discussions of the impacts of these artifacts. Dataset descriptions should not be limited