(2b) Data Availability, Data Access, and Dissemination Rights
Chairman: Dennis S. Ojima
Colorado State University, USA
Rapporteur: Robert Kremer
Colorado State University, USA
The group discussed the three areas assigned to it as being of concern to producers, archivists, and end
users of global-scale data sets: Availability, Accessibility, and Legality/Dissemination. While the Legality/
Dissemination category is self-explanatory, the difference between Availability and Accessibility
warrants some clarification. In addition to these topics, database documentation and the
authorship of data sets emerged as issues of importance.
The term "availability" was used in reference to spatial data sets that have been, or will be, created to
be generally usable by any end-user community such as modelers, economic analysts, or
biogeographers. The term "accessibility" refers to the logistics and mechanisms by which a recipient
of available data may actually learn of and procure the data for use. Although these three categories
are distinct, and each can be addressed as a separate problem, it became apparent almost
immediately in the ensuing discussion that there is considerable overlap among them, and the
discussion became a synthesis of critical concerns in the creation and dissemination of global data
sets.
A key issue introduced at the outset was that of academic or publication credit given to
creators of data sets that may have broad scientific applicability. Currently there is no system of
reward or recognition for the incredibly time-consuming and integral task of processing and
manipulating data. The general opinion was that the peer-reviewed journal community
treats data-creation manuscripts with a lukewarm response, and the best that people who spend large
amounts of time processing and/or creating data can hope for is co-author status after the data has
been incorporated into applied research, which can be years later. The group proposes that data
processing be treated more as a publishable science when the manuscripts submitted describe
widely applicable techniques or usable products. This discussion spurred some relevant concerns
about the actual "peer review" process when it comes to publishable data sets.
Scientific peer review of large or complex data sets can be a formidable task if the review entails
testing the actual procedures used to create data sets, such as mathematical derivations of spatial
digital images. Therefore, whether a data set is made publicly accessible through technical reports,
online information, or through journal publication, the peer review process should focus on scrutiny
of the documentation, the applicability of processing procedures to the specific data set being created
or the original data that has been compiled or manipulated, and the validity of any and all assumptions
the authors may have made. Conversely, documentation should provide the utmost detail of all
assumptions and procedures used to create the data. (This, however, relates back to the problem of
much time spent with no academic credit.) In other words, owing to the complexity and wide range
of approaches used to process and create data, special consideration must be given to what
constitutes adequate peer review; this need not result in more lax scrutiny, only in a focus on the
validity of the procedures involved.
Another concern of the group was a lack of knowledge of what data is actually available. The
discussion revolved largely around the Internet as the present and future vehicle for the dissemination