Data model

Deliverable: D6.4 Data model (first final version) Issue: 1.0 Date of issue: 17 November 2000
Reynard IST-1999-10562 1
RENARDUS: PROJECT DELIVERABLE
Project Number:
IST-1999-10562
Project Title:
Reynard - Academic Subject Gateway Service Europe
Deliverable Type:
Internal
Deliverable Number:
D6.4
Contractual Date of Delivery:
30 September 2000
Actual Date of Delivery:
17 November 2000
Title of Deliverable:
Data model (first final version 1.0)
Workpackage contributing to the Deliverable:
WP6
Nature of the Deliverable:
Report
URL:
http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/index.html
(restricted access)
http://renardus.sub.uni-goettingen.de/ (public access)
Authors:
Hans Jürgen Becker, Frank Klaproth, Heike Neuroth
Contributions:
Michael Day (UKOLN, text); Anders Ardo and Traugott Koch
(DTV/NetLab, discussions).
Contact Details:
Platz der Göttinger Sieben 1
37073 Göttingen
Germany
email: eu-fuchs@www.sub.uni-goettingen.de
Abstract
This report provides an introduction to the development of a Renardus
Application Profile. It is based on the partners’ answers to the
D6.4 questionnaire developed by SUB. The answers lead to the
development of several data models: a data model for the Renardus
prototype pilot system, a first version of the data model for the
operational pilot system, and a data model for the administrative
database. Besides the mapping tables for cross-browsing, this database
contains tables for the conversion of some codes to the defined
Renardus codes and the collection description of each subject
gateway. Finally, this report contains some upgrade recommendations
for partners’ metadata information.
Keywords
data model, data flow, subject gateway, metadata, profile, application
profile, namespace, Renardus, Reynard
Distribution List:
All partners
Issue:
1.0
Reference:
IST-1999-10562 / D6.4 / 1.0
Total Number of Pages:
62
TABLE OF CONTENTS
PART I TITLE PAGE
1 RESULTS
1.1 Agreement on eight elements
1.2 Results of the second questionnaire developed for D6.4
1.2.1 Eight Elements for Cross-Searching
1.2.1.1 General (0)
1.2.1.2 Title (1)
1.2.1.2.1 Title/Title.Alternative (1.1 – 1.6)
1.2.1.3 Creator (2)
1.2.1.3.1 Creator: general (2.1)
1.2.1.3.2 Creator: rules (2.2 – 2.9)
1.2.1.3.3 Creator: additional information (2.10 – 2.16)
1.2.1.4 Description (3)
1.2.1.4.1 Description: general (3.1)
1.2.1.4.2 Description: description + keywords (3.2 – 3.5)
1.2.1.4.3 Description: multilinguality (3.6)
1.2.1.5 Subject (4)
1.2.1.5.1 Subject: keywords – general (4.1 – 4.2)
1.2.1.5.2 Subject: form of keywords (4.3 – 4.7)
1.2.1.5.3 Subject: keywords – multilinguality (4.8)
1.2.1.5.4 Subject: keywords – rules (4.9)
1.2.1.5.5 Subject: classification – general (4.10 – 4.15)
1.2.1.5.6 Subject: classification system - cross-search with regard to a special subject classification (4.16 – 4.20)
1.2.1.5.7 Subject: classification systems – multilinguality (4.21)
1.2.1.6 Identifier (5)
1.2.1.6.1 Identifier: general - regarding resources in several languages (5.1 – 5.2)
1.2.1.6.2 Identifier: general - regarding mirrored/copied resources (5.3 – 5.5)
1.2.1.6.3 Identifier: Qualifier (5.6 – 5.9)
1.2.1.7 Language (6)
1.2.1.7.1 Language: general (6.1)
1.2.1.7.2 Language: code (6.2 – 6.4)
1.2.1.8 Country (7)
1.2.1.8.1 Country: general (7.1 – 7.3)
1.2.1.8.2 Country: code (7.4 – 7.5)
1.2.1.9 Type (8)
1.2.1.9.1 Type: general (8.1 – 8.5)
1.2.2 Future Elements
1.2.2.1 Rights (9.1 – 9.7)
1.2.2.2 Publisher (10)
1.2.3 Additional Elements
1.2.4 Administrative Elements
1.2.4.1 Subject Gateway ID (IV A)
1.2.4.2 Unique Record Number (IV B)
1.2.4.3 Record Creator (IV C)
1.2.4.4 SBIG ID (IV D)
1.2.4.5 Record Last Checked Date (IV E)
1.2.4.6 Other (IV F)
1.3 Subject Gateways in the UK
1.3.1 RDN
1.3.2 Individual RDN hubs
1.3.2.1 BIOME
1.3.2.2 EEVL
1.3.2.3 Humbul
1.3.2.4 PSIgate
1.3.2.5 SOSIG
2 DATA MODEL AND DATA FLOW
2.1 Data model for the prototype Renardus pilot system
2.1.1 Dublin Core Elements
2.1.1.1 DC.Title and DC.Title.Alternative
2.1.1.2 DC.Creator
2.1.1.3 DC.Description
2.1.1.4 DC.Subject: classification system(s) and keywords
2.1.1.5 DC.Identifier
2.1.1.6 DC.Language
2.1.1.7 DC.Type
2.1.2 Non Dublin Core element
2.1.2.1 Country
2.1.3 Administrative Renardus elements
2.1.3.1 Full Record URL
2.1.3.2 SBIG ID
2.2 Preliminary version of data model for the operational Renardus pilot system
2.2.1 Dublin Core Elements
2.2.1.1 DC.Title and DC.Title.Alternative
2.2.1.2 DC.Creator and DC.Creator.AdditionalInformation
2.2.1.3 DC.Description
2.2.1.4 DC.Subject: classification system(s) and keywords
2.2.1.5 DC.Identifier
2.2.1.6 DC.Language
2.2.1.7 DC.Type
2.2.2 Non Dublin Core element
2.2.2.1 Country
2.2.3 Administrative Renardus elements
2.2.3.1 Full Record URL
2.2.3.2 SBIG ID
2.3 Data model of the administrative database: Collection Level Description (CLD)
2.4 Data flow
3 Appendix A: Questionnaire. Renardus questionnaire D6.4: Data model and data flow (http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/questionnaires/all.html)
4 Appendix B: Responses. Questionnaire: Responses from the partners (http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/index.html)
5 Appendix C: Comments of Partners
6 Appendix D: Summary. Summary of responses (matrix): http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/summary_d6_4.pdf
7 Appendix E: Data Model and Data Flow. Data model and data flow, draft version 0.3 (4 September 2000): http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/data_model.pdf
8 BIBLIOGRAPHY
9 REFERENCES
PART II - MANAGEMENT OVERVIEW
DOCUMENT CONTROL
Issue | Date of Issue | Comments
0.1 | 10 May 2000 | First draft, presented to partners at the Bath meeting (Excel sheet)
0.2 | 12 May 2000 | Second draft, presented at the first SCHEMAS workshop in Bath
0.3 | 8 September 2000 | Third draft, for review by project partners at the Paris meeting
0.4 | 6/7 November 2000 | Fourth draft, for review by project partners at the Göttingen meeting
1.0 | 17 November 2000 | First final version
EXECUTIVE SUMMARY
The objective of the Renardus project is to establish an academic subject gateway service in Europe. The pilot
system will be based on a generic broker architecture and data model that will allow the integrated searching
and browsing of distributed resource collections.
This report provides background information about the development of the Renardus data model and data
flow. It is based on the partners’ answers to the D6.4 questionnaire developed by SUB. Michael Day
(UKOLN) presents background information about RDN and the individual hubs. The answers lead to a data
model for the Renardus prototype pilot system and a first version of the data model for the operational pilot
system.
The questionnaire was provided to the following ten partners: DutchESS (The Netherlands), NOVAGate
(Nordic countries), EELS (Sweden), DEF fagportal (Denmark), DAINet (Germany), FVL (Finland), Les Signets
(France), RDN (United Kingdom), DDB (Germany) and SSG-FI (Germany).
The answers of the partners are summarized in the following list. Only responses with the highest priority
(required, strongly recommended and recommended) are considered:
Title/Title.Alternative:
- The main Title should not be repeatable, the Title.Alternative element should be repeatable, and both Title
and Title.Alternative should be cross-searchable.
- Title should be provided in the language of the resource; additional titles (translated title, acronym, etc.)
should be provided in repeatable Title.Alternative elements.
Creator:
- Creator should be a repeatable element.
Description:
- The Description element should be repeatable in case the description is provided in more than one language.
- Each Subject Gateway should provide either an English version of Description or an English version of
Keywords for every resource (besides other languages).
Subject:
- Keywords should be browsable and repeatable.
- All forms of the repeatable element Keyword (free, controlled, thesaurus based) should be provided, and the
form of the Keywords should be indicated to the user.
- The Subject Gateways should be browsable via a common Classification System. Renardus should use an
existing common Classification System, and this system should be DDC (all partners map their system to
DDC). The Classification System should be provided in several European languages.
- The verbal description of each notation (caption) should be indexed together with the keywords, so that users
can search both. Besides the common Classification System, Renardus should provide subject classification
systems such as MSC and Ei, which should be cross-searchable via notations and captions as well.
Identifier:
- Identifier should be repeatable and searchable if the resource is provided in more than one language with
different URLs.
- Renardus should integrate URLs, ISBNs, ISSNs and PURLs in Identifier elements with different qualifiers.
Language:
- The Language element should be repeatable and the language code should be the three-letter ISO 639 code.
Country:
- Country should reflect the publisher country and the country code should be the two-letter ISO 3166 code.
Types:
- Renardus should develop a common, controlled list of Types, and this list should be based on the Dublin
Core type list.
Future Elements:
- Renardus should support the Rights element in the future (in the sense of IPRs). Rights should contain
information about access conditions/restrictions of the resource as well as copyright/IPR information, and
should be a repeatable element for different kinds of information (access conditions/restrictions,
subscription information, copyright, IPR, etc.).
- Renardus should use the element Rights with different qualifiers for different kinds of information.
- Renardus should support a Publisher element in the future.
On the basis of partners’ answers several data models have been developed. The Renardus broker system will
consist of two databases:
1) Renardus decentral content database, which contains records extracted from each individual Service
Provider (can consist of several Subject Gateways). The data model for this database consists of seven well
defined metadata elements, which are based on Dublin Core, one non-DC metadata element (Country), and
two administrative elements (Full Record URL and SBIG ID).
There are two versions of the data model: one for the prototype pilot system and one for the operational pilot
system. The following tables list the Renardus metadata elements for these two systems. In the Obligation
column, M = mandatory, R = strongly recommended and O = optional; in the Repeatable column, NR = not
repeatable and R = repeatable; LQ = Language Qualifier:
Prototype Pilot System:
Metadata Element | Obligation | Repeatable | LQ | Comments
DC.Title | M | NR | possible | -
DC.Title.Alternative | O | R | possible | -
DC.Creator | R | R | no | Last name and first name should be clearly distinguishable.
DC.Description | M | R | possible | For cross-search reasons the field description must contain free text.
DC.Subject | M | R | possible | In the prototype system there will be no further distinction between the several kinds of subject (keywords, classification system).
DC.Subject:DDC | M | R | no | DDC 21: adapted DDC version for cross-browsing purposes. Only captions, not notations, will be displayed.
DC.Identifier | M | R | no | In the prototype system no distinction will be made between resource URL, mirrored or copied resource URL(s), and URL(s) for archive reasons.
DC.Language | R | R | no | The language code is the ISO 639-2 three-letter code.
DC.Type | R | R | no | Subject Gateways should provide their original types without encoding scheme.
DC.Type.DCT1 | R | R | no | -
Country | R | NR | no | ISO 3166-1 (two-letter code)
Full Record URL | R | NR | no | A URL that leads to a detailed display of each record at the originating service site.
SBIG ID | M | NR | no | A stable unique acronym, also defined in the Collection Level Description.
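The obligation and repeatability rules of the prototype table can be expressed as a small machine-readable schema. The following is a minimal Python sketch, not part of the Renardus specification: the names ElementSpec, PROTOTYPE_ELEMENTS and validate_record are hypothetical, and the record layout (a dictionary mapping element names to lists of values) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSpec:
    name: str
    obligation: str   # "M" = mandatory, "R" = strongly recommended, "O" = optional
    repeatable: bool  # True = R (repeatable), False = NR (not repeatable)


# Values copied from the prototype pilot system table above.
PROTOTYPE_ELEMENTS = [
    ElementSpec("DC.Title", "M", False),
    ElementSpec("DC.Title.Alternative", "O", True),
    ElementSpec("DC.Creator", "R", True),
    ElementSpec("DC.Description", "M", True),
    ElementSpec("DC.Subject", "M", True),
    ElementSpec("DC.Subject:DDC", "M", True),
    ElementSpec("DC.Identifier", "M", True),
    ElementSpec("DC.Language", "R", True),
    ElementSpec("DC.Type", "R", True),
    ElementSpec("DC.Type.DCT1", "R", True),
    ElementSpec("Country", "R", False),
    ElementSpec("Full Record URL", "R", False),
    ElementSpec("SBIG ID", "M", False),
]


def validate_record(record, specs=PROTOTYPE_ELEMENTS):
    """Return a list of problems found in a record.

    `record` maps element names to lists of values (assumed layout).
    """
    problems = []
    known = {spec.name for spec in specs}
    for spec in specs:
        values = record.get(spec.name, [])
        if spec.obligation == "M" and not values:
            problems.append(f"missing mandatory element: {spec.name}")
        if not spec.repeatable and len(values) > 1:
            problems.append(f"element not repeatable: {spec.name}")
    for name in record:
        if name not in known:
            problems.append(f"unknown element: {name}")
    return problems
```

A record that carries all mandatory elements once yields an empty problem list; strongly recommended and optional elements may be absent without being reported.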
Operational Pilot System:
Metadata Element | Obligation | Repeatable | LQ | Comments
DC.Title | M | NR | yes | Title should be the original title. It is strongly recommended to provide only one version of the title in this field.
DC.Title.Alternative | O | R | yes | -
DC.Creator | R | R | no | Last name and first name should be clearly distinguishable.
DC.Creator.Additional.Information | O | R | no | Additional information like Email, URL, Organisational Information.
DC.Description | M | R | yes | For cross-search reasons the field description must contain free text. Strongly recommended: Each SG should provide either an English version of description or an English version of keywords for every resource (besides other languages).
DC.Subject | M | R | yes | In the operational system a distinction will be made between the several kinds of subject (keywords, classification system). For the final system the provision of keywords is required.
DC.Subject:DDC | M | R | no | DDC 21: adapted DDC version for cross-browsing purposes. Only captions, not notations, will be displayed.
DC.Identifier | M | R | no | In the operational system a distinction will be made between resource URL, mirrored or copied resource URL(s), and URL(s) for archive reasons.
DC.Identifier.Mirror | O | R | no | -
DC.Identifier.Archive | O | NR | no | -
DC.Language | R | R | no | The language code is the ISO 639-2 three-letter code.
DC.Type | R | R | no | Subject Gateways should provide their original types without encoding scheme.
DC.Type.DCT1 | R | R | no | -
DC.Type.DCT2 | O | R | no | The possibility and usability of a mapping to DCT2 will be investigated in WP 7.
Country | R | NR | no | ISO 3166-1 (two-letter code)
Full Record URL | R | NR | no | A URL that leads to a detailed display of each record at the originating service site.
SBIG ID | M | NR | no | A stable unique acronym, also defined in the Collection Level Description.
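The strong recommendation that each record should carry either an English description or English keywords can be checked mechanically once values carry a language qualifier. The sketch below is only an illustration under stated assumptions: each value is stored as a (text, language code) pair, keywords are stored under DC.Subject, and English is marked with the ISO 639-2 code "eng". The function name has_english_text is hypothetical.

```python
def has_english_text(record):
    """Return True if the record has an English description or English keywords.

    Assumed layout: record maps element names to lists of
    (text, iso_639_2_language_code) tuples.
    """
    for element in ("DC.Description", "DC.Subject"):
        for text, language in record.get(element, []):
            if language == "eng" and text.strip():
                return True
    return False


# Example: a German description with English keywords satisfies the rule.
record = {
    "DC.Description": [("Sammlung forstwissenschaftlicher Quellen", "ger")],
    "DC.Subject": [("forestry", "eng"), ("Forstwirtschaft", "ger")],
}
print(has_english_text(record))
```

A broker could run such a check during normalization and report the records that offer no English access point, rather than rejecting them, since the rule is a recommendation and not an obligation.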
2) Renardus administrative database, which contains the collection description of each subject gateway, the
mapping tables for cross-browsing the metadata via the common classification system DDC, and tables for
converting some codes (probably language, country, and type) to the defined Renardus codes. The metadata
elements for this database are based on the RSLP collection description schema. The aims of the
collection description are to support the selection of subject gateway(s) for searching, to provide
background information about the participating subject gateways for human and machine users, and to
promote/register the individual subject gateway(s) as high quality resources on the Internet. The following
list provides the elements of the Renardus Collection Level Description schema:
Title: The name of the collection.
Identifier: An unambiguous reference to the collection within a given context.
Description: An account of the content of the collection.
Language: The main language(s) of the metadata in the collection with quantitative indication.
Publisher: An entity responsible for making the collection available.
Format.Extent: The size of the collection.
Date.Issued: Date of formal issuance (e.g. publication) of the collection.
Subject: The topic of the content of the collection.
Subject Notation: The topic of the content of the collection.
Relation: A reference to a related resource.
Country: The country in which the collection is physically located.
Acronym: The acronym of the collection.
Resource Language: Language(s) of the described resources.
DDC mapping URL: URL of local DDC mapping information in Renardus format.
Z39.50 Location: The online location of the Z39.50 server of the subject gateway.
Logo URL: The URL of the logo (image) of the subject gateway.
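Besides the collection descriptions, the administrative database is described above as holding tables for converting some local codes to the defined Renardus codes. A minimal sketch of such a conversion for language codes follows. The table content, the choice of the bibliographic variant of ISO 639-2 (e.g. "ger" rather than "deu"), and the names LANGUAGE_TO_RENARDUS and to_renardus_language are assumptions for illustration; the report does not define the actual tables.

```python
# Hypothetical conversion table: two-letter ISO 639-1 codes mapped to
# three-letter ISO 639-2 codes (bibliographic variant, an assumption).
LANGUAGE_TO_RENARDUS = {
    "en": "eng",
    "de": "ger",
    "fr": "fre",
    "nl": "dut",
    "sv": "swe",
    "da": "dan",
    "fi": "fin",
}


def to_renardus_language(local_code, table=LANGUAGE_TO_RENARDUS):
    """Convert a local language code to the three-letter code, or None if unknown."""
    code = local_code.strip().lower()
    if code in table.values():
        # Already a three-letter code known to the table.
        return code
    return table.get(code)
```

Returning None for unknown codes leaves the decision to the caller, for example to log the record for manual review instead of silently dropping the language value.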
Some recommendations for upgrading partners’ metadata information are provided. If the element Keyword is
not yet part of a partner’s data model, it is recommended to provide this element first for the normalization
process. In the future the title will be required in its original version; other forms of the title can be given
in the Title.Alternative field. It is still undecided whether an English version of the title will be required in
the future, either in the Title field or in the Title.Alternative field.
Considering which further element all partners should support, it is recommended that all partners support the
Country element. It seems to be easier to extract the country code from the domain of a URL than to support a
language code.
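The remark about extracting the country code from the domain of a URL can be illustrated as follows. This is a hedged sketch, not a Renardus procedure: the function name country_from_url and the exception table are hypothetical, and a country-code domain only indicates where a name is registered, which may differ from the publisher country the element is meant to reflect. Generic domains such as .com or .org carry no country information and yield None.

```python
from urllib.parse import urlparse

# Country-code domains that differ from the ISO 3166-1 alpha-2 code.
# Only the well-known .uk / GB case is listed here; the table is illustrative.
CCTLD_EXCEPTIONS = {"uk": "GB"}


def country_from_url(url):
    """Guess an ISO 3166-1 two-letter country code from a URL's top-level domain.

    Returns None when the host has no two-letter top-level domain.
    """
    host = urlparse(url).hostname
    if not host:
        return None
    tld = host.rsplit(".", 1)[-1].lower()
    if len(tld) != 2 or not tld.isalpha():
        return None
    return CCTLD_EXCEPTIONS.get(tld, tld.upper())


print(country_from_url("http://www.sub.uni-goettingen.de/"))  # DE
print(country_from_url("http://www.kb.nl/"))                  # NL
```

The result is best treated as a default that a cataloguer can override, since a two-letter domain is not guaranteed to correspond to an assigned ISO 3166-1 code.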
In conclusion, if partners have to upgrade their metadata information it is strongly recommended to include
keywords first, then country, followed by type and language.
All three data models will be updated in the future. Over the next months they will lead to a final version of
the Renardus Application Profile, which will be described in the public report D6.5, to be
delivered in June 2001.
SCOPE STATEMENT
This report is the second internal deliverable (beside two public deliverables: D6.1 and D6.2) to be issued by
WP6 (Data model and data flow) of the Renardus project. The objective of WP6 is to develop the data model
that will underpin the Renardus system.
The aim of the questionnaire gateway survey was to analyse the gateway structures and formats of the Renardus
partners. These should lead to the setup of a generic service profile that is needed to record all types of
information about a gateway service. The inventory of the participating services is necessary for the
specifications of functional requirements of the data model (D6.3) and for building the data model (D6.4/D6.5).
This report also provides important input for WP 1 (functional model) and WP 2 (design and
implementation).
PART III - DELIVERABLE CONTENT
INTRODUCTION
This report provides background information about the development of the Renardus data model and data
flow. It is based on the partners’ answers to the second questionnaire. These answers lead to a data model
for the Renardus prototype pilot system and a first version of the data model for the operational pilot system.
The Appendix contains the data provided by the partners, the dynamically generated metadata mapping and
overviews of keywords and classification systems (dynamically generated access databases).
The data model and data flow will be extended in line with the discussions in the Dublin Core community (e.g.
the 8th Dublin Core Workshop), for example those related to agents. Throughout the runtime of the project,
corrections and additions will be incorporated so that the data model and data flow are always up to date.
The report is divided into two main chapters. The first chapter provides an overview of the results of the
second questionnaire. The second chapter introduces the data model for the Renardus prototype pilot system,
for the operational pilot system and for the administrative database (collection description), and presents
a first overview of the data flow.
GLOSSARY
AHRB
Arts and Humanities Research Board.
ALUH
Viikki Science Library, University of Helsinki, Finland.
BIOME
The RDN hub for medicine, health and the life sciences.
BNF
Bibliothèque Nationale de France (National Library of France).
CLD
Collection Level Description.
DAINet
Deutsches Agrarinformationsnetz, Germany.
DC
Dublin Core.
DCMES
Dublin Core Metadata Element Set.
DCMI
Dublin Core Metadata Initiative.
DDB
Die Deutsche Bibliothek (National Library of Germany).
DDC
Dewey Decimal Classification system.
DEF
Danmarks Elektroniske Forskningsbibliotek. Denmark's Electronic Research Library - a virtual library for
researchers, students, lecturers and other users of Danish research institutions, Denmark.
DNER
Distributed National Electronic Resource - the JISC's concept of a managed environment for accessing
heterogeneous, quality-assured information resources on the Internet.
DTV
Technical Knowledge Centre and Library of Denmark.
Dublin Core
An initiative - sometimes known as the Dublin Core Metadata Initiative (DCMI) - to develop a core metadata
element set to facilitate the discovery of digital (networked) resources. Developments in the element set are
defined on the basis of international consensus.
DutchESS
Dutch Electronic Subject Service, The Netherlands.
EELS
Engineering Electronic Library, Sweden.
EEVL
Edinburgh Engineering Virtual Library - one of the eLib-funded Internet information gateways.
eLib
The Electronic Libraries Programme - a series of UK higher education-based networking projects, funded by the
JISC.
ESRC
Economic and Social Research Council.
EULER
European Libraries and Electronic Resources in Mathematical Sciences - a project funded by the European
Union.
FVL
The Finnish Virtual Library - Virtuaalikirjasto, Finland.
HUB
Hubs provide data for RDN. Hubs may be individual organisations or (more frequently) consortia of prominent
library, academic, research and professional organisations.
HUMBUL
The RDN hub for the arts and humanities.
ISO
International Organization for Standardization.
JISC
Joint Information Systems Committee - a strategic advisory committee working on behalf of the funding bodies
for higher and further education in England, Scotland, Wales and Northern Ireland. Its mission is to promote the
innovative application and use of information systems and information technology in higher and further
education across the UK.
JyU
Finnish Virtual Library Project, Jyväskylä University Library, Finland.
KB
Koninklijke Bibliotheek, National Library of the Netherlands.
LCSH
Library of Congress Subject Headings.
MSC
Mathematics Subject Classification.
NetLab
NetLab, Lund University, Sweden.
NOVAGate
Nordic Gateway to Information in Forestry, Veterinary and Agricultural Sciences, Finland.
OMNI
Organising Medical Networked Information - one of the eLib-funded Internet information gateways. Now part
of the BIOME RDN Hub.
PSIgate
RDN hub for physical sciences. The service is still under development.
RDN
The Resource Discovery Network - the RDN is a co-operative network dedicated to providing access to high-
quality Internet resources for the learning, teaching and research community in the UK. The RDN is co-
ordinated by a team based at UKOLN and King's College London.
ROADS
Resource Organisation and Discovery in Subject-oriented services - originally a UK project funded by JISC
under eLib, ROADS is an open-source software toolkit for Internet subject gateways.
RSLP
Research Support Libraries Programme.
SG
Subject Gateway in the sense of a quality-controlled subject gateway, sometimes also called SBIG (Subject
Based Information Gateway).
SOSIG
Social Science Information Gateway - one of the eLib-funded Internet information gateways, now an RDN hub.
SSG-FI
SonderSammelGebiets-FachInformationsführer (Special Subject Gateways), SUB Göttingen, Germany.
SUB
Niedersächsische Staats- und Universitätsbibliothek Göttingen (Lower Saxony State and University Library
Göttingen), Germany.
UKOLN
UK Office for Library and Information Networking, University of Bath, UK.
URN
Uniform Resource Name.
ZADI
Zentralstelle für Agrardokumentation und -information, Germany.
Z39.50
An ANSI/NISO protocol for search and retrieval. Version 3 of the protocol has also been accepted as an ISO
standard - ISO 23950.
Z39.85
Draft Standard Z39.85-200X: The Dublin Core Metadata Element Set.
1 RESULTS
This chapter is divided into three parts. The first part gives a short overview of the agreements made at the
technical meeting in Bath (also recorded in the minutes). The second part summarizes the partners’ answers
to the second questionnaire, which asked about further details of the data model and data flow such as rules,
codes and standards. The third part provides a short outlook on RDN and the individual hubs. The numbers in
brackets after the subheadings refer to the corresponding questions in the questionnaire. The partners’
comments on each section of questions can be found in Appendix C.
1.1 Agreement on eight elements
After finishing the “Evaluation report of partner subject gateways” (see public version D6.1), partners agreed on
eight elements at a technical meeting in Bath on 10 May 2000, without further discussion about rules, codes,
standards, and qualifiers. They also agreed that partner subject gateways will have to support most of these
elements (e.g. if one Subject Gateway supports only seven of the elements, this would be no reason to exclude
it), but this needed more detailed discussion. They agreed further that the data model is based on Dublin Core.
These eight elements are:
- DC.Title - Title.Alternative is probably repeatable
- DC.Creator - repeatable
- DC.Description - repeatable in case descriptions in several languages are provided
- DC.Identifier: URI - possibly repeatable for mirror sites, but this needs further discussion
- DC.Subject - repeatable, and requires a common classification system (either “home-grown” or
mapped to a general system)
- DC.Language - repeatable (needs a common code like ISO 639)
- DC.Type - repeatable: partners will either map their types to Dublin Core types, use DC types with
Renardus specific extensions or develop a “home-grown” list of types with the most common ones
- Country Code - a clear definition is needed, e.g. the publisher country or the country in which the server is
located (also needs a common code like ISO 3166)
Several recommendations were formulated for two further elements, to be considered after the development of
the prototype pilot system:
- DC.Publisher: possibly to be included in the future; will probably not be included in the pilot system
- DC.Rights: possibly to be included in the future, e.g. to give information about copyright and
access/restriction conditions (could also be necessary if print materials etc. are included)
- Rights in the sense of IPRs: probably to be included, so that the SGs keep the copyright of their metadata
records after these are gathered by the broker service
These results were presented by SUB at two conferences: at the first SCHEMAS workshop on 12 May and at
the CULTURAL HERITAGE – CONCERTATION EVENT on 30 June.
In order to specify common rules, codes, standards, and qualifiers for these elements SUB developed a more
detailed questionnaire. In this questionnaire partners were asked for an evaluation of several proposals to qualify
the metadata elements.
1.2 Results of the second questionnaire developed for D6.4
The main purpose of this questionnaire was to gather information about the qualifiers, rules, standards, and
codes of the elements to be supported by the Renardus prototype and the operational pilot system. As the Bath
meeting led only to a basic agreement on eight elements, this questionnaire was intended to provide deeper
insight into how to use them. The results lead into the development of the data model.
The questionnaire was sent out on 3 July and partners were asked to return it to SUB before 14 July.
Because of holidays the last responses arrived at SUB on 24 August.
Two partners (DTV and NetLab) filled in the questionnaire together. Because of the ongoing discussion
at RDN about a centralized structure (RDNC), it was not possible to get common (and official)
information from UKOLN, RDN or the single hubs. SUB and UKOLN are trying to obtain detailed information
from all RDN hubs on the basis of the two questionnaires (D6.1 and D6.4). The results will be presented in an
updated version of D6.4.
Some partners did not fill in the questionnaire completely. Answers given without an evaluation (e.g. only
‘no’) have not been incorporated into the analysis (see also Appendix C) and are not considered in this
report.
The following Renardus partners filled in the questionnaire:
Name | Acronym | URL
National Library of the Netherlands | KB | http://www.kb.nl/
National Library of France | BNF | http://www.bnf.fr/
National Library of Germany | DDB | http://www.ddb.de/
Finnish Virtual Library Project | JyU | http://www.jyu.fi/library/english/index.htm
NetLab, Lund University, Sweden (together with DTV) | NetLab | http://www.lub.lu.se/netlab/
Technical Knowledge Centre and Library of Denmark (together with NetLab) | DTV | http://www.dtv.dk/
Niedersächsische Staats- und Universitätsbibliothek, Göttingen, Germany | SUB | http://www.sub.uni-goettingen.de/
Viikki Science Library, University of Helsinki, Finland | ALUH | http://helix.helsinki.fi/infokeskus/lib/
Zentralstelle für Agrardokumentation und -information, Germany | ZADI | http://www.dainet.de/zadi/
The answers of UKOLN and SOSIG will not be considered here. As mentioned above SUB and UKOLN will
prepare a common view of these issues together and present the results in an updated version of D6.4. A short
overview is given in chapter 1.3.
For the questionnaire and the answers provided by each partner, see Appendices A and B.
Partners had the possibility to answer the questions by giving an evaluation in the following way:
required (1)
strongly recommended (2)
recommended (3)
desirable (4)
not necessary (5)
definitely not (6).
Most questions also asked whether partner subject gateways support the mentioned rule, code etc. now or will
support it in the future. This information will help to find common Renardus metadata element refinements and
encoding schemes.
The numbers in brackets after the subheadings refer to the corresponding questions in the questionnaire. For
each result, the number of SGs that support the item in question now or in the future is given in brackets
after the result.
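The combined labels used in the results below (e.g. "Strongly recommended/recommended") suggest that the individual ratings were condensed into one overall evaluation per question. The report does not state how this was done; the following minimal sketch assumes, purely for illustration, that the median of the partners' ratings is taken. The function name `summarise` and the median rule are assumptions, not part of the questionnaire.

```python
import math
from statistics import median

# Evaluation scale from the questionnaire (1 = strongest requirement).
SCALE = {
    1: "required",
    2: "strongly recommended",
    3: "recommended",
    4: "desirable",
    5: "not necessary",
    6: "definitely not",
}


def summarise(ratings):
    """Return the label for the median rating (ASSUMED aggregation rule)."""
    if not ratings:
        raise ValueError("no evaluations given")
    mid = median(ratings)
    lower, upper = math.floor(mid), math.ceil(mid)
    if lower == upper:
        return SCALE[lower]
    # The median falls between two scale values: report both labels.
    return f"{SCALE[lower]}/{SCALE[upper]}"


# Hypothetical ratings from four partners for one question.
print(summarise([2, 2, 3, 3]))
```

Other rules (mean, mode, majority) would be equally plausible; the sketch only shows how a two-part label can arise from the six-step scale.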
1.2.1 Eight Elements for Cross-Searching
This chapter presents the questionnaire results on rules, codes etc. for the
eight elements (title, creator, description, subject, identifier, language, country, type).
1.2.1.1 General (0)
Partner subject gateways have to support most of the agreed eight elements. To gather information which
elements are required for (future) subject gateways and must be supported, partners were asked to mark these
elements.
The results are summarised in figure 1:
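Figure 1: Evaluation of the requirements for Renardus metadata elements (bar chart "Elements that have to be
supported by each SBIG": number of supporting partners, max. 8, per Renardus element).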
The following metadata elements must be supported by (future) partners:
title, description, subject: keywords, subject: classification system, and identifier.
If a subject gateway provides no keywords, it could be allowed to generate keywords automatically from the
description field. Keywords generated in this way have to meet the quality standards of Renardus, e.g.
use of stop words and control of the automated program.
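Automatic keyword generation from the description field could look roughly like the sketch below. It is only an illustration under stated assumptions: the stop word list is a tiny hypothetical sample, and the selection rule (most frequent remaining words) is one simple choice, not a method defined by Renardus.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list; a real list would be
# maintained per language as part of the Renardus quality standards.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def generate_keywords(description, max_keywords=5, min_length=3):
    """Pick the most frequent non-stop words of a description as keywords."""
    # Extract runs of letters (Unicode aware), ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", description.lower())
    candidates = [w for w in words
                  if w not in STOP_WORDS and len(w) >= min_length]
    counts = Counter(candidates)
    return [word for word, _ in counts.most_common(max_keywords)]


text = "Gateway to the soil science resources and soil maps of the gateway network"
print(generate_keywords(text, max_keywords=2))
```

The limits `max_keywords` and `min_length` are the kind of controls over the automated program that would have to be agreed on.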
The following metadata elements are strongly recommended:
creator, language, country and type.
Partners have to consider that most of these elements must be supported. However, if a subject gateway cannot
provide one of the eight Renardus elements, this alone is no reason to exclude the subject
gateway from the broker system. Each case has to be negotiated with the Renardus team.
1.2.1.2 Title (1)
1.2.1.2.1 Title/Title.Alternative (1.1 – 1.6)
Partners handle the title field in different ways (see public report D6.1): some partners provide the original title
and the translated title in the main title field (e.g. DutchESS), while others use the title alternative field to
provide translated titles or acronyms (e.g. RDN). Another open issue is the language of the title with regard to
cross-searching this field.
Required:
- Title and Title.Alternative should be cross-searchable (supported by all partners)
Strongly recommended:
- The main title should not be repeatable (supported by seven partners)
- Title should be provided in the language of the resource and additional titles (translated title, acronym, etc.)
should be provided in repeatable Title.Alternative fields (supported by six partners)
Strongly recommended/recommended:
- The Title.Alternative field should be repeatable (supported by five partners)
Not necessary:
- The main title should be provided in English (for cross-searching) and additional titles (translated title,
acronym, etc.) should be provided in repeatable Title.Alternative fields (supported by one partner)
- If no English title is provided on the server side, Renardus should provide an English version of the
title (produced by an automatic translation program)
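The title rules above (a single main title, repeatable Title.Alternative fields, both cross-searchable) can be sketched as a small record structure. The class and function names are illustrative assumptions, and the English alternative title in the usage example is an illustrative translation, not a value taken from a partner record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TitleInfo:
    title: str  # main title in the language of the resource, not repeatable
    alternatives: List[str] = field(default_factory=list)  # Title.Alternative, repeatable


def title_matches(info, term):
    """Search Title and all Title.Alternative values, ignoring case."""
    needle = term.casefold()
    return any(needle in value.casefold()
               for value in [info.title, *info.alternatives])


record = TitleInfo(
    title="Niedersächsische Staats- und Universitätsbibliothek",
    alternatives=["SUB", "State and University Library of Lower Saxony"],
)
print(title_matches(record, "library"))
```

Searching both fields together means a user querying in English can still find a resource whose main title is only given in its original language.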
1.2.1.3 Creator (2)
Currently the Creator, Contributor and Publisher (collectively called Agent elements) are being discussed within
the DC community. At the moment the proposed agent qualifiers are: Type, Name, Affiliation, Role, and
Identifier (see DC Working Draft - 10 December 1999;
http://www.mailbase.ac.uk/lists/dc-agents/files/wd-agent-qual.html).
SUB will follow the Agents discussion. Changes will be incorporated in further deliverables.
1.2.1.3.1 Creator: general (2.1)
It is strongly recommended that creator should be a repeatable field (supported by all partners).
1.2.1.3.2 Creator: rules (2.2 – 2.9)
Results of the questionnaire with regard to creator rules are:
Recommended/desirable:
- Syntax of creator should be last name, first name in one field, separated by a special character (supported by
four partners)
- Renardus should reuse existing authority files (PND – Germany, LoC authority file, other)
Not necessary:
- Cataloging rules like AACR2 (supported by two partners)
- Syntax of creator should be last name, first name in separate fields (supported by three partners)
- Renardus should provide authority files or develop a home-grown authority file
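The recommended creator syntax (last name and first name in one field, separated by a special character) can be handled as in the following sketch. The comma as separator and the function names are assumptions; the questionnaire does not fix the separator character.

```python
def parse_creator(value, separator=","):
    """Split a one-field creator value into (last name, first name)."""
    last, sep, first = value.partition(separator)
    if not sep:
        # No separator present, e.g. a corporate name: keep the value whole.
        return (value.strip(), "")
    return (last.strip(), first.strip())


def format_creator(last, first, separator=", "):
    """Build the one-field form from separate name parts."""
    return f"{last}{separator}{first}" if first else last


print(parse_creator("Neuroth, Heike"))
print(format_creator("Klaproth", "Frank"))
```

A gateway that stores last name and first name in separate fields could use `format_creator` when delivering records, so both variants discussed in the questionnaire can feed the same one-field syntax.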
1.2.1.3.3 Creator: additional information (2.10 – 2.16)
Results of the questionnaire with regard to additional information of the creator field are:
Desirable:
- Additional information should be provided in extra Renardus database fields (supported by one partner)
- Email information of creator should be provided, URL of creator (e.g. homepage) should be provided,
Organizational information of creator should be provided (each part is supported by three partners)
Not necessary:
- Additional information should be provided in one Renardus database field, separated by special characters
(supported by one partner)
- Address information of creator should be provided in form of vCard (supported by no partner)
1.2.1.4 Description (3)
1.2.1.4.1 Description: general (3.1)
It is strongly recommended that the description field is repeatable in case the description is provided in more
than one language (supported by four partners). Some subject gateways provide the description in their native
language as well as in English (e.g. NOVAGate, ZADI, FVL).
1.2.1.4.2 Description: description + keywords (3.2 – 3.5)
This part of the questionnaire was important for cross-search issues. Partners were asked how strongly they
would require subject gateways to provide descriptions and/or keywords in English.
Recommended:
- Each SG should provide either an English version of description or an English version of keywords for
every resource (besides other languages) (supported by seven partners)
Desirable:
- Each SG should provide an English version of keywords for every resource (besides other languages)
(supported by five partners)
- Each SG should provide an English version of description for every resource (besides other languages)
(supported by five partners)
Not necessary:
- Each SG should provide an English version of description and an English version of keywords for every
resource (besides other languages) (supported by four partners)
1.2.1.4.3 Description: multilinguality (3.6)
In case no English description is provided by a SG it is not necessary to have an automatic translation of the
main words of the description into English by the Renardus system, but for three of eight partners this will be
desirable in the future.
1.2.1.5 Subject (4)
This chapter summarizes results of questions related to keywords as well as classification systems.
1.2.1.5.1 Subject: keywords – general (4.1 – 4.2)
It is recommended that keywords are browsable (condition: each SG must have its own keyword index). Only
one partner rates this as not necessary; the other seven partners rate it between strongly
recommended and desirable (supported by seven partners).
It is more or less strongly recommended that the keyword field should be repeatable in case keywords (controlled
lists, thesaurus based, free keywords) are provided in several languages (supported by five partners).
1.2.1.5.2 Subject: form of keywords (4.3 – 4.7)
Strongly recommended/recommended:
- All forms of keywords (free, controlled, thesaurus based) should be provided
- The form of keywords should be indicated for the user, e.g. if he/she only wants to search for thesaurus
based keywords in his/her scientific area (supported by six partners)
- Repeatable field for each form of keywords in one language (several thesauri, controlled lists, free
keywords) (supported by four partners)
Not necessary/definitely not:
- Only controlled (home grown list and/or thesaurus based) keywords should be provided (no free keywords)
- Only thesaurus based keywords should be provided (no free keywords, no controlled lists)
1.2.1.5.3 Subject: keywords – multilinguality (4.8)
An automatic translation of keywords into English, in case no English keywords are provided by a SG, is
rated desirable by four partners, not necessary by one partner and definitely not by two partners.
In general this feature will not be necessary in Renardus.
1.2.1.5.4 Subject: keywords – rules (4.9)
Partners were asked if they use rules for keywords, e.g. for geographical names or proper names. Most partners
use thesaurus rules (DTV/NetLab, SUB, BnF, FVL, DDB). ZADI also uses special thesauri for
subjects, objects, and geographical regions.
1.2.1.5.5 Subject: classification – general (4.10 – 4.15)
Required/strongly recommended:
- The SGs should be browsable via a common classification system
- Renardus should use an existing common classification system
Strongly recommended/recommended:
- The common classification system should be DDC (all partners map their system to DDC) (supported by
six partners)
Recommended/desirable:
- Renardus should construct a common classification system
Not necessary:
- The common classification system should be a home grown one (a construction of all the partners'
classification systems)
- The common classification system should be a general classification system, other than DDC (all partners
map their system to this general system)
1.2.1.5.6 Subject: classification system - cross-search with regard to a special subject classification (4.16 – 4.20)
Recommended:
- Verbal description of each notation should be indexed together with keywords, so that users can search both
Recommended/desirable:
- Besides the common classification system, Renardus should provide subject classification systems like
MSC, Ei: cross-search via notation
- Besides the common classification system, Renardus should provide subject classification systems like
MSC, Ei: cross-search via verbal description of the notation
Desirable:
- Besides the common system, Renardus should provide all other SG specific classification systems (local,
national): cross-search via verbal description of the notation
- Besides the common classification system, Renardus should provide all other SG specific classification
systems (local, national): cross-search via notation
1.2.1.5.7 Subject: classification systems – multilinguality (4.21)
It is strongly recommended by partners that the common classification system should be provided in several
European languages.
1.2.1.6 Identifier (5)
At several Renardus meetings there were intense discussions about the handling of the identifier field, e.g.
when several URLs are provided for one resource. Some partners provide more than one URL if the resource has,
e.g., several titles in different languages. On the other hand, some partners stated that each record should have
only one unique URL according to the one-to-one principle. To reach a common view on this topic, partners had
to answer several related questions. Furthermore, there are open questions on how to handle
mirrored or copied sites.
1.2.1.6.1 Identifier: general - regarding resources in several languages (5.1 – 5.2)
Recommended:
- Repeatable if the resource is provided in more than one language with different URLs (supported by five
partners)
- If repeatable this field should also be searchable by the Renardus system (supported by six partners)
1.2.1.6.2 Identifier: general - regarding mirrored/copied resources (5.3 – 5.5)
Strongly recommended/recommended:
- If this field is repeatable it should also be searchable by the Renardus system (supported by five
partners)
Desirable:
- Repeatable in the field identifier with a special Renardus scheme
Not necessary:
- Repeatable in DC.Relation (e.g. with a special Renardus scheme) (supported by two partners)
1.2.1.6.3 Identifier: Qualifier (5.6 – 5.9)
Recommended:
- Renardus should integrate URLs, ISBNs, ISSNs in Identifier fields with different qualifiers (supported by
six partners)
- Renardus should integrate URIs, PURLs, and URNs (supported by five or six partners, depending on the identifier)
1.2.1.7 Language (6)
1.2.1.7.1 Language: general (6.1)
It is strongly recommended that the language field is repeatable in separate fields in case several languages are
provided (supported by six partners).
1.2.1.7.2 Language: code (6.2 – 6.4)
It is strongly recommended that Renardus should support the three-letter ISO 639 code (supported by six
partners). The two-letter ISO 639 code is rated not necessary (supported by four partners). There is no need to
use other codes.
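Since four partners support only the two-letter code, a conversion step to the three-letter code is likely to be needed during normalization. The sketch below shows such a conversion table. The table is a small excerpt chosen for illustration, the function names are assumptions, and the bibliographic variants of the three-letter codes (e.g. "ger" rather than "deu") are used as an assumed convention.

```python
# Excerpt of a two-letter to three-letter language code table
# (ISO 639-2 bibliographic codes; the selection is illustrative only).
TWO_TO_THREE = {
    "da": "dan",
    "de": "ger",
    "en": "eng",
    "fi": "fin",
    "fr": "fre",
    "nl": "dut",
    "sv": "swe",
}
KNOWN_THREE = set(TWO_TO_THREE.values())


def normalise_language(code):
    """Return the three-letter code for a two- or three-letter input."""
    cleaned = code.strip().lower()
    if cleaned in KNOWN_THREE:
        return cleaned
    if cleaned in TWO_TO_THREE:
        return TWO_TO_THREE[cleaned]
    raise ValueError(f"unknown language code: {code!r}")


def normalise_languages(codes):
    """Normalise a repeatable language field, one value per language."""
    return [normalise_language(code) for code in codes]


print(normalise_languages(["en", "DE", "fin"]))
```

Rejecting unknown codes instead of passing them through keeps faulty values out of the language filter.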
1.2.1.8 Country (7)
Although this element is not a Dublin Core element, partners decided to support this field. One of the open
questions was the definition of this field. The country code could reflect the country of the publisher or the
country in which the server is located. In the latter case, it would be possible for Renardus users to select or
sort hits by European country. Another possibility would be to reduce the hits returned by a search by filtering
by country, e.g. to select the nearest copy when there are duplicates of a resource.
1.2.1.8.1 Country: general (7.1 – 7.3)
Strongly recommended:
- The country code should reflect the publisher country (supported by seven partners)
Not necessary:
- The country code should reflect the server country (supported by two partners)
- Renardus should support both, publisher and server country (e.g. country with a Renardus scheme publisher
and another scheme server) (supported by two partners)
1.2.1.8.2 Country: code (7.4 – 7.5)
It is more or less strongly recommended by partners that the country code should be ISO Code 3166, two
letters (supported by six partners). There is no need to use another code.
1.2.1.9 Type (8)
Not all partners support this element, and those partners that do support it use different “controlled lists”,
some of which are based on Dublin Core. To get a common view on the handling of this field, partners were
asked several questions.
1.2.1.9.1 Type: general (8.1 – 8.5)
Recommended:
- Renardus should develop a common list of types (controlled list)
- The common list of types should be based on the Dublin Core type list (supported by five partners)
Not necessary:
- The common list of types should be a home grown one (mixture of all types of partners' SGs)
- The common list of types should be based on a type list other than Dublin Core (e.g. type list in MARC21,
in Germany: Working Group "Codes", etc. )
Five partners want to specify the common type list by "qualifiers/subcategories" like
document.theses.habilitation etc., three partners don’t want this.
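A dotted type value such as document.theses.habilitation lends itself to hierarchical filtering: a search for a broader category also finds its subcategories. The following minimal sketch assumes the dot as the only separator; the function name is hypothetical.

```python
def is_subtype(value, category):
    """True if `value` equals `category` or is one of its subcategories."""
    value_parts = value.lower().split(".")
    category_parts = category.lower().split(".")
    # Compare whole path components, so "document.thesesx" does not
    # count as a subcategory of "document.theses".
    return value_parts[:len(category_parts)] == category_parts


print(is_subtype("document.theses.habilitation", "document.theses"))
```

Gateways that only supply the unqualified type would still match the top-level category, which keeps the qualifiers optional.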
1.2.2 Future Elements
With regard to future elements, some partners expressed a fairly strong wish at the technical meeting in Bath
(11 May) to support further elements after the prototype test installation of Renardus.
1.2.2.1 Rights (9.1 – 9.7)
Recommended/Desirable:
- Renardus should support the Rights element in the future (supported by four partners)
- Renardus should support the Rights element in the sense of IPRs (SGs keep the copyright of their metadata
records after they are gathered by the broker service) (supported by four partners)
- Rights should contain information about access conditions/restrictions of the resource (e.g.
technical/software requirements, subscription information) (supported by four partners)
- Rights should contain copyright/IPR information of the resource (supported by three partners)
- Rights should be a repeatable element for different kinds of information (access conditions/restrictions,
subscription information, copyright, IPR, etc.)
- Rights should contain information about access conditions/restrictions negotiated by the SG (by the library
or institution maintaining the SG respectively) (supported by three partners)
- Renardus should use the element Rights with different qualifiers for different kinds of information
1.2.2.2 Publisher (10)
It is strongly recommended to support a publisher element in the future (five partners rate this as required,
and the field is supported by six partners).
1.2.3 Additional Elements
Some partners want to support a DC.Relation element (SUB, BnF) or a DC.Format element (SUB, DDB: there
might even be format preferences for the display of different MIME types) in the future. One
partner stated definitely not (FVL: the system will become too complicated) and one partner referred to the Bath
decision (DTV/NetLab). One partner (ZADI) mentioned a general interest in supporting additional elements in
the future.
This issue should be discussed anew after the prototype installation of the Renardus broker.
1.2.4 Administrative Elements
For the separate administrative database, Renardus needs some further administrative metadata elements.
1.2.4.1 Subject Gateway ID (IV A)
It is strongly recommended that Renardus should support an element like Subject Gateway ID with the name
and URL of the SG, so the user can search only in special gateways.
1.2.4.2 Unique Record Number (IV B)
It is recommended that Renardus should support an element like a Unique Record Number as an unambiguous
Renardus identifier.
1.2.4.3 Record Creator (IV C)
It is not necessary that Renardus should support information about the record creator (with last name, first
name, Email, organisation etc.).
1.2.4.4 SBIG ID (IV D)
It is more or less recommended that Renardus should support a SBIG ID (=Record source) with the syntax:
name of information provider/name of Subject Gateway:Internal ID of the record in the SG database. With this
SBIG ID it is possible to update a record from the SG database to the Renardus database.
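The SBIG ID syntax described above (information provider/Subject Gateway:internal record ID) can be parsed and rebuilt as sketched below. The names `SbigId`, `parse_sbig_id` and `format_sbig_id` are hypothetical, as is the example gateway name; the sketch also assumes that provider and gateway names contain neither "/" nor ":".

```python
from typing import NamedTuple


class SbigId(NamedTuple):
    provider: str   # name of the information provider
    gateway: str    # name of the Subject Gateway
    record_id: str  # internal ID of the record in the SG database


def parse_sbig_id(value):
    """Split 'provider/gateway:record_id' into its three parts."""
    # Split at the first colon, so colons inside the internal ID survive.
    source, colon, record_id = value.partition(":")
    provider, slash, gateway = source.partition("/")
    if not (colon and slash and provider and gateway and record_id):
        raise ValueError(f"not a valid SBIG ID: {value!r}")
    return SbigId(provider, gateway, record_id)


def format_sbig_id(sbig_id):
    return f"{sbig_id.provider}/{sbig_id.gateway}:{sbig_id.record_id}"


# Updating a record: the SBIG ID serves as the lookup key.
renardus_db = {}
key = parse_sbig_id("SUB/ExampleGateway:4711")  # hypothetical gateway name
renardus_db[key] = {"title": "Updated title"}
print(format_sbig_id(key))
```

Because the ID names both the source and the internal record number, an updated record from the SG database replaces exactly one entry on the Renardus side.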
1.2.4.5 Record Last Checked Date (IV E)
It is recommended or desirable that Renardus should support something like a "Record Last Checked
Date" element, which informs about the date of the last verification or update of the metadata record.
1.2.4.6 Other (IV F)
FVL stated that the participating gateways may need an administrative field that determines whether or not a
record is suitable for Renardus purposes. DDB stated that it should be considered whether there should be
separate sets representing the subject gateways, with elements describing their particular subject competences
(for instance expressed by DDC notations), thereby enabling the system to route user queries. Other
elements might cover system administrators etc.
1.3 Subject Gateways in the UK
One of the gateway initiatives associated with the Renardus project is the UK's Resource Discovery Network
(RDN). The RDN is a service funded by the Joint Information Systems Committee (JISC) of the UK higher
education funding councils with support from the Economic and Social Research Council (ESRC) and the Arts
and Humanities Research Board (AHRB). The RDN builds upon the experiences of the subject gateway activity
carried out under the JISC's Electronic Libraries (eLib) Programme.
The RDN provides resource discovery services through a network of Internet information gateways that are
clustered together in subject-based 'hubs' (see chapter 1.3.2). These are co-ordinated by a team based in the
JISC's DNER Office at King's College London and at UKOLN. The hubs are essentially independent service
providers who provide one or more Internet resource catalogues or gateways that can be accessed at a variety of
levels. In addition, hubs have also developed, and linked to, a wide range of other information and related
services (Dempsey, 2000, p. 19).
Furthermore, in the context of the JISC's concept of a Distributed National Electronic Resource (DNER), the
RDN hubs are being encouraged to provide additional service layers, brokering access to heterogeneous services
through protocols like Z39.50. These services are referred to as DNER Portals. Dempsey (2000, p. 19) has said,
in this context, that "the 'subject gateway' or resource catalogue is one component in a network of
communicating services which may be assembled to meet particular business and user needs."
In the RDN context, the contents of gateways can be accessed at a variety of levels:
- Individual gateways or Internet resource catalogues. Where hubs are comprised of more than one gateway,
each will have its own Web interface. For example, the BIOME hub, which covers subjects in the health
and life sciences, is made up of five distinct gateways. Each one has its own interface that allows searching
and browsing within that particular gateway.
- Hubs. Each RDN 'hub' will have an interface that allows for all of its component Internet resource
catalogues to be searched (and possibly browsed) together. For more information on RDN hubs, see chapter
1.3.2.
- The RDN. The RDN is responsible for providing an interface to all of the services developed by hubs,
including services that will be able to cross-search through the ResourceFinder all of the Internet resource
catalogues developed by RDN hubs.
The RDN hubs are independent service providers. They can (and do) use a wide variety of different software
types and metadata formats. In order to support the central services that are offered by the RDN, it is strongly
recommended that hubs are able to provide a minimum set of metadata that - as currently defined - is a sub-set
of the Dublin Core elements. The six elements (Title, Subject, Description, Type, Identifier and Language) are
defined (with brief content rules) in the RDN Cataloguing Guidelines (Day and Cliff, 2000).
In this distributed scenario, it is unlikely that all RDN hubs would have a common single view of the Renardus
data model. As new hubs (and Internet resource catalogues) become part of the RDN, it is possible that there
could be even more diversity.
1.3.1 RDN
Michael Day from the RDN support team at UKOLN filled in the D6.4 questionnaire. He pointed out (in an
e-mail of 14 August) that the answers/comments on the questionnaire were mainly his own, but were in part based
on the RDN Cataloguing Guidelines and other internal discussions. "Because the RDN is a federation of a
number of gateways it is difficult to say whether RDN "supports" anything specific in the questionnaire, now or
in the future. It is likely that parts of RDN will support some things, while the RDN as a whole may not. For
example, ROADS gateways can record v-card-type information about creators or administrators, but the RDN
ResourceFinder will not be able to search this. On the other hand, both the RDN and gateways will be certainly
interested in things like developing a common classification system for cross-browsing. Many of the replies are
fairly neutral ('desirable' or 'not necessary') because they are issues that have not been widely considered in an
RDN context, e.g. the repeatability of some fields, descriptions in multiple languages, etc. Also, RDN allows
gateways to do much their own thing and they do. Some (e.g. SOSIG) are based on ROADS, others (EEVL, the
new OMNI) are not. Some use ROADS templates, others use something more DC-like. The RDN mandatory
elements (Title, Subject, Description, Type, Identifier (URI), Language) are based on a subset of DC."
The RDN Cataloguing Guidelines define content rules for all fifteen DCMES elements. Definitions were taken
from the Reference Description of DCMES version 1.1. Schemes are used in four of the six 'minimum set'
elements.
- Title. No particular scheme is defined in the guidelines, although AACR2 practice with regard to
capitalisation and punctuation is recommended.
- Subject. The guidelines do not mandate the use of any particular subject scheme, but if a scheme is used, a
shortened version of the scheme should be added as a value qualifier.
- Description. No particular scheme is defined in the guidelines.
- Type. The guidelines suggest that resource type should be taken from either the draft list of Dublin Core
Types (Dublin Core Type Working Group, 1999) or the list of types defined by the RDN (Cliff, 2000).
- Identifier. If no value qualifier is present, the identifier must be a URI.
- Language. This should be a language code either based on the three letter codes defined in ISO 639-2:1998
or the two letter codes recommended by RFC 1766. If required, RDN may need to provide some conversion
tools to map between the two schemes.
All RDN Internet resource catalogues should be able to provide records broadly in accordance with these
general guidelines. They would be able, therefore, to support most of the eight elements defined in the Renardus
data model.
http://www.rdn.ac.uk/
1.3.2 Individual RDN hubs
The RDN does not specify the software and metadata formats in use by each of the hubs. Most use their own
metadata formats, although these tend to have some kind of relationship with ROADS/IAFA templates or the
DCMES. The following sections attempt to explain the metadata formats in use within each of the RDN's
current hubs, to note its relationship with the 'minimum set' recommended by the RDN itself, and to note content
standards in use where these have been published.
1.3.2.1 BIOME
The BIOME health and life sciences hub is currently made up of five separate gateways that cover health and
medicine (OMNI), animal health (VetGate), biological and biomedical science (BioResearch), the natural world
(Natural Selection) and agriculture, food and forestry (AgriFor). A new gateway for nursing, midwifery and
allied health professions (NMAHP) will soon be added. BIOME provides its own cataloguing rules based on the
RTNG resource description template structure (Gray, 2000). These include versions of all six of the RDN's
'minimum set' of elements ('Title', 'Add subject descriptor', 'Add keywords', 'Description', 'Category', 'Main URI'
and 'Main Language'), but also an element ('UK based') that will indicate whether the resource being described
is based in the UK.
- The type element ('Category') uses a scheme defined by BIOME.
- The language element ('Main Language') is left blank if English is the main language. Other languages are
entered according to the MARC three letter language code (based on ISO 639-2:1998).
- For the subject classification element ('Add subject descriptor'), the National Library of Medicine and the
Library of Congress classification schemes are used in OMNI, NMAHP, VetGate and BioResearch; the
Dewey Decimal Classification (DDC) scheme in AgriFor and Natural Selection. Controlled vocabulary
schemes ('Add keyword') in use within BIOME include Medical Subject Headings (MeSH) for OMNI and
BioResearch, MeSH and the RCN (Royal College of Nursing) thesaurus for NMAHP, the CAB thesaurus
for AgriFor and VetGate, and Library of Congress Subject Headings (LCSH) for Natural Selection.
http://www.biome.ac.uk/
1.3.2.2 EEVL
EEVL (the Edinburgh Engineering Virtual Library) is currently the RDN service that covers engineering. EEVL
uses its own metadata format of 22 attributes that includes five of the RDN 'minimum set' of elements ('Title',
'Classification', 'Description', 'Resource type' and 'URL'); i.e., all elements except 'Language' (MacLeod, Kerr
and Guyon, 1998, pp. 209-210). The subject classification scheme adopted by EEVL is an in-house scheme that
is loosely based on the Ei Classification Scheme developed by Engineering Information Inc. EEVL is part of a
hub that will expand to cover the mathematical sciences (MathGate) and computing (Computing). The
MathGate and Computing gateways are still under development.
http://www.eevl.ac.uk/
1.3.2.3 Humbul
The Humbul service covers the arts and humanities. The gateway has developed its own software and uses an
element set based on the Dublin Core. The service publishes some draft cataloguing guidelines, Describing and
cataloguing resources in Humbul, that are broadly based on the RDN guidelines and AACR2 (Humbul, 2000).
Versions of all the RDN 'minimum set' of elements are 'required' elements, as are several other elements,
including 'Author' and 'Publisher'. The main subject scheme in use is the Library of Congress Subject Headings
(LCSH). Types are defined using the draft list of Dublin Core Types, the RDN-defined list of types and an
additional set of types defined by Humbul itself. The 'Language' element uses the three letter code defined in
ISO 639-2:1998.
http://www.humbul.ac.uk/
1.3.2.4 PSIgate
The PSIgate hub will cover the physical sciences. The service is still under development.
http://www.psigate.ac.uk/
1.3.2.5 SOSIG
The SOSIG service covers the social sciences, business and law. The gateway uses the ROADS software, and
resources are described using ROADS/IAFA templates. These include equivalents of all RDN 'minimum set'
elements ('Title', 'Subject-Descriptor'/'Subject-Descriptor-Scheme', 'Description', 'Category', 'URI' and
'Language'). The browse structure is based on the Universal Decimal Classification (UDC). A thesaurus
searching option is also available which uses a thesaurus derived from HASSET (the Humanities And Social
Sciences Electronic Thesaurus).
http://www.sosig.ac.uk/
2 DATA MODEL AND DATA FLOW
Very early in the discussion of a Renardus data model it was clear that the data model should be based on
Dublin Core as far as possible. Only one Renardus element, Country, is neither a DC element nor a DC based
element. All other elements and qualifiers (element refinement and value encoding scheme) are based on
Dublin Core where possible. Where no encoding scheme or refinement from Dublin Core can be used, a
Renardus qualifier is defined. It is also part of this workpackage to develop a Renardus namespace with a
defined Renardus Metadata Element Set (RMES). The final Renardus application profile will be ready in June
2001 (the public deliverable D6.5).
The Renardus broker will consist of the content databases (decentralised: Z39.50), which hold the agreed eight
elements and two administrative elements, and the Collection Level Description database.
In this report the content database is at first based on the data model for the prototype Renardus pilot system
(see 2.1) and later, after the test installation of the prototype, on the preliminary version of the data model for
the operational Renardus pilot system (see 2.2). The content database will contain the metadata records extracted
from the individual Service Providers' databases in accordance with the Renardus data model.
The Collection Level Description database will contain information on collection description of each subject
gateway and the mapping tables (e.g. for DDC, probably also for Language, Type, or Country code) (see 2.3).
Cross-search, cross-browse and filter issues:
The main basic index will allow a search across the elements Title, Description and Subject. It is therefore
necessary, firstly, that the Subject Gateways provide free text in the description field (and not, e.g., only a
URL) and, secondly, that the Subject Gateways deliver some kind of subject information. Whether DDC captions
will also be included in the basic index is still an open question.
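As a minimal sketch of what such a basic index could look like, the following Python fragment builds a simple inverted index over the Title, Description and Subject values of normalized records. The record layout (plain dictionaries keyed by element name), the tokenizer and the function names are assumptions made for illustration only; the report does not prescribe how the broker builds its index.

```python
import re
from collections import defaultdict

# Elements searched by the basic index, as named in the text above.
BASIC_INDEX_ELEMENTS = ("Title", "Description", "Subject")


def tokenize(text):
    """Lower-case a string and split it into word tokens (illustrative rule)."""
    return re.findall(r"\w+", text.lower())


def build_basic_index(records):
    """Map each token to the positions of the records whose Title,
    Description or Subject contains it."""
    index = defaultdict(set)
    for pos, record in enumerate(records):
        for element in BASIC_INDEX_ELEMENTS:
            values = record.get(element, [])
            if isinstance(values, str):  # accept single or repeated values
                values = [values]
            for value in values:
                for token in tokenize(value):
                    index[token].add(pos)
    return index


def search(index, query):
    """Return the positions of records that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

A record whose description consists only of a URL contributes nothing useful to such an index, which is why free text is asked for.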
The cross-browsing structure will be realized through a mapping of each partner’s classification system to the
Dewey Decimal Classification (DDC). The DDC element is mandatory.
The elements Country, Language, and Type make some filter processes possible. Together with the element
Creator, these elements could also be displayed in the result list.
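The filter step can be sketched as follows. This is an illustration under stated assumptions, not the broker's actual behaviour: records are assumed to be dictionaries keyed by element name, and the function names `matches_filters` and `result_line` are hypothetical.

```python
def matches_filters(record, country=None, language=None, resource_type=None):
    """Return True if the record passes every filter that is set."""
    wanted = {"Country": country, "Language": language, "Type": resource_type}
    for element, value in wanted.items():
        if value is None:  # this filter is not active
            continue
        present = record.get(element, [])
        if isinstance(present, str):
            present = [present]
        if value not in present:
            return False
    return True


def result_line(record):
    """Build one result-list line: the title followed by Creator, Country,
    Language and Type where the record provides them."""
    parts = [record.get("Title", "")]
    for element in ("Creator", "Country", "Language", "Type"):
        values = record.get(element, [])
        if isinstance(values, str):
            values = [values]
        if values:
            parts.append(f"{element}: {', '.join(values)}")
    return " | ".join(parts)
```

Records that lack one of the filter elements simply drop out when that filter is active, which is one reason why support for these elements is recommended to all partners.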
Upgrade priority for partners’ metadata information:
If keyword is not yet an element in a partner’s data model, the first recommendation for the normalization
process is to provide the keyword element.
In future the title will have to be provided in its original version; other forms of the title can be
given in the Title.Alternative field. It is still undecided whether an English version of the title will be
required in future, either in the Title field or in the Title.Alternative field.
Considering that there should be an element which all partners support, it is further recommended that all
partners support the country element. It seems easier to extract the country code from the domain of a URL than
to support a language code.
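Extracting a country code from the domain of a URL could be sketched as below. This is a heuristic under stated assumptions: the function name, the list of generic top-level domains and the exception table are illustrative, and a two-letter top-level domain is only a hint at the publisher's country, not a guarantee.

```python
from urllib.parse import urlparse

# Generic top-level domains carry no country information (illustrative subset).
GENERIC_TLDS = {"com", "org", "net", "edu", "gov", "int", "mil"}
# Country-code top-level domains that differ from the ISO 3166-1 two-letter code.
TLD_EXCEPTIONS = {"uk": "GB"}


def country_from_url(url):
    """Return an upper-case two-letter country code guessed from the URL's
    top-level domain, or None if no guess is possible.
    The URL must include a scheme such as http://."""
    host = urlparse(url).hostname
    if not host or "." not in host:
        return None
    tld = host.rsplit(".", 1)[1].lower()
    if tld in GENERIC_TLDS or len(tld) != 2 or not tld.isalpha():
        return None
    return TLD_EXCEPTIONS.get(tld, tld.upper())
```

For example, `country_from_url("http://www.sosig.ac.uk/")` yields `"GB"`, while a URL under a generic domain such as `.org` yields `None` and would need a manually assigned country.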
In conclusion, if partners have to upgrade their metadata information, it is strongly recommended to include
keywords first, then country, followed by type and language.
2.1 Data model for the prototype Renardus pilot system
The data model is mainly based on two Dublin Core documents:
[DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description,
http://purl.oclc.org/dc/documents/rec-dces-19990702.htm
[DCMES Qualifiers (2000-07-11)] Dublin Core Qualifiers,
http://purl.org/dc/documents/rec/dcmes-qualifiers-20000711.htm
Format of entries:
Name Name of Metadata field
Qualified DC name Qualified Dublin Core name
Namespace DCMES version 1.1, DCMES Qualifiers (2000-07-11) or
Renardus Metadata Element Set = RMES version 0.1
Refinement(s) Element Refinements used in Renardus: These qualifiers make the meaning of an
element narrower or more specific. A refined element shares the meaning of the
unqualified element, but with a more restricted scope
DC Encoding Scheme(s) These qualifiers identify schemes that aid in the interpretation of an element value.
These schemes include controlled vocabularies and formal notations or parsing
rules. A value expressed using an encoding scheme will thus be a token selected
from a controlled vocabulary (e.g., a term from a classification system or set of
subject headings) or a string formatted in accordance with a formal notation (e.g.,
"2000-01-01" as the standard expression of a date). If an encoding scheme is not
understood by a client or agent, the value may still be useful to a human reader
R Encoding Scheme(s) Renardus encoding scheme, see above
Form of Obligation In the Renardus data model the obligation can be: mandatory (M), strongly
recommended (R) or optional (O). Mandatory ensures that some of the elements are
always supported. An element with a mandatory obligation must have a value. The
strongly recommended and the optional elements should be filled with a value if the
information is appropriate to the given resource or provided by a Subject Gateway,
but if not, they can be left blank.
Repeatable Metadata field is repeatable: yes or no
LQ "LANG" Language Qualifier "LANG": to give information about the language of the content
of a metadata field (ISO Code 639, two letter), yes, no, or possible
DC Definition Dublin Core Definition of metadata field
DC Comment Dublin Core comments to this metadata field
R Definition Renardus definition of metadata field
R Comment Renardus comments to this metadata field
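The entry format above can be mirrored in a small data structure, which also makes the obligation and repeatability rules checkable. The sketch below is an assumption-laden illustration: the class name, the field names and the record layout (dictionaries keyed by element name) are hypothetical, while the three sample definitions take their values from the prototype tables in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementDefinition:
    name: str
    qualified_dc_name: Optional[str]  # None for non Dublin Core elements
    namespace: str
    obligation: str       # "M", "R" or "O"
    repeatable: bool
    lang_qualifier: str   # "yes", "no" or "possible"


PROTOTYPE_ELEMENTS = [
    ElementDefinition("Title", "DC.Title", "DCMES version 1.1", "M", False, "possible"),
    ElementDefinition("Creator", "DC.Creator", "DCMES version 1.1", "R", True, "no"),
    ElementDefinition("Description", "DC.Description", "DCMES version 1.1", "M", True, "possible"),
]


def check_record(record, definitions):
    """Return a list of problems: mandatory elements without a value and
    non-repeatable elements that occur more than once."""
    problems = []
    for definition in definitions:
        values = record.get(definition.name, [])
        if isinstance(values, str):
            values = [values]
        values = [v for v in values if v.strip()]  # blank strings count as empty
        if definition.obligation == "M" and not values:
            problems.append(f"{definition.name}: mandatory element has no value")
        if not definition.repeatable and len(values) > 1:
            problems.append(f"{definition.name}: element is not repeatable")
    return problems
```

Strongly recommended (R) and optional (O) elements are deliberately not reported when empty, since the data model allows them to be left blank.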
2.1.1 Dublin Core Elements
2.1.1.1 DC.Title and DC.Title.Alternative
Name Title
Qualified DC name DC.Title
Namespace DCMES version 1.1
Refinement(s) Alternative
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation M
Repeatable no
LQ "LANG" possible
DC Definition A name given to the resource
DC Comment Typically, a title will be a name by which the resource is formally known
R Definition -
R Comment -
Name Title ¦ Alternative
Qualified DC name DC.Title.Alternative
Namespace DCMES Qualifiers (2000-07-11)
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation O
Repeatable yes
LQ "LANG" possible
DC Definition Any form of the title used as a substitute or alternative to the formal title of the
resource
DC Comment This qualifier can include Title abbreviations as well as translations
R Definition -
R Comment -
2.1.1.2 DC.Creator
Name Creator
Qualified DC name DC.Creator
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) -
R Encoding Scheme(s) For personal names: last name and first name in separate tags
Obligation R
Repeatable yes
LQ "LANG" no
DC Definition An entity primarily responsible for making the content of the resource.
DC Comment Examples of a Creator include a person, an organisation, or a service. Typically, the
name of a Creator should be used to indicate the entity.
R Definition Creators are the persons who are responsible for the intellectual content of the
document(s); webmasters, for example, are not creators.
R Comment If this field is applicable it is strongly recommended to provide the creator.
For the Renardus normalization process it is strongly recommended that last name and
first name are clearly distinguishable.
2.1.1.3 DC.Description
Name Description
Qualified DC name DC.Description
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation M
Repeatable yes
LQ "LANG" possible
DC Definition An account of the content of the resource.
DC Comment Description may include but is not limited to: an abstract, table of contents,
reference to a graphical representation of content or a free-text account of the
content.
R Definition -
R Comment For the Renardus normalization process it is not enough to provide only a URL; for
cross-search reasons the description field must contain free text.
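A normalization step could test this requirement along the following lines. The sketch is illustrative only: the function name, the URL pattern and the rule "at least one run of two letters outside any URL" are assumptions, not part of the Renardus specification.

```python
import re

# Matches http, https and ftp URLs up to the next whitespace (illustrative pattern).
URL_PATTERN = re.compile(r"(?:https?|ftp)://\S+", re.IGNORECASE)


def has_free_text(description):
    """Return True if the description still contains words after all URLs
    have been removed."""
    remainder = URL_PATTERN.sub(" ", description)
    # Look for at least two consecutive letters (digits and underscores excluded).
    return re.search(r"[^\W\d_]{2,}", remainder) is not None
```

A description such as `"http://www.sosig.ac.uk/"` fails this check, whereas a sentence that merely mentions a URL passes.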
2.1.1.4 DC.Subject: classification system(s) and keywords
Name Subject
Qualified DC name DC.Subject
Namespace DCMES Qualifiers (2000-07-11) and RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) LCSH, MeSH, DDC, LCC, UDC
R Encoding Scheme(s) all other encoding schemes used by the partners
Obligation M
Repeatable yes
LQ "LANG" possible
DC Definition The topic of the content of the resource.
DC Comment Typically, a subject will be expressed as keywords, key phrases or classification
codes that describe a topic of the resource. Recommended best practice is to select a
value from a controlled vocabulary or formal classification scheme.
R Definition -
R Comment This is the place for all subject information used by partners, such as controlled
keywords, free keywords, classification system(s) and/or captions. In the prototype
system there will be no further distinction between the several kinds of subject.
In the prototype system the provision of keywords is strongly recommended; in the
final system the provision of keywords is required.
Name Subject ¦ DDC
Qualified DC name DC.Subject
Namespace DCMES Qualifiers (2000-07-11) and RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) DDC
R Encoding Scheme(s) Ren-DDC for normalization; DDC 21 can be extended by Renardus-specific
captions
Obligation M
Repeatable yes
LQ "LANG" no
DC Definition Dewey Decimal Classification, see also: http://www.oclc.org/dewey/index.htm
DC Comment -
R Definition DDC 21: adapted DDC version for cross-browsing purposes.
R Comment This field is created in the Renardus normalization process via mapping tables from
the particular Subject Gateway classification scheme. Each partner has to map its
own classification system to DDC. A mapping guideline for DDC will be prepared in
the context of WP 7.
Only captions and not notations will be displayed.
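The mapping-table step described in the comment above can be sketched as a simple lookup. Everything named here is hypothetical: the gateway identifiers `GATEWAY-A` and `GATEWAY-B`, the local class labels and the function names are invented for illustration, and real mapping tables will be defined by the partners following the WP 7 guideline.

```python
# Hypothetical mapping table: (SBIG ID, local class) -> (DDC notation, DDC caption)
DDC_MAPPING = {
    ("GATEWAY-A", "Law"): ("340", "Law"),
    ("GATEWAY-A", "Economics"): ("330", "Economics"),
    ("GATEWAY-B", "51"): ("510", "Mathematics"),
}


def map_to_ddc(sbig_id, local_classes, mapping=DDC_MAPPING):
    """Map a gateway's local classes to DDC.
    Returns (mapped, unmapped): a duplicate-free list of (notation, caption)
    pairs and the list of local classes that have no mapping yet."""
    mapped, unmapped = [], []
    for local in local_classes:
        target = mapping.get((sbig_id, local))
        if target is None:
            unmapped.append(local)
        elif target not in mapped:
            mapped.append(target)
    return mapped, unmapped


def display_captions(mapped):
    """Only captions, not notations, are shown to the user."""
    return [caption for _notation, caption in mapped]
```

Since the DDC element is mandatory, a record whose local classes all end up in `unmapped` would need attention before it can take part in cross-browsing.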
2.1.1.5 DC.Identifier
Name Identifier
Qualified DC name DC.Identifier
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) URI
R Encoding Scheme(s) -
Obligation M
Repeatable yes, for translated sites and/or mirrored, copied sites
LQ "LANG" no
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or
number conforming to a formal identification system. Example formal identification
systems include the Uniform Resource Identifier (URI) (including the Uniform
Resource Locator (URL)), the Digital Object Identifier (DOI) and the International
Standard Book Number (ISBN).
R Definition -
R Comment URI means URL, URN, DOI, ISBN, ISSN etc. For the Renardus normalization process
DOI, ISBN and ISSN must be displayed in a URN syntax.
In the prototype system no distinction will be made between the resource URL,
URL(s) of mirrored or copied resources and URL(s) for archive reasons.
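The report does not spell out the exact URN form it expects, so the following sketch assumes a plain `urn:<scheme>:<value>` layout. The function names and the rule for recognising prefixed strings such as `ISBN 0-...` are likewise assumptions made for illustration.

```python
def to_urn(scheme, value):
    """Express a DOI, ISBN or ISSN in the assumed urn:<scheme>:<value> form."""
    scheme = scheme.strip().lower()
    value = value.strip()
    if scheme not in ("doi", "isbn", "issn"):
        raise ValueError(f"unsupported identifier scheme: {scheme}")
    if not value:
        raise ValueError("empty identifier value")
    return f"urn:{scheme}:{value}"


def normalize_identifier(identifier):
    """Leave URLs and URNs untouched; rewrite strings that start with
    DOI, ISBN or ISSN into the URN form. Other strings are returned as-is."""
    text = identifier.strip()
    lowered = text.lower()
    if lowered.startswith(("http://", "https://", "ftp://", "urn:")):
        return text
    for scheme in ("doi", "isbn", "issn"):
        if lowered.startswith(scheme):
            rest = text[len(scheme):].lstrip(" :")  # drop separator after the prefix
            return to_urn(scheme, rest)
    return text
```

Under these assumptions, `normalize_identifier("ISBN 0-395-36341-1")` gives `"urn:isbn:0-395-36341-1"`.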
2.1.1.6 DC.Language
Name Language
Qualified DC name DC.Language
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) ISO 639-2
R Encoding Scheme(s) -
Obligation R
Repeatable yes
LQ "LANG" -
DC Definition A language of the intellectual content of the resource.
DC Comment Recommended best practice for the values of the Language element is defined by
RFC 1766 which includes a two-letter Language Code (taken from the ISO 639
standard), followed optionally, by a two-letter Country Code (taken from the ISO
3166 standard). For example, en for English, fr for French, or en-uk for English used
in the United Kingdom
R Definition -
R Comment The language code is the ISO 639-2 three-letter code. SUB will provide a mapping
between the two-letter and three-letter language codes; this can also be found on
the LoC site – ISO 639-2: http://lcweb.loc.gov/standards/iso639-2/englangn.html
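A mapping of this kind might look like the sketch below. The table is a small subset chosen for illustration, and the choice of the bibliographic variants of ISO 639-2 (e.g. `ger` rather than `deu`) is an assumption; the mapping provided by SUB is authoritative.

```python
# Subset of two-letter (ISO 639) to three-letter (ISO 639-2, bibliographic) codes.
ISO639_1_TO_2B = {
    "da": "dan", "de": "ger", "en": "eng", "fi": "fin",
    "fr": "fre", "nl": "dut", "sv": "swe",
}


def to_iso639_2(code):
    """Return the three-letter code for a language tag, or None if unknown.
    An RFC 1766 style country subtag (as in "en-uk") is ignored."""
    code = code.strip().lower()
    primary = code.split("-", 1)[0]
    if len(primary) == 3 and primary in ISO639_1_TO_2B.values():
        return primary  # already a known three-letter code
    return ISO639_1_TO_2B.get(primary)
```

Codes that return `None` would have to be added to the table or resolved by hand.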
2.1.1.7 DC.Type
Name Type ¦ DCMI Type (DCT1)
Qualified DC name DC.Type
Namespace DCMES Qualifiers (2000-07-11)
Refinement(s) -
DC Encoding Scheme(s) DCMI Type Vocabulary (DCT1)
R Encoding Scheme(s) -
Obligation R
Repeatable yes
LQ "LANG" no
DC Definition The nature or genre of the content of the resource.
DC Comment Type includes terms describing general categories, functions, genres, or aggregation
levels for content. Recommended best practice is to select a value from a controlled
vocabulary (for example, the list of DCMI Types). To describe the physical or
digital manifestation of the resource, use the Format element.
R Definition -
R Comment SUB will provide a mapping of all types used in partners’ subject gateways to DCT1
(probably except for ZADI). The feasibility and usability of a mapping to DCT2 will
be investigated in the context of WP 7.
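The type mapping could be sketched as a lookup that keeps the gateway's original type next to the mapped DCT1 term. The local type labels below are invented examples, and the DCT1 term list reflects the vocabulary as published in 2000 to the best of our knowledge; it should be checked against the official list before use.

```python
# DCMI Type Vocabulary (DCT1) terms; verify against the published vocabulary.
DCT1_TERMS = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "Service", "Software", "Sound", "Text",
}

# Hypothetical local types of a subject gateway and their assumed DCT1 targets.
LOCAL_TYPE_TO_DCT1 = {
    "mailing list": "Service",
    "electronic journal": "Text",
    "database": "Dataset",
    "software archive": "Software",
}


def map_type(local_type):
    """Return (original type, DCT1 term or None if no mapping is defined)."""
    term = LOCAL_TYPE_TO_DCT1.get(local_type.strip().lower())
    if term is not None and term not in DCT1_TERMS:
        raise ValueError(f"mapping target {term!r} is not a DCT1 term")
    return local_type, term
```

Returning both values fits the data model, which carries the DCT1-encoded type and the gateway's original type as separate Type fields.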
Name Type
Qualified DC name DC.Type
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) -
R Encoding Scheme(s) -
Obligation R
Repeatable yes
LQ "LANG" no
DC Definition The nature or genre of the content of the resource.
DC Comment Type includes terms describing general categories, functions, genres, or aggregation
levels for content. Recommended best practice is to select a value from a controlled
vocabulary (for example, the list of DCMI Types). To describe the physical or
digital manifestation of the resource, use the Format element.
R Definition -
R Comment Subject Gateways should provide their original types without an encoding scheme.
2.1.2 Non Dublin Core element
2.1.2.1 Country
Name Country
Qualified DC name -
Namespace RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) -
R Encoding Scheme(s) ISO 3166-1 (two letter code)
http://www.din.de/gremien/nas/nabd/iso3166ma/
Obligation R
Repeatable no
LQ "LANG" no
DC Definition -
DC Comment -
R Definition Country in which the publisher of the resource is located or the country which
represents the cultural context of the resource. Code for the representation of names
of countries.
R Comment -
2.1.3 Administrative Renardus elements
Two administrative elements are used in Renardus for practical reasons: “Full Record URL” and “SBIG ID”.
2.1.3.1 Full Record URL
Name Full Record URL
Qualified DC name -
Namespace RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) -
R Encoding Scheme(s) URL
Obligation R
Repeatable no
LQ "LANG" no
DC Definition -
DC Comment -
R Definition A URL that leads to a detailed display of each record at the originating service site.
R Comment Because some partners generate their records dynamically it might be a problem to
provide a URL to the full record display.
2.1.3.2 SBIG ID
Name SBIG ID
Qualified DC name -
Namespace RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) -
R Encoding Scheme(s) Acronym of Subject Gateway
Obligation M
Repeatable no
LQ "LANG" no
DC Definition -
DC Comment -
R Definition A stable, unique acronym that is also defined in the Collection Level Description.
R Comment Must be the same acronym as used in the Renardus Collection Level Description
schema field “Acronym”.
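The consistency rule between the SBIG ID and the Collection Level Description can be checked mechanically. In the sketch below the record layout, the function name and the idea of passing the known acronyms as a set are assumptions; how the acronyms are actually read from the CLD database is outside its scope.

```python
def check_sbig_ids(records, cld_acronyms):
    """Return (position, problem) pairs for records whose SBIG ID is missing
    or does not exactly match an acronym from the Collection Level Description."""
    problems = []
    for pos, record in enumerate(records):
        sbig_id = record.get("SBIG ID")
        if not sbig_id:
            problems.append((pos, "missing SBIG ID"))
        elif sbig_id not in cld_acronyms:
            problems.append((pos, f"SBIG ID {sbig_id!r} not found in CLD"))
    return problems
```

The comparison is deliberately case-sensitive, since the acronym is required to be the same as in the CLD field “Acronym”.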
2.2 Preliminary version of data model for the operational Renardus pilot system
This data model reflects the current status of discussion. It is likely that there will be some changes, e.g. with
regard to the obligation of an element or further qualifiers, and some additions in future, e.g. support for
further elements such as publisher, rights, format and relation, as well as some more comments.
In contrast to the data model for the prototype system, this preliminary data model contains further qualifiers,
some more language tags for the elements and some changes in the obligation of an element.
The data model is mainly based on two Dublin Core documents:
[DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description,
http://purl.oclc.org/dc/documents/rec-dces-19990702.htm
[DCMES Qualifiers (2000-07-11)] Dublin Core Qualifiers,
http://purl.org/dc/documents/rec/dcmes-qualifiers-20000711.htm
Format of entries:
Name Name of Metadata field
Qualified DC name Qualified Dublin Core name
Namespace DCMES version 1.1, DCMES Qualifiers (2000-07-11) or
Renardus Metadata Element Set = RMES version 0.1
Refinement(s) Element Refinements used in Renardus: These qualifiers make the meaning of an
element narrower or more specific. A refined element shares the meaning of the
unqualified element, but with a more restricted scope
DC Encoding Scheme(s) These qualifiers identify schemes that aid in the interpretation of an element value.
These schemes include controlled vocabularies and formal notations or parsing
rules. A value expressed using an encoding scheme will thus be a token selected
from a controlled vocabulary (e.g., a term from a classification system or set of
subject headings) or a string formatted in accordance with a formal notation (e.g.,
"2000-01-01" as the standard expression of a date). If an encoding scheme is not
understood by a client or agent, the value may still be useful to a human reader
R Encoding Scheme(s) Renardus encoding scheme, see above
Form of Obligation In the Renardus data model the obligation can be: mandatory (M), strongly
recommended (R) or optional (O). Mandatory ensures that some of the elements are
always supported. An element with a mandatory obligation must have a value. The
strongly recommended and the optional elements should be filled with a value if the
information is appropriate to the given resource or provided by a Subject Gateway,
but if not, they can be left blank.
Repeatable Metadata field is repeatable: yes or no
LQ "LANG" Language Qualifier "LANG": to give information about the language of the content
of a metadata field (ISO Code 639, two letter), yes or no
DC Definition Dublin Core Definition of metadata field
DC Comment Dublin Core comments to this metadata field
R Definition Renardus definition of metadata field
R Comment Renardus comments to this metadata field
2.2.1 Dublin Core Elements
2.2.1.1 DC.Title and DC.Title.Alternative
Name Title
Qualified DC name DC.Title
Namespace DCMES version 1.1
Refinement(s) Alternative
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation M
Repeatable no
LQ "LANG" yes
DC Definition A name given to the resource
DC Comment Typically, a title will be a name by which the resource is formally known
R Definition Title should be the original title; other forms of title should be provided in the
Title.Alternative field.
R Comment It is strongly recommended to provide only one version of the title in this field (and not
also, e.g., translated titles).
Name Title ¦ Alternative
Qualified DC name DC.Title.Alternative
Namespace DCMES Qualifiers (2000-07-11)
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation O
Repeatable yes
LQ "LANG" yes
DC Definition Any form of the title used as a substitute or alternative to the formal title of the
resource
DC Comment This qualifier can include Title abbreviations as well as translations
R Definition -
R Comment -
2.2.1.2 DC.Creator and DC.Creator.AdditionalInformation
Name Creator
Qualified DC name DC.Creator
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) For personal names: last name, first name in separate tags
Obligation R
Repeatable yes
LQ "LANG" no
DC Definition An entity primarily responsible for making the content of the resource.
DC Comment Examples of a Creator include a person, an organisation, or a service. Typically, the
name of a Creator should be used to indicate the entity.
R Definition Creators are the persons who are responsible for the intellectual content of the
document(s); webmasters, for example, are not creators.
R Comment If this field is applicable it is strongly recommended to provide the creator.
For the Renardus normalization process it is strongly recommended that last name and
first name are clearly distinguishable.
It is not yet clear whether the Renardus data model will support the refinement “Additional Information” of
creator. This also depends on the agent discussion within Dublin Core and on how DC will support this kind of
information in future.
- Formally, each kind of “Additional Information” (such as Email, URL and Organizational Information) requires
a separate definition table sheet -
Name Creator ¦ AdditionalInformation
Qualified DC name (see Agent discussion:
http://www.mailbase.ac.uk/lists/dc-agents/files/wd-agent-qual.html)
Namespace RMES version 0.1
Refinement(s) RMES version 0.1 (for Additional Information)
DC Encoding Scheme(s) (see Agent discussion:
http://www.mailbase.ac.uk/lists/dc-agents/files/wd-agent-qual.html)
R Encoding Scheme(s) Email, URL, OrgInf
Obligation O
Repeatable yes
LQ "LANG" no
DC Definition -
DC Comment -
R Definition Additional information about the creator, such as Email, URL or Organisational
Information.
R Comment -
2.2.1.3 DC.Description
Name Description
Qualified DC name DC.Description
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation M
Repeatable yes
LQ "LANG" yes
DC Definition An account of the content of the resource.
DC Comment Description may include but is not limited to: an abstract, table of contents,
reference to a graphical representation of content or a free-text account of the
content.
R Definition -
R Comment For the Renardus normalization process it is not enough to provide only a URL; for
cross-search reasons the description field must contain free text.
Strongly recommended: Each SG should provide either an English version of the
description or an English version of the keywords for every resource (besides other
languages).
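With language qualifiers on Description and Subject, this recommendation can be checked per record. The sketch assumes that each value is stored as a `(text, lang)` pair with a two-letter language code, and it treats any English-tagged Subject value as a keyword; both the layout and the function name are illustrative assumptions.

```python
def has_english_text(record):
    """Return True if the record has a non-empty English description or an
    English subject value. Values are (text, lang) pairs."""
    for element in ("Description", "Subject"):
        for text, lang in record.get(element, []):
            if lang == "en" and text.strip():
                return True
    return False
```

Records for which this returns `False` are still valid under the data model, since the English version is strongly recommended rather than mandatory.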
2.2.1.4 DC.Subject: classification system(s) and keywords
- Formally, each partner’s classification system (captions and notations of thematic, subject, general, or local
classification: FAO/AGRIS, Ei, NLM, BK etc.), each kind of keywords (thesaurus-based and/or controlled
keywords, free keywords: AGROVOC Thesaurus, AGRIFOREST, Danish Agricultural Thesaurus, Ei
Thesaurus, GEFO Thesaurus, HASSET Thesaurus, CAREDATA, IBSS Thesaurus, Thesaurus of Geoscience,
GeoRef Thesaurus etc.) and each DC encoding scheme requires a separate definition table sheet -
Name Subject
Qualified DC name DC.Subject
Namespace DCMES Qualifiers (2000-07-11) and RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) LCSH, MeSH, DDC, LCC, UDC
R Encoding Scheme(s) all other encoding schemes used by the partners
Obligation M
Repeatable yes
LQ "LANG" yes
DC Definition The topic of the content of the resource.
DC Comment Typically, a subject will be expressed as keywords, key phrases or classification
codes that describe a topic of the resource. Recommended best practice is to select a
value from a controlled vocabulary or formal classification scheme.
R Definition -
R Comment This is the place for all subject information used by partners, such as controlled
keywords, free keywords, classification system(s) and/or captions. In the
preliminary version of the data model for the operational Renardus pilot system, a
distinction will be made between the several kinds of subject.
For the final system the provision of keywords is required.
Name Subject ¦ DDC
Qualified DC name DC.Subject
Namespace DCMES Qualifiers (2000-07-11) and RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) DDC
R Encoding Scheme(s) Ren-DDC for normalization; DDC 21 can be extended by Renardus-specific
captions
Obligation M
Repeatable yes
LQ "LANG" no
DC Definition Dewey Decimal Classification, see also: http://www.oclc.org/dewey/index.htm
DC Comment -
R Definition DDC 21: adapted DDC version for cross-browsing purposes.
R Comment This field is created in the Renardus normalization process via mapping tables from
the particular Subject Gateway classification scheme. Each partner has to map its
own classification system to DDC. A mapping guideline for DDC will be prepared in
the context of WP 7.
Only captions and not notations will be displayed.
2.2.1.5 DC.Identifier
Name Identifier
Qualified DC name DC.Identifier
Namespace DCMES Qualifiers (2000-07-11) and RMES version 0.1
Refinement(s) Mirror, Archive
DC Encoding Scheme(s) URI
R Encoding Scheme(s) -
Obligation M
Repeatable yes, for translated sites
LQ "LANG" no
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or
number conforming to a formal identification system. Example formal identification
systems include the Uniform Resource Identifier (URI) (including the Uniform
Resource Locator (URL)), the Digital Object Identifier (DOI) and the International
Standard Book Number (ISBN).
R Definition -
R Comment URI means URL, URN, DOI, ISBN, ISSN etc. For the Renardus normalization process
DOI, ISBN and ISSN must be displayed in a URN syntax.
In the preliminary version of the data model for the operational Renardus pilot system
a distinction will be made between the resource URL, URL(s) of mirrored or copied
resources and URL(s) for archive reasons.
Name Identifier ¦ Mirror
Qualified DC name DC.Identifier
Namespace RMES version 0.1
Refinement(s) Mirror
DC Encoding Scheme(s) URI
R Encoding Scheme(s) -
Obligation O
Repeatable yes
LQ "LANG" no
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or
number conforming to a formal identification system. Example formal identification
systems include the Uniform Resource Identifier (URI) (including the Uniform
Resource Locator (URL)), the Digital Object Identifier (DOI) and the International
Standard Book Number (ISBN).
R Definition -
R Comment URI means URL, URN, DOI, ISBN, ISSN etc. For the Renardus normalization process
DOI, ISBN and ISSN must be displayed in a URN syntax.
Name Identifier ¦ Archive
Qualified DC name DC.Identifier
Namespace RMES version 0.1
Refinement(s) Archive
DC Encoding Scheme(s) URI (? to ask DDB)
R Encoding Scheme(s) -
Obligation O
Repeatable no
LQ "LANG" no
DC Definition An unambiguous reference to the resource within a given context.
DC Comment Recommended best practice is to identify the resource by means of a string or
number conforming to a formal identification system. Example formal identification
systems include the Uniform Resource Identifier (URI) (including the Uniform
Resource Locator (URL)), the Digital Object Identifier (DOI) and the International
Standard Book Number (ISBN).
R Definition -
R Comment -
2.2.1.6 DC.Language
Name Language
Qualified DC name DC.Language
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) ISO 639-2
R Encoding Scheme(s) -
Obligation R
Repeatable yes
LQ "LANG" -
DC Definition A language of the intellectual content of the resource.
DC Comment Recommended best practice for the values of the Language element is defined by
RFC 1766 which includes a two-letter Language Code (taken from the ISO 639
standard), followed optionally, by a two-letter Country Code (taken from the ISO
3166 standard). For example, en for English, fr for French, or en-uk for English used
in the United Kingdom
R Definition -
R Comment The language code is the ISO 639-2 three-letter code. SUB will provide a mapping
between the two-letter and three-letter language codes; this can also be found on
the LoC site – ISO 639-2: http://lcweb.loc.gov/standards/iso639-2/englangn.html
2.2.1.7 DC.Type
Name Type ¦ DCMI Type (DCT1)
Qualified DC name DC.Type
Namespace DCMES Qualifiers (2000-07-11)
Refinement(s) -
DC Encoding Scheme(s) DCMI Type Vocabulary (DCT1)
R Encoding Scheme(s) -
Obligation R
Repeatable yes
LQ "LANG" no
DC Definition The nature or genre of the content of the resource.
DC Comment Type includes terms describing general categories, functions, genres, or aggregation
levels for content. Recommended best practice is to select a value from a controlled
vocabulary (for example, the list of DCMI Types). To describe the physical or
digital manifestation of the resource, use the Format element.
R Definition -
R Comment SUB will provide a mapping of all types used in partners’ subject gateways to DCT1
(probably except for ZADI).
Name Type ¦ DCMI Type (DCT2)
Qualified DC name DC.Type
Namespace DCT2: Dublin Core Type Vocabulary: Subtypes Working Draft,
http://lcweb.loc.gov/marc/dc/subtypes-20000928.html
Refinement(s) -
DC Encoding Scheme(s) DCMI Type Vocabulary (DCT2) as soon as it is fixed!
R Encoding Scheme(s) -
Obligation O
Repeatable yes
LQ "LANG" no
DC Definition The nature or genre of the content of the resource.
DC Comment Type includes terms describing general categories, functions, genres, or aggregation
levels for content. Recommended best practice is to select a value from a controlled
vocabulary (for example, the list of DCMI Types). To describe the physical or
digital manifestation of the resource, use the Format element.
R Definition A list of subtypes used to categorize the nature or genre of the content of the
resource, a more specific list of resource types than available in the DCT1 Type
Vocabulary.
R Comment The feasibility and usability of a mapping to DCT2 will be investigated in the
context of WP 7.
Name Type
Qualified DC name DC.Type
Namespace DCMES version 1.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) -
Obligation R
Repeatable yes
LQ "LANG" no
DC Definition The nature or genre of the content of the resource.
DC Comment Type includes terms describing general categories, functions, genres, or aggregation
levels for content. Recommended best practice is to select a value from a controlled
vocabulary (for example, the list of DCMI Types). To describe the physical or
digital manifestation of the resource, use the Format element.
R Definition -
R Comment Subject Gateways should provide their original types without an encoding scheme.
2.2.2 Non Dublin Core element
2.2.2.1 Country
Name Country
Qualified DC name -
Namespace RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) ISO 3166-1 (two letter code)
http://www.din.de/gremien/nas/nabd/iso3166ma/
Obligation R
Repeatable no
LQ "LANG" no
DC Definition -
DC Comment -
R Definition Country in which the publisher of the resource is located or the country which
represents the cultural context of the resource. Code for the representation of names
of countries.
R Comment -
2.2.3 Administrative Renardus elements
Two administrative elements are used in Renardus for practical reasons: “Full Record URL” and “SBIG ID”.
2.2.3.1 Full Record URL
Name Full Record URL
Qualified DC name -
Namespace RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) URL
Obligation R
Repeatable no
LQ "LANG" no
DC Definition -
DC Comment -
R Definition A URL that leads to a detailed display of each record at the originating service site.
R Comment Because some partners generate their records dynamically it might be a problem to
provide a URL to the full record display.
2.2.3.2 SBIG ID
Name SBIG ID
Qualified DC name -
Namespace RMES version 0.1
Refinement(s) -
DC Encoding Scheme(s) none
R Encoding Scheme(s) Acronym of Subject Gateway
Obligation M
Repeatable no
LQ "LANG" no
DC Definition -
DC Comment -
R Definition A stable, unique acronym that is also defined in the Collection Level Description.
R Comment Must be the same acronym as used in the Renardus Collection Level Description
schema field “Acronym”.
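The three Renardus-specific elements defined above (Country, Full Record URL, SBIG ID) can be checked mechanically during normalization. The following is a minimal sketch and not part of the Renardus specification: the dictionary keys `country`, `full_record_url` and `sbig_id`, the function `check_record`, and the small sample of ISO 3166-1 codes are assumptions made for illustration. It also assumes that obligation "M" means the element must be present and that elements marked "R" may be absent.

```python
from urllib.parse import urlparse

# Illustrative subset of ISO 3166-1 two-letter codes; a real check would load the full list.
SAMPLE_COUNTRY_CODES = {"DE", "DK", "FI", "FR", "GB", "NL", "SE"}


def check_record(record):
    """Return a list of problems found in the Renardus-specific elements of one record."""
    problems = []

    # SBIG ID: obligation M (assumed to mean "must be present"), not repeatable.
    if not record.get("sbig_id"):
        problems.append("SBIG ID is missing")

    # Country: not repeatable, encoded as an ISO 3166-1 two-letter code.
    country = record.get("country")
    if country is not None and country.upper() not in SAMPLE_COUNTRY_CODES:
        problems.append(f"Country {country!r} is not a known two-letter code")

    # Full Record URL: not repeatable, must be an absolute URL.
    url = record.get("full_record_url")
    if url is not None:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"Full Record URL {url!r} is not an absolute URL")

    return problems
```

A record that passes returns an empty list, so the check can be run over a whole export and only the records with problems reported back to the gateway.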
2.3 Data model of the administrative database: Collection Level Description (CLD)
In the administrative database the participating Subject Gateways and brokers will make available collection
management descriptions and mapping tables for DDC. Each Renardus participant is responsible for
maintaining and offering information about their collection on a local server and providing the mapping tables
from their local classification system(s) to the agreed classification system DDC.
The collection description part of the administrative database data model is based on the RSLP
Collection Description Schema. The Renardus collection description conforms to the RSLP schema, with some
additional elements. A syntax and some content rules for the partners’ Collection Level Descriptions will be
provided in due time.
Three kinds of elements are used:
- Dublin Core (based) elements (e.g. dc:title)
- Collection Level Description elements based on RSLP schema (e.g. cld:country)
- Renardus specific Collection Level Description elements (e.g. ren-cld:language)
All elements except DC.Relation are mandatory. A guideline for DC.Description will be developed in the
context of D6.5 (due on 30 June 2001), with the goal of having a more or less standardized form of
description.
The aims of the collection description are:
- to support the selection of subject gateway(s) for searching
- to provide background information about the participating subject gateway for human and machine users
- to promote/register the individual subject gateway(s) as high quality resources on the Internet
Renardus Collection Level Description
Attribute RDF property Definition
Dublin Core (based) elements:
Title
dc:title
The name of the collection.
Identifier
dc:identifier
An unambiguous reference to the collection within
a given context (encoding scheme: URI).
Description
dc:description
An account of the content of the collection.
Comment: Renardus will provide a standardized
structure of the content of description with
information about granularity of collected
resources, type of subject indexing, etc. in context
of D6.5.
Language
dc:language
The main language(s) of the metadata in the
collection with quantitative indication.
Syntax: Free text.
Publisher
dc:publisher
An entity responsible for making the collection
available.
Comment: The organization etc. who is responsible
for the intellectual (not technical) distribution of
the collection.
Format.Extent
dc:format
dcq:extent
The size of the collection.
Comment: It is recommended to provide the
number of records as follows: about x records.
Date.Issued
dc:date
dcq:issued
Date of formal issuance (e.g. publication) of the
collection.
Subject
dc:subject
The topic of the content of the collection.
Syntax: Main DDC captions for the subjects
represented in the Subject Gateway.
Subject Notation
dc:subject
The topic of the content of the collection.
Syntax: Main DDC notations and captions for the
subjects represented in the Subject Gateway: DDC
notation1 – DDC caption1; DDC notation2 – DDC
caption2 etc.
Comment: Element content not displayed in human
readable Collection Level Descriptions.
Relation
dc:relation
dcq:hasPart
dcq:isPartOf
A reference to a related resource.
Syntax: For every related subject gateway, the
acronym followed by a space must precede any
other descriptive text.
Comment: At the moment only used by RDN and
its member Subject Gateways.
Collection Level Description elements based on RSLP schema:
Country
cld:country
The country in which the collection is physically
located.
Syntax: Free text.
Renardus specific Collection Level Description elements:
Acronym
ren-cld:acronym
The acronym of the collection.
Resource Language
ren-cld:language
Language(s) of the described resources.
Syntax: Free text.
DDC mapping URL
ren-cld:ddcMapping
URL of local DDC mapping information in
Renardus format.
Comment: Element content not displayed in human
readable Collection Level Descriptions.
Z39.50 Location
ren-cld:Z3950Location
The online location of the Z39.50 server of the
subject gateway
Syntax: machine name; port number; database
name
Comment: Element content not displayed in human
readable Collection Level Descriptions.
Logo URL
ren-cld:logoURL
The URL of the logo (image) of the subject
gateway.
Comment: Element content not displayed in human
readable Collection Level Descriptions.
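The rule that all elements except DC.Relation are mandatory, together with the syntax rules given for Z39.50 Location and Subject Notation, lends itself to a simple mechanical check. The following is a minimal sketch under stated assumptions: a Collection Level Description is represented as a plain dictionary keyed by the attribute names from the table, the function names are hypothetical, and the example values used with them are invented. It further assumes that captions contain neither ";" nor "–".

```python
# Attribute names as listed in the Renardus Collection Level Description table.
ATTRIBUTES = [
    "Title", "Identifier", "Description", "Language", "Publisher",
    "Format.Extent", "Date.Issued", "Subject", "Subject Notation", "Relation",
    "Country", "Acronym", "Resource Language", "DDC mapping URL",
    "Z39.50 Location", "Logo URL",
]
OPTIONAL = {"Relation"}  # the only element that is not mandatory


def missing_mandatory(cld):
    """Return the mandatory attributes that are absent or empty in a CLD dictionary."""
    return [name for name in ATTRIBUTES if name not in OPTIONAL and not cld.get(name)]


def parse_z3950_location(value):
    """Split 'machine name; port number; database name' into its three parts."""
    parts = [part.strip() for part in value.split(";")]
    if len(parts) != 3:
        raise ValueError(f"expected three ';'-separated parts, got {len(parts)}")
    host, port, database = parts
    return host, int(port), database


def parse_subject_notation(value):
    """Split 'notation1 – caption1; notation2 – caption2' into (notation, caption) pairs."""
    pairs = []
    for item in value.split(";"):
        notation, _, caption = item.partition("–")
        pairs.append((notation.strip(), caption.strip()))
    return pairs
```

For example, `parse_z3950_location("z3950.example.org; 210; renardus")` yields the host, the port as an integer, and the database name, which is the form a Z39.50 client typically needs.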
2.4 Data flow
The data flow does not solely depend on the chosen data model but also on other aspects. For example,
organizational and business issues as well as the gateway-to-server structures which the participants will choose
are of importance in this context. All these matters are being studied and developed in the current Renardus
work. WP3 develops organizational structures for the management of the Renardus service and for collaboration
between the participants; WP8 investigates business issues which have an impact on Renardus (e.g. Intellectual
Property Rights, copyright). Interoperability issues (WP7) will also influence the Renardus data flow.
A first approach to data flow can therefore be only a general one, based on the Renardus architectural model
(see http://www.konbib.nl/coop/reynard/restricted/architecture2.ppt).
For Renardus a distributed system architecture has been chosen (see D2.2 and D2.3). Each participant or group
of participants will be required to set up and maintain a Renardus server which will contain a Renardus content
database and an administrative database.
In order to make data from the participant gateways available and usable in Renardus a normalization process is
needed. Data from all participants have to be harmonized. The question is at what step the
normalization/harmonization process will be done. It is also of importance to the data flow whether the
particular Renardus server holds the data of one single service or of a group of participating services.
The structures underlying the different participating services are heterogeneous. In some cases there is one
gateway involved (e.g. DutchESS, DAINet). In others there are distributed broker services involved (RDN) with
differently structured records (e.g. RDN’s SOSIG or EEVL) or several gateways with uniform structures held by
one institution (e.g. SSG-FI with its four subject guides).
In the case of a single service, the service extracts the relevant data from its database, normalizes them to
conform to the agreed data model, and imports the data into the single Renardus server.
Where a group of services chooses to maintain one joint Renardus server, each service has to extract and
normalize its data in the appropriate way before exporting the data to the joint Renardus server. These
conversion processes will most likely differ, because the record structures of the different services are not
the same.
Also the methods of exporting and importing might be different for the individual services.
Normalization can occur before a service exports its relevant records or after they are imported into the
Renardus server.
Several steps are needed to get the metadata from a Subject Gateway into the Renardus broker. A suggested
model for partners to make their content available in a local single Renardus server is described in D2.2 and
D2.3:
- to extract the appropriate records from the database
- to convert/normalize the records
- to write the necessary configuration files
- to run the Zebra indexer on the records/files generated and to start the Zebra server
Except for writing the configuration files, these steps have to be repeated each time the content of
the metadata is refreshed.
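The steps above can be sketched as a small pipeline. This is a hypothetical illustration only; D2.2 and D2.3 remain the authoritative description. The local field names in `FIELD_MAP`, the `public` flag, the function names, and the choice of JSON files as output are all assumptions. The indexing step is represented by a placeholder callable, because the actual indexer invocation depends on the local installation.

```python
import json
from pathlib import Path

# Hypothetical mapping from one gateway's local field names to Renardus element names.
FIELD_MAP = {"titel": "Title", "url": "Identifier", "land": "Country"}


def extract(database):
    """Step 1: select the records to be exported (here: those flagged 'public')."""
    return [rec for rec in database if rec.get("public")]


def normalize(record, sbig_id):
    """Step 2: rename local fields to the agreed element names and add the SBIG ID."""
    out = {FIELD_MAP[key]: value for key, value in record.items() if key in FIELD_MAP}
    out["SBIG ID"] = sbig_id
    return out


def refresh(database, sbig_id, out_dir, index):
    """Repeat extraction, normalization and indexing; configuration files are written once, elsewhere."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = [normalize(rec, sbig_id) for rec in extract(database)]
    for number, rec in enumerate(records, start=1):
        path = out_dir / f"record{number}.json"
        path.write_text(json.dumps(rec), encoding="utf-8")
    index(out_dir)  # placeholder for running the indexer on the generated files
    return records
```

Keeping normalization as a separate function means it can run either before export at the gateway or after import at the Renardus server, which are the two options left open in the text.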
PART IV – REMAINDER
APPENDIX
3 APPENDIX A: QUESTIONNAIRE
Renardus questionnaire D6.4: Data model and data flow
(http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/questionnaires/all.html)
4 APPENDIX B: RESPONSES
Questionnaire: Responses from the partners
(http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/index.html)
ALUH: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/novagate.pdf
BNF: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/bnf.pdf
DDB: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/ddb.pdf
DTV and NetLab: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/dtv_netlab.pdf
JyU: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/fvl.pdf
KB: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/dutchess.pdf
SOSIG: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/sosig.pdf
SUB: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/sub.pdf
UKOLN: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/rdn.pdf
ZADI: http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/zadi.pdf
5 APPENDIX C: COMMENTS OF PARTNERS
General (0)
DutchESS: I think those elements are the bare minimum required to support Renardus functionality. The other
ones are important and should preferably be supported, but not supporting them is no reason to exclude
gateways. Gateways that don't support these elements cannot be included in searches based on advanced search
functionality, but as it is known from research that c. 90% of searches are simple searches in all fields anyway, I
don't think this matters much.
DTV/NetLab: only one of the subject fields is needed. A SBIG should support at least 6-7 of the 8 elements
BnF: we have to define the content of the creator field
FVL: All those elements are important
DDB: Mime type or document type?
Title/Title.Alternative (1.1 – 1.6)
DutchESS: DutchESS puts titles in various languages in the same title field, separated by "=". I suppose these
various versions could be exported to different Renardus title fields by using this "=" separator. In that case we
would be able to support some of the above options. Those titles could be exported to one title field and a
number of alternative title fields or to more than one title field. In that way we could support either repeatable or
non-repeatable title and alt. title fields. Regarding the Title/Title.Alternative field: either have a non-repeatable
title field and a repeatable title.alt field, or have a repeatable title field and no title.alt field.
DTV/NetLab: 1.1: As we mentioned in a previous mail (Doyle 28/06), we are unclear as to what "repeatable"
actually means in the context of the questions (in Renardus or locally in the SG) and how this ultimately affects
functionality in the service. Since we are obliged to answer, our answers will only relate to the Renardus service
and not the local ones. The main title is the original title of the resource; we don't want to see alternative (other)
titles in Renardus, i.e. no repetition of main title and no alternative title.
SUB: 1.2: It is desirable that all SGs support a title.alternative in the future Renardus system. 1.4: It
is desirable for the future system that the main title is provided in English. 1.5: In general, this should be an
issue for WP 7. If it works, this is desirable. 1.6: This works only with a language tag for title and title
alternative (also because of stop words: different meanings of „stop-words“ in different languages).
FVL: The main title should be provided in the language of the resource. The (repeatable) Title.alternative
element could contain the (manually translated, if needed) English title or acronym. (The Title.alternative is not
repeatable at this moment in the FVL.) Email 14.08.200: 1.2: This means that e.g. a translated title and an
acronym could also be provided in the same (not repeatable) field. At this moment the FVL utilises this practice.
NOVAGate: Title and title.alternative are cross-searchable if you don’t limit the search only to title-field
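The DutchESS suggestion above, exporting "="-separated titles to separate Renardus fields, could be handled in the export step roughly as follows. This is a minimal sketch of the first option mentioned (one non-repeatable title plus repeatable alternative titles). The function name and output keys are hypothetical, and it assumes that the first version is the main title and that "=" never occurs inside a title.

```python
def split_parallel_titles(raw_title):
    """Split a '='-separated title string into a main title and alternative titles."""
    versions = [part.strip() for part in raw_title.split("=") if part.strip()]
    if not versions:
        return {"Title": None, "Title.Alternative": []}
    return {"Title": versions[0], "Title.Alternative": versions[1:]}
```

The second option (a repeatable title field and no alternative title) would simply return the whole `versions` list under "Title".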
Creator: rules (2.2 – 2.9)
DTV/NetLab: expensive
SUB: for the interoperability (issue of WP 7) of the Renardus system it might be useful to implement authority
files, especially if the amount of data increases, e.g. by extension with OPACs. We also should keep an eye on
Dublin Core, they thought about implementation of vcard
BnF: Question 2.5 Syntax: This question is OK for personal names but doesn't concern the corporate names. In
our point of view, the corporate bodies are more numerous than the personal names. Question 2.7 authority file:
Does it mean to create a link to an existing authority file or to create a specific authority file for Renardus ? In
our point of view, it should be a link to an existing authority file.
Creator: additional information (2.10 – 2.16)
SUB: 2.16: that depends on the agent discussion of Dublin Core, general: We have to keep in mind that it is not
realizable to repeat the creator field if we use HTML standard, with RDF this will be possible!
BnF: Additional information must be added in separate fields
FVL: Any extra additional information (Email-address, organisational information) related to creator should be
provided in same creator field with last name and first name. This is the simplest solution (and maybe suitable
for every participating SG)
NOVAGate: all additional information has to be gathered on a voluntary basis
Description: general (3.1)
SUB: It would be helpful to have a language tag for the repeatable description field in case several descriptions
are provided in different languages
Description: description + keywords (3.2 – 3.5)
DTV/NetLab: some of description and subject must be in English
SUB: 3.2: for the future: this should be required because of the cross-search functionality
BnF: Does it concern keywords extracted out of the description for indexing purposes, or do we have the
description in one field and keywords in another field? In our point of view, we should have only one field for
Description and one field for Subject Keyword. 3.2: In order to facilitate the handling of other languages for
search purposes, we will be able to provide English keywords which are the LCSH
equivalents besides the RAMEAU Subject Headings.
Description: multilinguality (3.6)
SUB: This will be an issue of WP 7
ZADI: It would be good, but at this time it seems to be unrealistic
Subject: keywords – general (4.1 – 4.2)
DTV/NetLab: for normalisation in Renardus every keyword has to be in an element entity of its own, which
naturally does not say anything about how we are to display it.
Subject: form of keywords (4.3 – 4.7)
DTV/NetLab: keywords must be separable by Renardus. This is done in the export function/normalization process
and should take into account different languages
BnF: Questions 4.3, 4.4, 4.5 and 4.6: In these 4 questions, there is a confusion between the nature of the
subjects (free or controlled), their use (in one or more catalogs) and the level of structuring (a single,
unstructured list versus a thesaurus). In our point of view, the only significant differences must be: A. free
keywords versus controlled keywords, B. specific thesaurus versus general (encyclopedic) thesaurus.
FVL: The form of keywords in different subject fields should be indicated for the user in the search page
(advanced search form)
NOVAGate: There are two fields for English keywords: one for thesaurus based keywords (Agrovoc) and the
other for free keywords. All keywords in Nordic languages are in the same field
Subject: keywords – multilinguality (4.8)
SUB: This will be an issue of WP 7
ZADI: desirable in future, but now impossible
BnF: If it concerns free keywords, we could have an automatic translation. In the case of controlled subjects, we
cannot have automatic translation but we can make a mapping or a "linking" between the subjects in different
languages as we are doing in the MACS project (no evaluation)(http://www.bl.uk/information/finrap3.html).
Subject: classification – general (4.10 – 4.15)
ZADI: Renardus should not use an existing classification system, but should be oriented on a suited
classification, if there is any, DDC for description of document types, not possible for subject descriptions of
sources
BnF: We have to define which level of granularity within the DDC we would like
FVL: We can test existing systems (UDC, DDC) at a general level. If they aren't suitable, then we can create a
home-grown classification
Subject: classification system - cross-search with regard to a special subject classification (4.16 – 4.20)
DTV/NetLab: Basic field for topical search should combine title, description and subject
FVL: Cross-searching between main-classes is enough at this moment. If end user wants more exact search
functions, Renardus could advise her/him to use the subject specific database (FVL evaluates the whole section
with definitely not)
SUB: with regard to the verbal description of the classification system: it is necessary to provide for each verbal
description also the notation of the classification system or the general subject (as a scheme?), otherwise there
will be a mixing of all verbal descriptions in the search/metadata browse index and users can’t assign the
description to a subject
Subject: classification systems – multilinguality (4.21)
ZADI: basis must be an English classification
FVL: yes, the common classification system should be provided in several European languages. Renardus needs
user interfaces for different languages. Anyway, the English interface has the priority
Identifier: general - regarding resources in several languages (5.1 – 5.2)
DTV/NetLab: Use one record for each language version of the resource
BnF: At the BnF, we provide the URL of the site in an other language within the description field
FVL: This field is not essential element in search
DDB: Resources in different languages are separate resources with separate metadata sets. There is no reason to
have a repeatable field for this case
Identifier: general - regarding mirrored/copied resources (5.3 – 5.5)
BnF: In your point of view, what could this special Renardus scheme be? We must re-use an existing one and
not create a new one. We'd prefer to use the Qualifiers "Is version of" and "Has version"
DDB: We should consider that there should be separate fields for urn and url. In the case of copies or mirrors
the resources have only one urn but may have several urls. The url field must be repeatable.
Identifier: Qualifier (5.6 – 5.9)
DutchESS: PURLS have the form of a URL and it is not necessary to treat them as a separate category from
URLs. URI is a collective category, including URLs, PURLs and URNs.
DTV/NetLab: What do you mean by 'integrate'?
BnF: URL, ISBN, URI, PURL, URN must be in separate fields but in the same index
FVL: Let's dedicate this field only to URLs. There is no use in making the system too complicated
DDB: URIs are urns and urls. There are already questions for both
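The relationship the partners describe here (URI as the collective category, with URLs, PURLs and URNs as members, and PURLs having the form of a URL) can be illustrated with a small classifier. This is a sketch under stated assumptions, not a Renardus rule: the function name is hypothetical, and recognising a PURL by its host name is only a heuristic based on the resolver hosts that appear in this report's own references.

```python
from urllib.parse import urlparse

# Resolver hosts taken from URLs cited in this report; the list is illustrative, not complete.
PURL_HOSTS = ("purl.org", "purl.oclc.org")


def classify_identifier(value):
    """Classify an identifier string as 'URN', 'PURL', 'URL' or 'other'."""
    parts = urlparse(value.strip())
    scheme = parts.scheme.lower()
    if scheme == "urn":
        return "URN"
    if scheme in ("http", "https", "ftp"):
        # A PURL is an ordinary URL served by a PURL resolver.
        if parts.netloc.lower() in PURL_HOSTS:
            return "PURL"
        return "URL"
    return "other"
```

Because a PURL is classified from the same parsed URL as any other, a single URL field with one index would cover both, which matches the DutchESS remark that PURLs need not be treated as a separate category.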
Language: code (6.2 – 6.4)
DutchESS: May support a language code in the future.
DTV/NetLab: Use DC recommendation: 639-2
FVL: The FVL uses ISO Code 639 with three letters
DDB: 639 two letters is deducible from 639 three letters
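The DDB remark that the two-letter code can be derived from the three-letter code amounts to a table lookup. The following is a minimal sketch: the table is an illustrative subset only, the function name is hypothetical, and a real implementation would load the complete ISO 639-2 list. Note that not every three-letter code has a two-letter equivalent, so the lookup can fail.

```python
# Illustrative subset of ISO 639-2 (three-letter) to ISO 639-1 (two-letter) codes.
# Bibliographic and terminology variants are both listed where they differ.
ISO639_2_TO_1 = {
    "eng": "en", "ger": "de", "deu": "de", "fre": "fr", "fra": "fr",
    "dut": "nl", "nld": "nl", "fin": "fi", "swe": "sv", "dan": "da",
}


def to_two_letter(code):
    """Return the two-letter code for a three-letter code, or None if it is not in the table."""
    return ISO639_2_TO_1.get(code.strip().lower())
```

Gateways that store three-letter codes, as FVL reports, could therefore be normalized to either form at export time.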
Country: general (7.1 – 7.3)
DTV/NetLab: How many SBIGs support this?
SUB: The publisher country code as well as the server country code are useful
FVL: The FVL will add country code in the near future to its records
Country: code (7.4 – 7.5)
DutchESS: May support a country code in future
FVL: ISO code with three letters would be better for the FVL
Type: general (8.1 – 8.5)
DutchESS: Like country and language: we may support a type element in the future
DTV/NetLab: DC model is DCT1 which should be combined with others
SUB: see DCT2: Dublin Core Type Vocabulary: Subtypes Working Draft
(http://lcweb.loc.gov/marc/dc/subtypes-20000612.html)
ZADI: DC based is supported in parts, other lists should be checked before a definite decision is made
FVL: Qualifiers are not needed - simple type list is the best
DDB: I hope that DC type will be reconciled with the other code lists
Rights (9.1 – 9.7)
DTV/NetLab: local info
SUB: This element is also important for business models between Subject Gateways and Renardus, between
Renardus and other service providers etc.
FVL: The rights field isn't useful for the majority of internet resources. Anyway: if there is a need for special
rights information, you can add it to the description field
NOVAGate: We don’t have the separate field for rights, but we tell about access restrictions in the description /
abstract field
DDB: 9.1 to 9.7 are no alternatives
Publisher (10)
BnF: We need to define the content of the publisher field
FVL: Essential elements in search
Unique Record Number (IV B)
DTV/NetLab: see question IV D (strongly recommended)
FVL: This could be the unique records number, which is automatically generated by every SG
DDB: If data is held distributed there is no cause of ambiguity
Record Creator (IV C)
SUB: this might be important, e.g. if reviews are provided by people well known in the scientific community,
users might be interested in their names
DDB: That's a matter of the special gateway
SBIG ID (IV D)
DDB: Only reasonable if there is a central database
Record Last Checked Date (IV E)
DutchESS: Only a "last update date", not a "last checked date" so actual changes are reflected, but not every
check which has not resulted in change
DTV/NetLab: This is local information and not relevant for Renardus
SUB: this is an important part of quality check/control
DDB: That's a matter of the special gateway
6 APPENDIX D: SUMMARY
Summary of responses (matrix):
http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/summary_d6_4.pdf
7 APPENDIX E: Data Model and Data Flow
Data model and data flow, draft version 0.3 (4. September 2000)
http://www.sub.uni-goettingen.de/ssgfi/reynard/wp6/d6.4/data_model.pdf
BIBLIOGRAPHY
8 BIBLIOGRAPHY
AACR2 Translation project (http://lcweb.loc.gov/loc/german/AACR2/AACR2translation.html)
BUBL LINK - Browse by Dewey Class (http://bubl.ac.uk/link/ddc.html)
Business issues for Internet information gateways (Michael Day, UKOLN)
(http://www.ukoln.ac.uk/metadata/renardus/wp8/issues/)
Cross-browsing in Renardus: Usage of subject vocabularies at Renardus gateways, by Traugott Koch
(http://www.lub.lu.se/renardus/class.html)
Dempsey, L., 2000, The subject gateway: experiences and issues based on the emergence of the Resource
Discovery Network. Online Information Review, 24 (1), 8-23.
Koch, T., Day, M., 1997, The role of classification schemes in Internet resource description and discovery.
DESIRE deliverable D3.2 (3), (http://www.ukoln.ac.uk/metadata/desire/classification/)
MACS project (http://www.bl.uk/information/finrap3.html)
RDN Cataloguing Guidelines (http://www.rdn.ac.uk/publications/cat-guide/)
REFERENCES
9 REFERENCES
AACR2 and Seriality (Library of Congress) (http://lcweb.loc.gov/acq/conser/serialty.html)
Cliff, P., 2000, RDN Resource Types, v. 1, (http://www.rdn.ac.uk/publications/cat-guide/types/)
Codes for the Representation of Names of Languages – ISO 639-2
(http://lcweb.loc.gov/standards/iso639-2/englangn.html)
CULTURAL HERITAGE PROJECTS CONCERTATION EVENT
(http://www.cscaustria.at/events/concertation.htm)
Day, M., Cliff, P., 2000, RDN Cataloguing Guidelines, v. 1.0, (http://www.rdn.ac.uk/publications/cat-guide/)
DC Agent Qualifiers - DC Working Draft - 10 December 1999
(http://www.mailbase.ac.uk/lists/dc-agents/files/wd-agent-qual.html)
[DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description,
(http://purl.oclc.org/dc/documents/rec-dces-19990702.htm)
[DCMES Qualifiers (2000-07-11)] Dublin Core Qualifiers,
(http://purl.org/dc/documents/rec/dcmes-qualifiers-20000711.htm)
DCT2: Dublin Core Type Vocabulary: Subtypes Working Draft
(http://lcweb.loc.gov/marc/dc/subtypes-20000612.html)
Dempsey, L., 2000, The subject gateway: experiences and issues based on the emergence of the Resource
Discovery Network. Online Information Review, 24 (1), 19.
Dewey Decimal Classification (http://www.oclc.org/dewey/about/about_the_ddc.htm)
Dublin Core Type Vocabulary: Subtypes Working Draft
(http://lcweb.loc.gov/marc/dc/subtypes-20000612.html)
Dublin Core Type Working Group, 1999, List of Resource Types. Dublin Core Metadata Initiative Working
Draft, (http://purl.org/dc/documents/wd-typelist.htm)
First SCHEMAS Workshop on 11/12 May (http://www.schemas-forum.org/workshops/ws1/agenda.html)
Gray, L., 2000, Cataloguing rules for the BIOME Service: a procedural manual
(http://biome.ac.uk/guidelines/cat/)
Humbul, 2000, Describing and cataloguing resources in Humbul, v. 0.4a. Draft, 26 October.
(http://www.humbul.ac.uk/about/catalogue.html)
ISO 3166 Maintenance Agency (http://www.din.de/gremien/nas/nabd/iso3166ma/)
ISO 639-2 Registration Authority – Library of Congress (http://lcweb.loc.gov/standards/iso639-2/)
ISO 639-2:1998, Codes for representation of names of languages - Part 2: Alpha-3 code. Geneva: International
Organization for Standardization.
MacLeod, R., Kerr, L., Guyon, A., 1998, The EEVL approach to providing a subject based information gateway
for engineers. Program, 32 (3), 205-223.
Mapping ROADS/IAFA templates to Dublin Core
(http://www.ukoln.ac.uk/metadata/interoperability/iafa_dc.html)
Personennamendatei (PND) (http://www.ddb.de/professionell/pnd.htm)
RAMEAU (http://www.bnf.fr/web-bnf/infopro/rameau/)
RFC 1766 Tags for the identification of languages (http://info.internet.isi.edu/in-notes/rfc/files/rfc1766.txt)
Renardus architectural model, (http://www.konbib.nl/coop/reynard/restricted/architecture2.ppt)
RSLP Collection Description (http://www.ukoln.ac.uk/metadata/rslp/)
RSLP Collection Description: Tool (http://www.ukoln.ac.uk/metadata/rslp/tool/)
Simple Collection Description (draft version: 2. August 1999) (http://www.ukoln.ac.uk/metadata/cld/simple/)
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Lorcan Dempsey is the director of UKOLN at the University of Bath, and co‐Director of the Resource Discovery Network. UKOLN is supported by the Library and Information Commission, JISC, and the University of Bath. The RDN is funded by the JISC. Thanks are due to Derek Law for commenting on the section about the policy background to the eLib gateways, and to Traugott Koch for inviting the contribution. Thanks also to Ray Lester and to Nicky Ferguson for some specific discussion. Any views expressed are those of the author alone.
Article
EEVL, the Edinburgh Engineering Virtual Library, is a gateway to engineering information on the Internet. After a brief outline of the need for such a gateway and the background to the EEVL project, this article looks at certain similarities and differences in the development of EEVL and various other subject based information gateways (SBIGs) such as ADAM, SOSIG, and OMNI, and similar services such as BUBL. EEVL’s present situation and future prospects are outlined.
RDN Cataloguing Guidelines, v. 1.0, (http://www.rdn.ac.uk/publications/cat-guide/) DC Agent Qualifiers -DC Working Draftwd-agent-qual .html [DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description
  • M Day
  • P Cliff
Day, M., Cliff, P., 2000, RDN Cataloguing Guidelines, v. 1.0, (http://www.rdn.ac.uk/publications/cat-guide/) DC Agent Qualifiers -DC Working Draft -10 December 1999 (http://www.mailbase.ac.uk/lists/dcagents/files/wd-agent-qual.html [DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description, (http://purl.oclc.org/dc/documents/rec-dces-19990702.htm)
Cataloguing rules for the BIOME Service: a procedural manual
  • L Gray
Gray, L., 2000, Cataloguing rules for the BIOME Service: a procedural manual (http://biome.ac.uk/guidelines/cat/)
Describing and cataloguing resources in Humbul, v. 0.4a. Draft, 26 October
  • Humbul
Humbul, 2000, Describing and cataloguing resources in Humbul, v. 0.4a. Draft, 26 October. (http://www.humbul.ac.uk/about/catalogue.html)
RDN Cataloguing Guidelines
  • M Day
  • P Cliff
Day, M., Cliff, P., 2000, RDN Cataloguing Guidelines, v. 1.0, (http://www.rdn.ac.uk/publications/cat-guide/)
uk/lists/dcagents/files/wd-agent-qual .html [DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description
  • Dc Agent Qualifiers-Dc Working
DC Agent Qualifiers -DC Working Draft -10 December 1999 (http://www.mailbase.ac.uk/lists/dcagents/files/wd-agent-qual.html [DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description, (http://purl.oclc.org/dc/documents/rec-dces-19990702.htm)