ALCTS - Association for Library Collections & Technical Services

Final Report (continued)

Table of Contents

Executive Summary

Metadata and Cataloging
The TEI Header and the Cataloging Rules
Dublin Core Metadata and the Cataloging Rules
Encoded Archival Description: Summary Report

Appendix: Cataloging Problems with Web Sites


The following conclusions were drawn by the Task Force based primarily on evaluation of two metadata standards, the TEI header as utilized by the Electronic Text Center at the University of Virginia and the Dublin Core metadata element set. The conclusions specifically address the use of metadata, and records derived from metadata, in library catalogs based on content standards such as the Anglo-American Cataloguing Rules and Library of Congress Subject Headings.

  1. Metadata will be used most effectively in systems designed specifically to support its use, not in library catalogs.

    Unless the metadata is created according to semantic or content guidelines that are derived from AACR2 and rules for creating and using subject vocabularies such as MeSH and LCSH, metadata will not be effectively integrated in library catalogs. However, metadata may serve as a useful source of information for catalogers as they catalog electronic materials.

  2. Most metadata standards are not attempting to provide information sufficient to distinguish a resource from similar resources or versions of the same resource. The guidelines for data content in most metadata standards do not mandate transcription of data from prescribed sources of information.

    Most metadata is not intended to serve as, or duplicate, a full surrogate record. Dublin Core records in particular are formulated to enhance resource discovery, not resource description. The assumption is that a DC metadata set either links directly to a full description such as a library catalog record, or to the resource itself. Because of these limited objectives, the guidelines governing the data content of metadata in standards such as the Dublin Core and the TEI header do not mandate transcription of data from prescribed sources of information, as is required by AACR2. Such metadata therefore is often insufficient to distinguish a given resource from similar resources or from other versions of the same resource.

  3. Authority control over metadata content is usually optional, not required. The Dublin Core, for example, supports specification of an authoritative source for elements such as Creator and Subject, but does not require that content come from one. When metadata lacks authority control, either that control should be applied before the metadata records are added to the library catalog, or the records should be stored in a separate database in which consistency of content is not expected.

    The unsuitability of metadata for direct use in library catalogs stems mainly from a divergence in practice: library catalogs rely on semantic or content standards and on data registries such as name authority files and controlled subject vocabularies, whereas metadata generally does not. The integrity of a library catalog is based on the consistent application of a well-defined set of standards and the use of carefully maintained data registries such as name authority files and controlled vocabulary lists. Authority control is not required or expected in metadata, although metadata element sets do support the use of authorized forms of names and subjects and provide conventions to indicate authoritative sources when controlled headings are used. When a significant portion of the records in a database fails to meet the criteria for consistency and standardization, the database loses its integrity and can fulfill neither the objectives defined by Cutter and others, nor those expected by the users of the database.
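
    The routing rule in conclusion 3 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not part of the Task Force's report: the `Element` class, the `route` function, the scheme labels "LCNAF" and "LCSH" used as dictionary keys, and the tiny sets standing in for authority files (their values are illustrative and have not been checked against the real files). The sketch sends a record to the catalog only when every Creator and Subject value names an authoritative source and actually appears in it.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, tiny stand-ins for authority files. The values are
# illustrative only and have not been verified against real files.
AUTHORITIES = {
    "LCNAF": {"Twain, Mark, 1835-1910"},
    "LCSH": {"Metadata"},
}

# Elements for which this sketch expects authority control.
CONTROLLED_ELEMENTS = {"Creator", "Subject"}


@dataclass
class Element:
    name: str                     # e.g. "Creator", "Subject", "Title"
    value: str                    # the content supplied by the metadata creator
    scheme: Optional[str] = None  # declared authoritative source, if any


def is_controlled(el: Element) -> bool:
    """True when the element declares a known scheme and its value is found there."""
    return el.scheme in AUTHORITIES and el.value in AUTHORITIES[el.scheme]


def route(record: List[Element]) -> str:
    """Decide where a metadata record goes under the rule in conclusion 3."""
    needing_control = [el for el in record if el.name in CONTROLLED_ELEMENTS]
    # Note: a record with no Creator or Subject passes trivially in this sketch.
    if all(is_controlled(el) for el in needing_control):
        return "catalog"
    return "review_or_separate_database"


controlled_record = [
    Element("Creator", "Twain, Mark, 1835-1910", scheme="LCNAF"),
    Element("Subject", "Metadata", scheme="LCSH"),
]
uncontrolled_record = [Element("Creator", "Mark Twain")]

print(route(controlled_record))
print(route(uncontrolled_record))
```

    A real workflow would search a maintained authority file rather than a set held in memory, and a failed lookup would more likely prompt a cataloger to establish the heading than to reject the record outright.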

  4. Metadata makers have focused on mapping to rules for transfer syntaxes like MARC and not to rules for semantics or content like AACR.

    In general, neither Dublin Core nor TEI headers are likely to meet the criteria for consistency and standardization. Many metadata standards creators have focused on mapping to rules for transfer syntax and have ignored or not discovered rules for semantics or content. A data element may map to USMARC, but not meet the underlying AACR2 requirement. So while mapping various metadata element sets to USMARC has clarified the relationships of metadata schemes to the commonly used transfer syntax, work to ensure the vitally important mapping to the Anglo-American Cataloguing Rules has not yet occurred.
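
    The gap between mapping to a transfer syntax and meeting a content rule can be shown with a small sketch. The tag assignments below (Title to 245, Creator to 720, Subject to 653) are assumptions chosen for illustration, loosely in the spirit of published Dublin Core to MARC crosswalks, and should be verified before any real use. The function names are hypothetical, and the comma test is a deliberately crude stand-in for a content rule, not an implementation of AACR2.

```python
# Illustrative element-to-tag mapping; the tag choices are assumptions
# for this sketch and should be checked against an actual crosswalk.
DC_TO_MARC_TAG = {"Title": "245", "Creator": "720", "Subject": "653"}


def looks_like_inverted_name(value: str) -> bool:
    """Crude stand-in for a content rule: 'Surname, Forename' form.

    This is only a heuristic for illustration; real AACR2 heading
    construction involves far more than the presence of a comma.
    """
    surname, comma, rest = value.partition(",")
    return bool(comma) and bool(surname.strip()) and bool(rest.strip())


def crosswalk(element: str, value: str) -> dict:
    """Map an element to a tag (syntax), then separately check content (semantics)."""
    tag = DC_TO_MARC_TAG[element]  # the syntax mapping always succeeds here
    needs_review = element == "Creator" and not looks_like_inverted_name(value)
    return {"tag": tag, "content": value, "needs_review": needs_review}


# Both values map to the same tag, but only one passes the content check.
print(crosswalk("Creator", "Mark Twain"))
print(crosswalk("Creator", "Twain, Mark"))
```

    The point of separating the two steps is that the tag lookup can succeed for any string at all, so a successful mapping says nothing about whether the content conforms to the cataloging rules.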

  5. Metadata from a non-AACR-conformant source will always require cataloger scrutiny when being integrated into a library catalog.

    The simplicity of the unqualified Dublin Core element set definitions, their lack of specificity, the fact that all fifteen elements are optional, and the fact that the set is designed to be applied by resource creators with no bibliographic background make DC less reliable than traditional sources of cataloging copy and thus more likely to require careful cataloger review before it can be successfully integrated into a library catalog.

    However, metadata from a source known to be applying AACR2 conventions can be used as cataloging data with minimal or no cataloger scrutiny and intervention. TEI header metadata will require less manual review than most metadata (based on the UVA Electronic Text Center project) because there is a more developed set of rules for use, the choice of elements more closely approximates the range of data recommended in AACR2 for level 2 and level 3 cataloging, and mandatory elements are included.

  6. Metadata element sets like the Dublin Core must be supported by institutions serving as responsible custodians or authorities for data registration and maintenance.

    It is not clear how data registration authority and maintenance will work for the Dublin Core qualified element set and similar metadata element sets. That is, who is responsible for keeping the definitive list of characteristics of data elements that is necessary to clearly describe, inventory, analyze and classify data in support of data sharing across systems and organizations? The responsible authority and custodian for AACR2 is the Joint Steering Committee. MARBI and the Network Development and MARC Standards Office, Library of Congress, are the responsible agents for USMARC. The TEI Header element set has such an authority in the form of the TEI editors and a TEI editorial process. No such authority exists for the Dublin Core. Who should be the responsible agency?
