ALCTS - Association for Library Collections & Technical Services

Final Report (continued)


Table of Contents

Executive Summary

Introduction
Metadata and Cataloging
The TEI Header and the Cataloging Rules
Dublin Core Metadata and the Cataloging Rules
Encoded Archival Description: Summary Report
Conclusions
Recommendations
Bibliography

Appendix: Cataloging Problems with Web Sites


Metadata and Cataloging:
Supporting Common User Tasks

By John Attig, Pennsylvania State University Libraries


In assessing the relationship between metadata and cataloging data, we propose to examine the suitability of each to fulfilling various user tasks. These tasks have been formulated as part of an entity-relationship model by an IFLA Working Group and published as part of the document Functional Requirements for Bibliographic Records. [Note: The final text of this document was not yet available, and the following analysis is based on the draft version circulated for world-wide review in May 1996.]

The IFLA model proposes four basic user tasks:

  • to find entities that correspond to the user’s stated search criteria (i.e., to locate an entity in a file or database as the result of a search using an attribute or relationship of the entity);

  • to identify an entity (i.e., to confirm that the entity described corresponds to the entity sought, or to distinguish between two or more entities with similar characteristics);

  • to select an entity that is appropriate to the user’s needs (i.e., to choose an entity that meets the user’s requirements with respect to content, physical format, etc., or to reject an entity as being inappropriate to the user’s needs);

  • to acquire or obtain access to the entity described (i.e., to acquire an entity through purchase, loan, etc., or to access an entity electronically through an on-line connection to a remote computer).

Cataloging data seeks to support all four user tasks (although the obtain task is mostly supported by local information that is not part of international cataloging standards). In addition, cataloging data is optimized to support each task and seeks to maximize the user’s chances of success in their efforts. Standard cataloging principles, rules, and practices have developed over the past century, and contemporary cataloging databases embody high standards of information quality. Considerable effort has been expended by catalogers throughout the world, working cooperatively to promote this quality. In particular,

  • FIND: Cataloging principles promote the user’s ability to find a desired object in several ways.

    1. Rules and practices seek to identify those attributes of a bibliographic entity most likely to satisfy a user’s query. These practices are based on commonly recognized principles of authorship, organizational/corporate responsibility, and the roles of various persons and bodies in the creation of various kinds of bibliographic entities; on recognized conventions for naming persons, publications, and other relevant entities; and on long experience in assessing the significance of relationships among entities.

    2. Rules and practices seek to optimize retrieval by ensuring (a) that every entity has a distinct name and (b) that only one name is conventionally used for each entity. Further practices seek to ensure that all variant forms of a name are retrievable. This set of practices is commonly known as authority control and is the single most significant contribution of catalogers to the retrieval process.

    3. Rules and practices base the conventional form of names, titles, etc., on literary warrant, that is, on the common usage within the universe of bibliographic entities.

  • IDENTIFY: Cataloging principles promote the user’s ability to identify a retrieved entity, i.e., to determine whether that entity indeed satisfies the user’s query and to distinguish among entities with similar attributes. A number of principles and practices promote this task:

    1. The basic principle of cataloging description, going back to the work of Panizzi and Cutter (if not earlier), is that bibliographic entities are most usefully identified by the information that is contained in the entities themselves and that the most formal statements tend to be the most useful. Therefore, a bibliographic description is constructed by transcribing information from prominent sources within the item, e.g., from the title page of a book. This principle applies particularly to such information as titles, the role of various creators/contributors, and publication information. However, it also applies to such information as technical requirements for using the material. Although not all significant information about an entity can always be obtained by transcribing data, transcription is the best place to start. The result of such transcription is a surrogate for the entity that provides the user with a wealth of identifying information prior to selecting and obtaining a copy of the entity.

    2. Considerable attention is paid in cataloging rules and practices to distinguishing among entities with similar attributes, such as various manifestations of the same work (different versions of the text or different physical manifestations). There are specific technical categories, applicable in particular to printed works, that help to identify the extent of difference among manifestations (edition, issue, printing, etc.). When a user needs to obtain a particular manifestation of a work, such information is of vital importance. The world is still working out how to apply similar concepts to electronic entities, but the ease of duplication and modification of electronic entities makes this a very important issue.

    3. Beyond these basic principles and practices, cataloging conventions embody a century of experience in determining what additional data elements are useful in identifying an item and in assessing its relevance to a user’s query. Examples of such data elements are technical details of various kinds, content analysis (not only a list of contents, but indications of the presence or absence of features such as illustrations, indexes, bibliographies, etc.), and relationships to other entities (other works, and persons and organizations contributing in various ways).

  • SELECT: Users select entities for retrieval based on a variety of factors. Many of them have to do with the identification of the specific entity or with the presence of specific features; these have been addressed in the previous section. In addition, users select on the basis of factors such as currency of information, level of treatment (textbook survey vs. scholarly monograph), extent of treatment, authoritativeness (author’s or publisher’s credentials), and subject relevance. Almost any data element included in a bibliographic record may be important in influencing a user’s selection of material.

  • OBTAIN: In order to obtain a copy of a selected entity, the user must rely on the services provided by the entity’s owner/custodian/provider. For most material physically held in a library or repository, the information needed is the library’s identity and the identifier(s) assigned to the item (such as a call number). This is the least standardized aspect of library cataloging practices, but the wide availability of databases of cataloging information is promoting similar systems for obtaining documents on a regional, national, even global basis. With electronic information, obtaining access is perhaps even simpler. All that is needed is an accurate identifier.

One way to evaluate the usefulness of a particular metadata element set is to compare it against this standard. Which user tasks is the metadata intended to support? How does it support each task? In looking at the TEI header and Dublin Core, we have tried to compare the support of each for these common user tasks against the support expected from a bibliographic record created according to cataloging standards and practices.
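
As an illustration of the authority control practice described under FIND, the following is a minimal sketch in Python, not a description of any actual cataloging system. The class name AuthorityFile, the normalize function, and the normalization rules (case folding, removal of accents and punctuation) are assumptions made for this example; the headings are used only as sample data. The sketch shows the two requirements stated above: each name form points to exactly one authorized heading, and every variant form leads the user to that heading.

```python
from __future__ import annotations

import unicodedata
from collections.abc import Iterable


def normalize(name: str) -> str:
    """Reduce a name to a comparison key (hypothetical rules for this sketch):
    decompose accented letters, drop accents and punctuation, fold case,
    and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = "".join(c for c in decomposed if c.isalnum() or c.isspace())
    return " ".join(kept.casefold().split())


class AuthorityFile:
    """Maps every known form of a name to a single authorized heading."""

    def __init__(self) -> None:
        # normalized form (authorized or variant) -> authorized heading
        self._index: dict[str, str] = {}

    def add(self, authorized: str, variants: Iterable[str] = ()) -> None:
        """Register an authorized heading together with its variant forms."""
        for form in (authorized, *variants):
            key = normalize(form)
            existing = self._index.get(key)
            # A name form may not refer to two different entities.
            if existing is not None and existing != authorized:
                raise ValueError(f"{form!r} already refers to {existing!r}")
            self._index[key] = authorized

    def resolve(self, query: str) -> str | None:
        """Return the authorized heading for any known form, or None."""
        return self._index.get(normalize(query))


authority = AuthorityFile()
authority.add(
    "Twain, Mark, 1835-1910",
    variants=["Clemens, Samuel Langhorne, 1835-1910"],
)

# A search under the variant form is directed to the authorized heading.
print(authority.resolve("Clemens, Samuel Langhorne, 1835-1910"))
```

A search under either form retrieves the same heading, which is what allows a catalog to gather all records for one entity under one name while still supporting retrieval by variant names. The conflict check in add reflects requirement (a): a name form that already identifies one entity cannot be silently reassigned to another.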

