ALCTS - Association for Library Collections & Technical Services

Final Report (continued)

Table of Contents

Executive Summary

Metadata and Cataloging
The TEI Header and the Cataloging Rules
Dublin Core Metadata and the Cataloging Rules
Encoded Archival Description: Summary Report

Appendix: Cataloging Problems with Web Sites

Encoded Archival Description: A Summary Report

By Michael Fox, Minnesota Historical Society

History and Status:
Encoded Archival Description (EAD) is a standard that defines the structure and interrelationships of data elements in archival finding aids. This information is expressed in the syntax of an SGML/XML Document Type Definition (DTD). EAD was developed by the archival community through a series of grant-funded projects, and the Society of American Archivists holds the intellectual property rights to it. The Library of Congress serves as the maintenance agency for the DTD. For the past eighteen months, early implementers have applied the beta version of EAD to a number of institutional and consortial projects in the U.S., Canada, and Europe. Information about EAD and pointers to many of these projects may be found at

Archival Collections:
Modern archival collections include the personal papers of individuals as well as the corporate records of businesses, governments, and other organizations. Individual collections may range in size from a few documents to millions of items. An archival collection is typically multi-level and hierarchically structured. Individual documents are aggregated with like materials into files. Files are organized into groupings called series on the basis of common format, function, or use. A collection consists of one or more series.
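
The hierarchy described above can be modeled as a small tree. The following is a minimal Python sketch and not part of any archival standard. The class name `Unit`, its fields, and the sample collection are all hypothetical, invented here for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    """One node in an archival hierarchy (hypothetical model)."""
    level: str                      # "collection", "series", "file", or "item"
    title: str
    children: List["Unit"] = field(default_factory=list)

    def count(self, level: str) -> int:
        """Count this node and its descendants that sit at the given level."""
        own = 1 if self.level == level else 0
        return own + sum(child.count(level) for child in self.children)


# Invented sample data: one collection holding two series.
papers = Unit("collection", "Jane Doe Papers", [
    Unit("series", "Correspondence", [
        Unit("file", "Letters, 1901-1910", [
            Unit("item", "Letter to J. Smith, 1903"),
        ]),
        Unit("file", "Letters, 1911-1920"),
    ]),
    Unit("series", "Photographs", [
        Unit("file", "Family portraits"),
    ]),
])

print(papers.count("series"))  # 2
print(papers.count("file"))    # 3
```

The same node type is used at every level, which mirrors the way a collection, a series, and a file differ in scope rather than in kind.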

Archival Description:
Given the size of most collections, item-level cataloging is rarely cost effective. Rather, archivists typically create a MARC catalog record describing the entire collection and a finding aid to provide further detail. These finding aids are commonly called registers or inventories. Prior to EAD, most inventories consisted of printed text documents. Like the collections they describe, inventories are multi-level. The collection as a whole is described, then each series, each file, and occasionally each item. The amount of detail at each level may vary with the significance, size, and complexity of the collection and with institutional resources. No standards existed for the structure of finding aids prior to EAD, though in practice there has been a high degree of similarity across institutions, reflecting a rough consensus as to best practice.

EAD Structure:
EAD is a data structure and communication standard that provides for a flexible, multi-level encoding of inventories. Like MARC, it is neither a data content nor a data values standard. The standard defines three content areas. The Archival Description area encodes the body of the inventory. Its data element structure closely parallels MARC fields, though the MARC indicator values and subfield structure are not fully replicated. The same set of data elements is available for describing materials at each level of the hierarchical structure: collection, series, file, and item. The Front Matter area provides a number of bibliographical data elements for encoding title pages where inventories are formally published documents. The Eadheader area provides metadata about the electronic version of the finding aid and incorporates many of the structural features of the TEI header, especially with respect to status and revision history.
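
To make the three areas and the repeating multi-level structure concrete, the following Python sketch parses a very small EAD-like document and walks its hierarchy. It is an illustration under stated assumptions: the tag names (`eadheader`, `frontmatter`, `archdesc`, `did`, `unittitle`, `dsc`, `c01`, `c02`) follow the conventions of EAD version 1.0 and may differ in the beta version discussed in this report, the sample is heavily trimmed and would not validate against the DTD, and all titles and identifiers are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample; tag names assume EAD version 1.0 conventions.
SAMPLE = """
<ead>
  <eadheader>
    <eadid>example-0001</eadid>
    <filedesc><titlestmt><titleproper>Inventory of the Jane Doe Papers</titleproper></titlestmt></filedesc>
  </eadheader>
  <frontmatter>
    <titlepage><titleproper>Jane Doe Papers: An Inventory</titleproper></titlepage>
  </frontmatter>
  <archdesc level="collection">
    <did><unittitle>Jane Doe Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did><unittitle>Letters, 1901-1910</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def outline(node, depth=0):
    """Return (depth, level, title) for a described unit and its sub-units."""
    rows = [(depth, node.get("level"), node.findtext("did/unittitle"))]
    # Sub-units may sit directly under the unit or inside a <dsc> wrapper.
    for child in list(node) + list(node.findall("dsc/*")):
        if child.get("level"):
            rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(SAMPLE)
areas = [child.tag for child in root]      # the three content areas
rows = outline(root.find("archdesc"))

print(areas)
for depth, level, title in rows:
    print("  " * depth + f"{level}: {title}")
```

Note that the same `did/unittitle` lookup works for the collection, the series, and the file, which reflects the point that one set of data elements is available at every level.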

EAD Documents as Cataloging Sources:
There is a strong connection between the content of the collection-level MARC records that archives produce and the descriptive data found in the inventory. The archival community does not see the creation of EAD finding aids as a replacement for catalog records but rather as a supplement to them. Indeed, Archives, Personal Papers and Manuscripts (APPM) defines the inventory as the chief source of information for the catalog entry. Many inventories incorporate the contents of the catalog record directly. While there is no formal data content standard for inventories, this connection to MARC means that, at least for the part of the inventory that describes the collection as a whole, content and access points reflect AACR2 and APPM conventions. Element tags for access terms such as names may be annotated to show their MARC equivalents as well as the authority file source and identifying number for the term. In Canada, archivists are applying the Rules for Archival Description in the same way. European archivists will be utilizing national codes and the conventions of the International Standard Archival Description (General).
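
The annotation of access terms can be sketched as follows. This Python example reads a small fragment and reports, for each term, its MARC equivalent, authority file source, and identifying number where present. The attribute names `encodinganalog`, `source`, and `authfilenumber` and the `controlaccess` wrapper are assumed from EAD version 1.0 and may not match the beta version; the terms, field tags, and authority number are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Invented fragment; element and attribute names are assumptions (see above).
FRAGMENT = """
<controlaccess>
  <persname encodinganalog="600" source="lcnaf" authfilenumber="n00000000">Doe, Jane, 1870-1950</persname>
  <subject encodinganalog="650" source="lcsh">Women authors</subject>
  <geogname>Saint Paul (Minn.)</geogname>
</controlaccess>
"""


def access_terms(xml_text):
    """List each access term with its MARC analog and authority details."""
    terms = []
    for el in ET.fromstring(xml_text):
        terms.append({
            "element": el.tag,
            "term": (el.text or "").strip(),
            "marc": el.get("encodinganalog"),        # None when not annotated
            "source": el.get("source"),
            "authority_id": el.get("authfilenumber"),
        })
    return terms


terms = access_terms(FRAGMENT)
for t in terms:
    print(t)
```

Because the annotations are optional, a consumer of the data has to tolerate terms that carry no MARC equivalent at all, as the third term in the fragment shows.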

Other EAD Metadata Issues:
EAD differs from other metadata packages like the TEI Header and the HTML META tag in that it consists entirely of metadata and is not a metadata package that is attached to some “document-like object.” However, this may change in the near future. EAD permits one to embed a digital representation of an object in the collection, such as an image of a document or photograph, directly into the inventory. Several projects are currently underway that actually incorporate the entire contents of a collection into its finding aid. It is uncertain how common this application might become, but it does expand the function of EAD from simple resource discovery to resource discovery and navigation.

The multi-level nature of EAD-encoded metadata will create some interesting challenges for information search and retrieval systems. A search of multiple finding aids may point the user to relevant materials that comprise only a part of a collection, yet a part that may only be understood in terms of its placement within a larger hierarchy of documents. What data will be returned to the user from such a query?
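
One possible answer, offered only as a sketch, is to return each matching unit together with the titles of the units above it, so that the hit can be read in context. The Python example below assumes a hypothetical nested-dictionary representation of a finding aid; the function name `search`, the dictionary keys, and the sample data are invented, and the match is a simple substring test rather than a real retrieval algorithm.

```python
# Invented finding aid held as nested dictionaries (hypothetical shape).
FINDING_AID = {
    "level": "collection", "title": "Jane Doe Papers", "children": [
        {"level": "series", "title": "Correspondence", "children": [
            {"level": "file", "title": "Letters to J. Smith, 1901-1910"},
            {"level": "file", "title": "Letters to family, 1911-1920"},
        ]},
        {"level": "series", "title": "Photographs", "children": [
            {"level": "file", "title": "Family portraits"},
        ]},
    ],
}


def search(unit, term, path=()):
    """Yield each matching unit along with the titles of its ancestors."""
    if term.lower() in unit["title"].lower():
        yield {
            "level": unit["level"],
            "title": unit["title"],
            "context": list(path),   # ancestor titles, outermost first
        }
    for child in unit.get("children", []):
        yield from search(child, term, path + (unit["title"],))


hits = list(search(FINDING_AID, "smith"))
for hit in hits:
    print(" > ".join(hit["context"] + [hit["title"]]))
```

Returning the ancestor chain is only one design choice. A system might instead return the whole inventory with the hit highlighted, or the enclosing series alone.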

The Eadheader is often confusing since it is really metadata about other metadata, namely the inventory. To which object do we want to point the researcher: to the collection itself or to the inventory? The answer is unclear at this point, but it may be to either or both in different circumstances. This ambiguity has manifested itself in various metadata crosswalks that map to other standards a mixture of EAD elements, some relating to the electronic inventory and some describing the underlying collection.
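
The mixing problem can be made visible by recording, for every crosswalk row, which object the source element describes. The Python sketch below does this for an invented mapping. The rows are hypothetical and are not taken from any published crosswalk; the EAD paths assume version 1.0 tag names, and the target names are generic labels chosen for illustration.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (EAD path, object described, target element).
CROSSWALK = [
    ("eadheader/filedesc/titlestmt/titleproper", "inventory", "title"),
    ("eadheader/filedesc/titlestmt/author", "inventory", "creator"),
    ("archdesc/did/unittitle", "collection", "title"),
    ("archdesc/did/origination", "collection", "creator"),
    ("archdesc/did/unitdate", "collection", "date"),
]


def mixed_targets(rows):
    """Return target elements fed by sources that describe different objects."""
    described = defaultdict(set)
    for _path, obj, target in rows:
        described[target].add(obj)
    return sorted(t for t, objs in described.items() if len(objs) > 1)


print(mixed_targets(CROSSWALK))  # ['creator', 'title']
```

In this invented mapping, a single target record would carry one title for the inventory and another for the collection, which is the ambiguity described above.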
