1.1 What is metadata?
The term metadata is used differently in different communities.
- Metadata as a concept has been used in different contexts to refer to information that is about specific things:
- e.g., catalogs of published materials or museum objects,
- finding aids for archival materials,
- indexes of journal articles
- These are examples of metadata commonly seen in libraries, archives, and museums (LAMs).
- Broadly speaking, metadata encapsulate the information that describes any information-bearing entity.
- Similar to “data,” “metadata” can be either singular or plural.
NISO's "Understanding Metadata" (2004) defines metadata as:
- "structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information".
Source: NISO (2004). Understanding Metadata. Bethesda, MD: NISO Press. http://www.niso.org/standards/resources/
(last accessed 2015-08-01)
The American Library Association (ALA) Committee on Cataloging: Description and Access (CC:DA) presented formal working definitions for three terms (metadata, metadata schema, and interoperability) after a study of 46 potential definitions:
- Metadata are structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities.
- A metadata schema provides a formal structure designed to identify the knowledge structure of a given discipline and to link that structure to the information of the discipline through the creation of an information system that will assist the identification, discovery, and use of information within that discipline.
- Interoperability is the ability of two or more systems or components to exchange information and use the exchanged information without special effort on either system.
Source: CC:DA Task Force on Metadata:
Final Report (CC:DA/TF/Metadata/5), June 16, 2000
(last accessed 2015-08-01)
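As a rough illustration of the first two CC:DA definitions, the sketch below models a metadata record as structured data and a metadata schema as the set of elements a record may use. The element names, the `validate` helper, and the sample record are all hypothetical, chosen only for illustration; they are not taken from the CC:DA report.

```python
# Hypothetical schema: element name -> whether the element is required.
SCHEMA = {"title": True, "creator": True, "date": False, "subject": False}


def validate(record, schema):
    """Return a list of problems found when checking a record against a schema."""
    problems = []
    # Every required element must be present in the record.
    for element, required in schema.items():
        if required and element not in record:
            problems.append(f"missing required element: {element}")
    # Every element in the record must be defined by the schema.
    for element in record:
        if element not in schema:
            problems.append(f"unknown element: {element}")
    return problems


# Hypothetical record describing an information-bearing entity.
record = {"title": "A Sample Report", "creator": "Doe, Jane", "format": "PDF"}
problems = validate(record, SCHEMA)
print(problems)
```

Here the record uses an element ("format") that the schema does not define, so the check reports it. Real schemas usually also specify value types, repeatability, and controlled vocabularies, which this sketch leaves out.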
The Dublin Core Metadata Initiative (DCMI) Glossary defines metadata as:
- In general, "data about data;" functionally, "structured data about data."
- Metadata includes data associated with either an information system or an information object for purposes of description, administration, legal requirements, technical functionality, use and usage, and preservation. In the case of Dublin Core, information that expresses the intellectual content, intellectual property and/or instantiation characteristics of an information resource.
Source: DCMI Glossary
(last accessed 2015-08-01)
1.2 The development of metadata
In her introduction to the special topic issue "Integrating Multiple Overlapping Metadata Standards" of the Journal of the American Society for Information Science, Zorana Ercegovac (1999) provided an overview of metadata development, leading readers through the pre-Internet era and the Internet era:
Pre-Internet Era of Metadata
MAchine Readable Cataloging (MARC). http://www.loc.gov/marc/
- Developed at the Library of Congress in the 1960s.
- In terms of specificity, structure, and maturity, it is a highly structured and semantically rich metadata format.
- It was designed (1) to represent rich bibliographic descriptions and relationships between and among data of heterogeneous library objects; and
- (2) to facilitate sharing of these bibliographic data across local library boundaries.
- The emphasis is on the entire document;
- the surrogates are MARC records;
- the records are produced by human catalogers;
- MARC does not fare well with regard to
- management needs (e.g., intellectual property, preservation), or
- evaluative needs (e.g., authenticity, user profiles, and grade levels).
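To make "highly structured" concrete, the sketch below shows a much-simplified MARC-style record: each field has a three-digit tag, two indicators, and coded subfields. The tags used (100 for the main personal name, 245 for the title statement, 260 for publication information) are standard MARC 21 bibliographic tags, but the `Field` class, the `render` helper, and the bibliographic data are hypothetical, and real records also carry a leader and control fields that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Field:
    tag: str          # three-digit field tag, e.g. "245"
    indicators: str   # two indicator positions; "#" stands for a blank
    subfields: list   # list of (code, value) pairs


def render(field):
    """Format one field in a common human-readable display style."""
    parts = " ".join(f"${code} {value}" for code, value in field.subfields)
    return f"{field.tag} {field.indicators} {parts}"


# Hypothetical record for an invented book.
record = [
    Field("100", "1#", [("a", "Doe, Jane.")]),
    Field("245", "10", [("a", "A sample title /"), ("c", "Jane Doe.")]),
    Field("260", "##", [("a", "Springfield :"), ("b", "Example Press,"), ("c", "1999.")]),
]

lines = [render(f) for f in record]
print("\n".join(lines))
```

The fields shown are all descriptive. Nothing in this structure addresses rights, preservation, or audience level, which is the limitation noted above.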
The Internet Arena and Evolving Metadata Traditions
Since the early 1990s,
- distributed repositories on the Internet have grown exponentially,
- repositories are contributed by different communities,
- there is a need to describe, authenticate, and manage these resources,
- therefore, new guidelines and architectures have been developed among different communities.
Priscilla Caplan described the metadata movement as "a blooming garden, traversed by crosswalks, atop a steep and rocky road" (Caplan, 2000).
- This metadata "blooming garden" can be viewed from different perspectives:
(1) There is no limit on the type or amount of resources that can be described by metadata.
For any area that shows a demand for electronic resource discovery and sharing, a metadata standard can be developed or proposed.
Today, the resources described by metadata consist of:
- bibliographical objects (e.g., as represented by MARC records, Dublin Core descriptions),
- archival inventories and registers (e.g., EAD encoded finding aids),
- geospatial objects (e.g., FGDC metadata),
- museum and visual resources (e.g., CDWA, VRA Core, IPTC metadata),
- educational materials (e.g., LOM metadata),
- biological diversity (e.g., Darwin Core),
- people (as encoded with the Friend of a Friend (FOAF) schema),
- any webpage's contents (embedded with Schema.org properties),
- and many others.
The use of these metadata standards is not limited by language or country boundaries.
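As one example from the list above, a webpage can carry a Schema.org description embedded as JSON-LD. The sketch below builds such a description and wraps it in a script tag. `Book`, `Person`, `name`, `author`, and `datePublished` are existing Schema.org terms; the `to_script_tag` helper and the values describing the book are hypothetical.

```python
import json

# Hypothetical description of an invented book, using Schema.org terms.
description = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "A Sample Title",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "datePublished": "1999",
}


def to_script_tag(data):
    """Serialize the description as JSON-LD inside an HTML script element."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_script_tag(description)
print(snippet)
```

JSON-LD is only one of the syntaxes that can carry Schema.org properties; it is used here because it keeps the metadata separate from the page's visible markup.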
(2) There is no limit on the number of overlapping metadata standards for any type of resource or any subject domain.
Variant systems are often found even within a single subject community.
In describing museum and visual resources, for instance, there are at least nine well-structured and well-documented metadata schemas, ranging from very comprehensive and detailed ones to the more general and open cores.
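When several schemas describe the same kind of resource, the "crosswalks" in Caplan's image are the mappings that carry a record from one schema to another. The sketch below is a minimal illustration under stated assumptions: the source element names follow Dublin Core (title, creator, date, rights), while the target element names, the mapping itself, and the `apply_crosswalk` helper are hypothetical and do not reproduce any published crosswalk.

```python
# Hypothetical crosswalk: source element -> target element.
CROSSWALK = {"title": "object_name", "creator": "maker", "date": "date_made"}


def apply_crosswalk(record, crosswalk):
    """Convert a record; return (converted, unmapped) so losses are visible."""
    converted, unmapped = {}, {}
    for element, value in record.items():
        target = crosswalk.get(element)
        if target is None:
            # No equivalent element in the target schema.
            unmapped[element] = value
        else:
            converted[target] = value
    return converted, unmapped


source_record = {
    "title": "Untitled Vase",
    "creator": "Unknown",
    "date": "ca. 1850",
    "rights": "Public domain",
}
converted, unmapped = apply_crosswalk(source_record, CROSSWALK)
print(converted)
print(unmapped)
```

Returning the unmapped elements separately makes the cost of conversion explicit: overlapping schemas rarely align element for element, so some information is usually lost or needs manual handling.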
(3) There is no limit on the types of professions or subject domains that can be involved in metadata standard development and application.
Metadata and Organizing Educational Resources on the Internet (Greenberg, 2000) documents the experiences of those who are actively engaged in projects that organize Internet resources for educational purposes, including metadata creators (both catalogers and indexers), library administrators, and educators.
The National Science Digital Library (NSDL) established a Metadata Repository based on the metadata records harvested from nearly 100 digital collections funded by the National Science Foundation. The collections and the metadata for the collections and items were built by educators of K-12, undergraduate, and graduate schools, together with publishers, scientists, engineers, medical doctors, professional associations, and so on.
The Semantic Web and Linked Data movement
In recent years, the concepts and technologies of the Semantic Web (i.e., the Web of Data), Linked Open Data, and Big Data have changed how data are published, shared, and used. Metadata, structured data about other data, are receiving wider attention than ever before.
LAMs were among the first to publish their data (including bibliographic data, object descriptions, name authorities, and controlled vocabularies) as Linked Data.
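In Linked Data, a description is expressed as subject-predicate-object statements (triples) in which things and properties are identified by URIs. The sketch below writes two such statements in the N-Triples syntax using only the Python standard library. The property URIs come from the DCMI Metadata Terms namespace; the example.org URIs, the helper functions, and the book itself are hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"  # DCMI Metadata Terms namespace


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical identifiers for an invented book and its author.
book = uri("http://example.org/book/1")
triples = [
    (book, uri(DCTERMS + "title"), literal("A Sample Title")),
    (book, uri(DCTERMS + "creator"), uri("http://example.org/person/jane-doe")),
]

output = to_ntriples(triples)
print(output)
```

Because the creator is a URI rather than a text string, other datasets can refer to the same person, which is what allows name authorities and controlled vocabularies to be linked across institutions.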
Caplan, Priscilla (2000). International metadata initiatives:
lessons in bibliographic control.
Paper presented at: Conference on
Bibliographic Control in the New Millennium,
Library of Congress, November, 2000.
Ercegovac, Zorana (1999).
Introduction. In: Integrating Multiple Overlapping Metadata Standards, a Special Topic Issue.
Journal of the American Society for Information Science, 50(13), 1165-1168.
Greenberg, Jane, ed. (2000).
Metadata and Organizing Educational Resources on the Internet.
(A monograph published simultaneously as the
Journal of Internet Cataloging, Vol. 3, Nos. 1 and 2/3.)
The Haworth Press, Inc.
The National Science Digital Library (2004).
2004 Annual Report. Part 1.
W3C Incubator Group (2011).
Library Linked Data Incubator Group Final Report, W3C Incubator Group Report 25 October 2011.