1. Why Vocabulary Control

Vocabulary control is used to improve the effectiveness of information storage and retrieval systems, Web navigation systems, and other environments that seek to both identify and locate desired content via some sort of description using language. The primary purpose of vocabulary control is to achieve consistency in the description of content objects and to facilitate retrieval.

1.1 Need for Vocabulary Control (1.1)

The need for vocabulary control arises from two basic features of natural language, namely:
• Two or more words or terms can be used to represent a single concept

  VHF/Very High Frequency

• Two or more words that have the same spelling can represent different concepts

  Mercury (planet)
  Mercury (metal)
  Mercury (automobile)
  Mercury (mythical being)

1.2 How Vocabulary Control is Achieved (1.2)

Vocabulary control is achieved by three principal methods:

• Defining the scope, or meaning, of terms;
• Using the equivalence relationship to link synonymous and nearly synonymous terms; and
• Distinguishing among homographs.
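
As a rough illustration of these three methods, the following Python sketch stores a scope note, equivalent entry terms, and qualified homographs in one small structure. The `Term` class, the `lookup` function, and the wording of the scope note are assumptions made for this example; they are not part of the Standard.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    preferred: str                 # preferred term, including any qualifier
    scope_note: str = ""           # method 1: states the intended meaning
    used_for: list = field(default_factory=list)  # method 2: equivalent entry terms


VOCABULARY = [
    Term("Very High Frequency",
         scope_note="Radio frequencies from 30 to 300 MHz (illustrative note).",
         used_for=["VHF"]),
    # Method 3: parenthetical qualifiers keep homographs apart.
    Term("Mercury (planet)"),
    Term("Mercury (metal)"),
    Term("Mercury (automobile)"),
    Term("Mercury (mythical being)"),
]


def base_form(label):
    """Strip a trailing parenthetical qualifier and normalize case."""
    return label.split(" (")[0].strip().lower()


def lookup(entry):
    """Return the preferred terms that a user's entry term could refer to."""
    key = entry.strip().lower()
    matches = []
    for term in VOCABULARY:
        names = [base_form(term.preferred)] + [u.lower() for u in term.used_for]
        if key in names:
            matches.append(term.preferred)
    return matches


print(lookup("VHF"))      # the synonym resolves to a single preferred term
print(lookup("mercury"))  # the homograph yields four qualified candidates
```

A single match means the entry term can be translated directly; several matches mean the user or indexer has to pick the intended sense.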

The Standard provides guidelines for constructing controlled vocabularies, covering:

• selecting the terms;
• formulating the terms;
• establishing relationships among terms; and
• presenting the information effectively in printed form, online, and on web navigation sites.

1.3 Purpose of Controlled Vocabularies (5.1)

The purpose of controlled vocabularies is to provide a means for organizing information. Through the process of assigning terms selected from controlled vocabularies to describe documents and other types of content objects, the materials are organized according to the various elements that have been chosen to describe them.

There are many different kinds of controlled vocabularies. Some common ones are:

• Simple lists of terms or “pick lists”
• Synonym rings
• Taxonomies
• Thesauri
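
One way to picture how these four kinds differ is by the data structure each one needs. The sketch below is a simplification under stated assumptions: all terms and variable names are made up for illustration, and UF, BT, NT, and RT stand for the usual "used for", "broader term", "narrower term", and "related term" links.

```python
# All terms below are illustrative, not taken from a published vocabulary.

# Pick list: a flat list of allowed values.
pick_list = ["Mercury (metal)", "Mercury (planet)"]

# Synonym ring: terms treated as equivalent for retrieval; none is preferred.
synonym_ring = {"VHF", "Very High Frequency"}

# Taxonomy: each term points to its parent (None marks a top term).
taxonomy = {
    "Celestial bodies": None,
    "Planets": "Celestial bodies",
    "Mercury (planet)": "Planets",
}

# Thesaurus: hierarchy plus equivalence (UF) and associative (RT) links.
thesaurus = {
    "Very High Frequency": {"UF": ["VHF"], "BT": [], "NT": [], "RT": []},
    "Mercury (planet)": {"UF": [], "BT": ["Planets"], "NT": [], "RT": []},
}


def ancestors(term):
    """Walk up the taxonomy from a term to its top term."""
    chain = []
    parent = taxonomy.get(term)
    while parent is not None:
        chain.append(parent)
        parent = taxonomy.get(parent)
    return chain


print(ancestors("Mercury (planet)"))
```

Each step down the list adds structure: the pick list has none, the synonym ring adds equivalence, the taxonomy adds hierarchy, and the thesaurus combines both with related-term links.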

This tutorial focuses on controlled vocabularies that are used for the representation of content objects.

Controlled vocabularies serve five purposes:

1. Translation: Provide a means for converting the natural language of authors, indexers, and users into a vocabulary that can be used for indexing and retrieval.

2. Consistency: Promote uniformity in term format and in the assignment of terms.

3. Indication of relationships: Indicate semantic relationships among terms.

4. Label and browse: Provide consistent and clear hierarchies in a navigation system to help users locate desired content objects.

5. Retrieval: Serve as a searching aid in locating content objects.

1.4 Controlled Vocabulary Impact on Information Retrieval (5.3.6)

Information retrieval effectiveness is traditionally measured by two parameters: recall and precision. Controlled vocabulary design can have a positive impact on both of these measures.
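
For readers who want the two measures spelled out, the sketch below uses their conventional definitions: recall is the share of relevant items that were retrieved, and precision is the share of retrieved items that are relevant. The function name and the document identifiers are hypothetical.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search, given two collections of IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were actually retrieved
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Four documents are relevant; the search returns three, two of them relevant.
recall, precision = recall_precision(
    retrieved=["d1", "d2", "d5"],
    relevant=["d1", "d2", "d3", "d4"],
)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this made-up case recall is 2 of 4 and precision is 2 of 3, so the search misses half of the relevant material while a third of what it returns is noise.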

Recall can be improved through such controlled vocabulary methods as:

• Preferred terms and equivalence relationships for synonym control (see Z39.19 section 5.3.2)
• Preferred term form (see Z39.19 section 6.3)
• Associative (related term) relationships (see Z39.19 section 8.4)
• Classified and hierarchical relationships (see Z39.19 section 8.3)
• Postcoordination (see Z39.19 section
• Concept mapping / clustering (see Z39.19 section 9.3.5)
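
The first of these methods, synonym control, can be sketched with a toy free-text search. The documents, the `EQUIVALENTS` table, and the `search` function are all assumptions invented for this example, and plain substring matching stands in for a real retrieval system.

```python
# Hypothetical mini-collection; both d1 and d2 are about the same concept.
DOCS = {
    "d1": "VHF antenna design",
    "d2": "Very High Frequency propagation over water",
    "d3": "Mercury vapor lamps",
}

# Equivalence relationships: every entry term maps to all of its variants.
EQUIVALENTS = {
    "vhf": ["vhf", "very high frequency"],
    "very high frequency": ["vhf", "very high frequency"],
}


def search(query, expand=False):
    """Return IDs of documents whose text contains the query or, if asked, its equivalents."""
    key = query.lower()
    variants = EQUIVALENTS.get(key, [key]) if expand else [key]
    return sorted(
        doc_id for doc_id, text in DOCS.items()
        if any(variant in text.lower() for variant in variants)
    )


print(search("VHF"))               # finds only the document that spells it "VHF"
print(search("VHF", expand=True))  # also finds the spelled-out form
```

Without expansion the search retrieves one of the two relevant documents; with expansion it retrieves both, which is the recall gain that equivalence relationships are meant to provide.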

Precision can be improved through such controlled vocabulary methods as:

• Parenthetical qualifiers to control ambiguity (see Z39.19 section 6.2.1)
• Broader and narrower term hierarchical relationships (see Z39.19 section 8.3)
• Compound terms (see Z39.19 section 7)
• Precoordination (see Z39.19 section
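
The effect of parenthetical qualifiers on precision can be sketched in the same spirit. The index below is hypothetical, and the two search functions are simplified stand-ins for searching on a bare word versus searching on a qualified term.

```python
# Hypothetical index: document ID -> controlled terms assigned by an indexer.
INDEX = {
    "d1": {"Mercury (planet)"},
    "d2": {"Mercury (metal)"},
    "d3": {"Mercury (automobile)"},
    "d4": {"Mercury (planet)"},
}

# The user in this scenario wants material about the planet only.
RELEVANT = {"d1", "d4"}


def search_word(word):
    """Match any index term that starts with the bare word, ignoring qualifiers."""
    return {doc for doc, terms in INDEX.items()
            if any(term.lower().startswith(word.lower()) for term in terms)}


def search_term(term):
    """Match only documents indexed with the exact qualified term."""
    return {doc for doc, terms in INDEX.items() if term in terms}


def precision(retrieved):
    """Share of retrieved documents that are relevant."""
    return len(retrieved & RELEVANT) / len(retrieved) if retrieved else 0.0


print(precision(search_word("Mercury")))           # all four senses are retrieved
print(precision(search_term("Mercury (planet)")))  # only the intended sense
```

Searching on the bare word returns every sense of the homograph, so half of the results are irrelevant here; searching on the qualified term returns only the intended sense.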

See also Section 7. Use of Controlled Vocabularies in Information Storage and Retrieval Systems -- Examples taken from the real world

