Indexes and Index Rules

Each record is indexed as it is added to the system: the system uses index rules to copy portions of the record's data into database indexes. The record can then be retrieved by comparing search terms to the contents of a specified index. For example, a bibliographic record containing the field:

700 |aJames, Henry |tPortrait of a lady

could be indexed so that it would be retrieved by searching the author index for "james, henry", searching the title index for "portrait of a lady", or searching the keyword index for "portrait", "lady", "portrait lady" or "lady portrait".
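As a rough illustration of the example above, the following Python sketch splits the field into subfields and builds author, title, and keyword entries. The function names, the mapping of subfield a to the author index and subfield t to the title index, the lowercasing, and the rule that every search word must be present for a keyword match are assumptions made for illustration only; they do not describe the system's actual implementation.

```python
import re


def parse_subfields(field_data):
    """Split '|aJames, Henry |tPortrait of a lady' into {'a': ..., 't': ...}."""
    return {chunk[0]: chunk[1:].strip() for chunk in field_data.split("|") if chunk}


def words(text):
    """Lowercase the text and return its individual words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_entries(field_data):
    """Build hypothetical author, title, and keyword entries for one field."""
    sub = parse_subfields(field_data)
    author = sub.get("a", "")
    title = sub.get("t", "")
    return {
        "author": author.lower(),                      # one entry for the whole phrase
        "title": title.lower(),                        # one entry for the whole phrase
        "keyword": set(words(author) + words(title)),  # each word on its own
    }


def keyword_match(entries, query):
    """Assumption: a keyword search matches when every query word is indexed."""
    return all(word in entries["keyword"] for word in words(query))


entries = build_entries("|aJames, Henry |tPortrait of a lady")
print(entries["author"])                          # james, henry
print(entries["title"])                           # portrait of a lady
print(keyword_match(entries, "lady portrait"))    # True
```

Because the keyword entry stores words individually, word order in the search does not matter, which is why both "portrait lady" and "lady portrait" retrieve the record.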

This documentation covers the standard setup of indexes. However, a particular organization's indexing may vary from this standard. For more information on viewing the codes used in your system, see Administering Advanced Word Searching.

You can edit index labels through the Index Labels system table.

Indexes and Internationalization

Non-English special characters and character sets (such as Arabic, CJK, Cyrillic, and Hebrew) must be indexed before they can be searched on the system.

Additional indexes are available as a product. For more information, contact Sales.

Indexes

Sierra uses three types of indexes:

Record number
Phrase
Advanced Word (keyword)

Record Number Index

The record number index is system-defined and contains the system-generated unique ID number of each record. Record numbers are assigned sequentially as records are created, and the final character of each number is a check digit. For example:

.b1234567X
.o11111119
.p10305071

For more information, see Record Numbers.
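The shape of the example record numbers above can be sketched in Python as follows. The pattern is inferred only from those three examples (a leading period, one lowercase letter, a run of digits, and a final check character that may be a digit or X); the function name is hypothetical, and the sketch does not attempt to compute or verify the check digit, which this section does not describe.

```python
import re

# Assumed layout, based only on the examples in this section:
# ".", one letter, the sequential number, one check character.
RECORD_NUMBER = re.compile(r"^\.(?P<type>[a-z])(?P<seq>\d+)(?P<check>[0-9Xx])$")


def split_record_number(value):
    """Return (type letter, sequential number, check character)."""
    match = RECORD_NUMBER.match(value)
    if match is None:
        raise ValueError(f"not a recognizable record number: {value!r}")
    return match.group("type"), int(match.group("seq")), match.group("check")


print(split_record_number(".b1234567X"))   # ('b', 1234567, 'X')
print(split_record_number(".o11111119"))   # ('o', 1111111, '9')
```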

Phrase Index

A phrase index is created from a specified string of text in a variable-length field. A phrase index includes the entirety of the MARC field's specified subfields in a single index entry, in the order in which they appear.
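A minimal Python sketch of this idea follows: the selected subfields are joined into one entry in the order in which they occur in the field. The function name and the representation of a field as a list of (code, value) pairs are assumptions made for illustration.

```python
def phrase_entry(subfields, wanted_codes):
    """Join the values of the wanted subfields, keeping field order.

    subfields: list of (code, value) pairs in the order they appear in the field.
    wanted_codes: set of subfield codes that the index is meant to include.
    """
    return " ".join(value for code, value in subfields if code in wanted_codes)


field_700 = [("a", "James, Henry"), ("t", "Portrait of a lady")]

print(phrase_entry(field_700, {"a", "t"}))   # James, Henry Portrait of a lady
print(phrase_entry(field_700, {"t"}))        # Portrait of a lady
```

The point of the sketch is that a phrase index holds one entry per field, so a search compares against the whole phrase rather than against individual words.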

There is one set of phrase indexes for the entire system.
There is a maximum of 25 different phrase indexes.
Some indexes are shared between record types. For instance:
- The author index is used by bibliographic and authority records.
- The barcode index is used by item and patron records.
Some indexes are unique to a record type. For instance:
- The name index is used by patron records only.
- The instructor index is used by course records only.

For more information, see More About Phrase Indexes.

Advanced Word Search Index

The Advanced Word Search (AWS) index takes a specified string of text from a variable-length field and indexes each word separately.

The AWS index is primarily used by bibliographic records, but other record types are also included, as defined in the system software.

Although there is one AWS index, it is made up of segments which can be searched separately. The segments are:

Segment	Includes
Author	all entries in the author phrase index
Title	all entries in the title phrase index
Subject	all entries in the subject phrase index(es)
Note	entries from MARC tags not included in the author, title, and subject phrase indexes (for instance, note data from the 5XX MARC tags)
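One way to picture a single word index that can still be searched by segment is sketched below in Python. The class name, the lowercase segment labels, and the record IDs are hypothetical, and the storage layout is an assumption chosen for brevity rather than a description of how the system stores the AWS index.

```python
import re
from collections import defaultdict


class SegmentedWordIndex:
    """One word index whose entries remember which segment they came from."""

    def __init__(self):
        # (segment, word) -> set of record IDs
        self._postings = defaultdict(set)

    def add(self, record_id, segment, text):
        """Index every word of the text separately under the given segment."""
        for word in re.findall(r"\w+", text.lower()):
            self._postings[(segment, word)].add(record_id)

    def search(self, word, segment=None):
        """Search one segment, or every segment when none is given."""
        word = word.lower()
        if segment is not None:
            return set(self._postings.get((segment, word), set()))
        hits = set()
        for (_, indexed_word), record_ids in self._postings.items():
            if indexed_word == word:
                hits |= record_ids
        return hits


aws = SegmentedWordIndex()
aws.add("b1", "author", "James, Henry")
aws.add("b1", "title", "Portrait of a lady")
aws.add("b2", "title", "Henry V")

print(sorted(aws.search("henry")))             # ['b1', 'b2']
print(sorted(aws.search("henry", "author")))   # ['b1']
print(sorted(aws.search("henry", "title")))    # ['b2']
```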

Numeric fields such as call number, ISBN, and other standard numbers are not included in the AWS index.

For more information, see More About the AWS index.

Index Rules

Index rules determine what data is indexed and how it is normalized for indexing.

About Index Rules

Index rules for a record type contain one or more lines for each variable-length field to be indexed. The lines indicate the index(es) into which the variable-length field and its subfields are to be indexed.
Variable-length fields that do not match an index rule are not indexed. The combination of field group tag + MARC tag + indicators + subfields must match exactly for the field to be indexed.
To be indexed, a variable-length field must be stored in the field group tag specified in the rule and must have the MARC tag (and indicators, if listed) specified in the rule.
Field group tags, MARC tags, MARC tag indicators and subfields are used in index rules.
A single MARC tag can be assigned to more than one index.
An index rule can specify subfields to include or exclude.
Different subfields from the same MARC field can be used in different indexes.
Each index can include many MARC fields and subfields.
Non-MARC and MARC fields are indexed together.
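The matching behavior described above can be sketched in Python as follows. The IndexRule and Field structures, the field group tags "b" and "n", and the index names are hypothetical; they are not the format of the system's index rule files. The sketch models only included subfields, although a rule can also exclude subfields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexRule:
    group_tag: str              # field group tag the field must be stored in
    marc_tag: str               # MARC tag the field must have
    indicators: Optional[str]   # None means the rule lists no indicators
    subfields: frozenset        # subfield codes to include
    index: str                  # name of the target index


@dataclass
class Field:
    group_tag: str
    marc_tag: str
    indicators: str
    subfields: list             # (code, value) pairs in field order


def apply_rules(field, rules):
    """Return {index name: [entries]} for every rule the field matches."""
    entries = {}
    for rule in rules:
        if rule.group_tag != field.group_tag or rule.marc_tag != field.marc_tag:
            continue
        if rule.indicators is not None and rule.indicators != field.indicators:
            continue
        text = " ".join(v for code, v in field.subfields if code in rule.subfields)
        if text:
            entries.setdefault(rule.index, []).append(text)
    return entries


# Two rules send different subfields of the same MARC tag to different indexes.
rules = [
    IndexRule("b", "700", None, frozenset("a"), "author"),
    IndexRule("b", "700", None, frozenset("t"), "title"),
]
field = Field("b", "700", "1 ", [("a", "James, Henry"), ("t", "Portrait of a lady")])
unmatched = Field("n", "500", "  ", [("a", "Includes index")])

print(apply_rules(field, rules))      # {'author': ['James, Henry'], 'title': ['Portrait of a lady']}
print(apply_rules(unmatched, rules))  # {}
```

The second field matches no rule, so it produces no index entries at all.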

Normalizing Data

Normalization is the process by which data is standardized to improve sorting and retrieval.

The standard normalization process does the following:

Takes the first 150 characters of the string to be indexed
Strips non-filing characters from titles as designated by MARC tag indicators
Strips apostrophes and diacritics
Converts ampersands to the word for "and" in the primary language of your system
Retains the punctuation symbols + # $ % @ within the index
Replaces subfield delimiters and many other punctuation marks with a space
Collapses multiple spaces to a single space
Indexes the first 125 characters of the normalized string
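The steps above can be approximated in Python as shown below. This is a sketch under stated assumptions, not the system's normalization code: the function name and parameters are hypothetical, the number of non-filing characters is passed in by the caller rather than read from MARC indicators, subfield delimiters are assumed to appear as a vertical bar followed by a one-character code, and every punctuation mark outside the retained set is replaced with a space (the list above says only "many"). Case folding is not in the list above and is not modeled.

```python
import re
import unicodedata

RETAINED = set("+#$%@")   # punctuation kept in the index


def normalize_phrase(text, nonfiling=0, and_word="and"):
    """Approximate the standard normalization steps, in list order."""
    text = text[:150]                                   # first 150 characters
    text = text[nonfiling:]                             # drop non-filing characters
    text = text.replace("'", "").replace("\u2019", "")  # strip apostrophes
    # Strip diacritics: decompose, drop combining marks, recompose.
    decomposed = unicodedata.normalize("NFD", text)
    text = unicodedata.normalize(
        "NFC", "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    )
    text = text.replace("&", f" {and_word} ")           # ampersand -> word for "and"
    text = re.sub(r"\|[a-z0-9]", " ", text)             # assumed subfield delimiter form
    text = "".join(
        ch if ch.isalnum() or ch.isspace() or ch in RETAINED else " "
        for ch in text
    )
    text = re.sub(r"\s+", " ", text).strip()            # collapse multiple spaces
    return text[:125]                                   # index the first 125 characters


print(normalize_phrase("The Bront\u00ebs' world & time", nonfiling=4))  # Brontes world and time
print(normalize_phrase("C# & C++: a primer"))                          # C# and C++ a primer
```

The and_word parameter stands in for the word for "and" in the primary language of the system.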

The system also uses specialized normalization processes:

For barcode and standard number indexes, all spaces and punctuation marks are stripped
Dewey, LC, NLM, SuDoc, and UDC classification schemes have unique normalization rules
In the AWS index, each word is indexed separately
Also in the AWS index, most words that include diacritics are indexed with and without the diacritics
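Two of these specialized behaviors are simple enough to sketch in Python. The function names are hypothetical, and the sketch applies the diacritic rule to every word, whereas the system applies it to most words; the classification-scheme rules are not modeled.

```python
import unicodedata


def normalize_barcode(value):
    """Strip all spaces and punctuation marks, keeping letters and digits."""
    return "".join(ch for ch in value if ch.isalnum())


def strip_diacritics(word):
    """Remove combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def word_variants(word):
    """Forms under which a word is indexed: as written and without diacritics."""
    return {word, strip_diacritics(word)}


print(normalize_barcode("3 1234 00567-890"))   # 3123400567890
print(sorted(word_variants("caf\u00e9")))      # ['cafe', 'café']
print(sorted(word_variants("lady")))           # ['lady']
```

Indexing both forms means a search succeeds whether or not the searcher types the diacritic.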

Viewing Index Rules

You can view the system indexing rules in the Admin App by clicking AWS (for AWS index rules) or System Files | Indexrules (for phrase index rules).

See also: Administering Advanced Word Searching (to view AWS index rules); Editing Index Labels; Call Number Normalization; Indexrules in System Files (to view phrase index rules)