Indexes and Index Rules
Each record is indexed as it is added to the system: the system uses index rules to copy selected portions of the record's data into database indexes. The record can then be retrieved by comparing search terms to the contents of a specified index. For example, a bibliographic record containing the field:
700 |aJames, Henry |tPortrait of a lady
could be indexed so that it would be retrieved by searching the author index for "james, henry", searching the title index for "portrait of a lady", or searching the keyword index for "portrait", "lady", "portrait lady" or "lady portrait".
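The example above can be sketched in Python. This is only an illustration of the idea, not the system's implementation: the function names, the mapping of subfield a to the author index and subfield t to the title index, and the lowercase-and-collapse normalization are all assumptions made for the sketch.

```python
import re

def parse_subfields(field_data):
    """Split '|aJames, Henry |tPortrait of a lady' into {'a': ..., 't': ...}."""
    subfields = {}
    for chunk in field_data.split("|")[1:]:
        if chunk:
            subfields[chunk[0]] = chunk[1:].strip()
    return subfields

def simple_normalize(text):
    """Lowercase and collapse whitespace (a stand-in for real normalization)."""
    return " ".join(text.lower().split())

def index_field(field_data):
    """Build hypothetical author, title, and keyword entries for one field."""
    sub = parse_subfields(field_data)
    indexes = {"author": [], "title": [], "keyword": set()}
    if "a" in sub:  # assumed rule: subfield a goes to the author index
        indexes["author"].append(simple_normalize(sub["a"]))
    if "t" in sub:  # assumed rule: subfield t goes to the title index
        indexes["title"].append(simple_normalize(sub["t"]))
    for value in sub.values():  # every word is indexed separately as a keyword
        indexes["keyword"].update(re.findall(r"\w+", value.lower()))
    return indexes

def keyword_match(indexes, query):
    """True if every word of the query is in the keyword index, in any order."""
    return all(word in indexes["keyword"] for word in query.lower().split())

entries = index_field("|aJames, Henry |tPortrait of a lady")
```

In this sketch, `entries["author"]` holds the whole author phrase as one entry, while `keyword_match(entries, "lady portrait")` succeeds regardless of word order, which mirrors the difference between phrase and keyword retrieval described above.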
This documentation covers the standard setup of indexes. However, a particular organization's indexing may vary from this standard. For more information on viewing the codes used in your system, see Administering Advanced Word Searching.
You can edit index labels through the Index Labels system table.
Indexes and Internationalization
Non-English special characters and character sets (such as Arabic, CJK, Cyrillic, and Hebrew) must be indexed before they can be searched on the system.
Additional indexes are available as a product. For more information, contact Sales.
Indexes
Sierra uses three types of indexes:
- Record number
- Phrase
- Advanced Word (keyword)
Record Number Index
The record number index is system-defined and contains the system-generated unique ID number of each record. Record numbers are assigned sequentially as records are created, and the final character of each number is a check digit. For example:
- .b1234567X
- .o11111119
- .p10305071
For more information, see Record Numbers.
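As a rough illustration of the layout shown in the examples above, the following Python sketch splits a record number into its parts. It assumes that the letter after the period is a record-type code and that the check character is a digit or the letter x; the function name is hypothetical, and the sketch does not attempt to verify the check digit.

```python
import re

# Assumed layout: optional period, one letter, the sequential digits,
# then a single check character (a digit or x).
RECORD_NUMBER = re.compile(r"^\.?([a-z])(\d+)([0-9xX])$")

def split_record_number(value):
    """Return the type letter, sequential number, and check character."""
    match = RECORD_NUMBER.match(value.strip())
    if match is None:
        raise ValueError(f"not a recognizable record number: {value!r}")
    record_type, number, check = match.groups()
    return {"type": record_type, "number": number, "check": check.lower()}
```

For example, `split_record_number(".b1234567X")` separates the type letter `b`, the number `1234567`, and the check character `x`.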
Phrase Index
A phrase index is created from a specified string of text in a variable-length field. A phrase index includes the entirety of the MARC field's specified subfields in a single index entry, in the order in which they appear.
- There is one set of phrase indexes for the entire system
- There is a maximum of 25 different phrase indexes
- Some indexes are shared between record types. For instance:
  - The author index is used by bibliographic and authority records
  - The barcode index is used by item and patron records
- Some indexes are unique to a record type. For instance:
  - The name index is used by patron records only
  - The instructor index is used by course records only
For more information, see More About Phrase Indexes.
Advanced Word Search Index
The Advanced Word Search (AWS) index takes a specified string of text from a variable-length field and indexes each word separately.
- The AWS index is primarily used by bibliographic records, but other record types are also included as defined in the system software
- Although there is one AWS index, it is made up of segments which can be searched separately. The segments are:
| Segment | Includes |
| --- | --- |
| Author | All entries in the author phrase index |
| Title | All entries in the title phrase index |
| Subject | All entries in the subject phrase index(es) |
| Note | Entries from MARC tags not included in the author, title, and subject phrase indexes (for instance, note data from the 5XX MARC tags) |
Numeric fields such as call number, ISBN, and other standard numbers are not included in the AWS index.
For more information, see More About the AWS index.
Index Rules
Index rules determine what data is indexed and how it is normalized for indexing.
About Index Rules
- Index rules for a record type contain one or more lines for each variable-length field to be indexed. The lines indicate the index(es) into which the variable-length field and its subfields are to be indexed.
- Variable-length fields that do not match an index rule are not indexed. The combination of field group tag + MARC tag + indicators + subfields must exactly match for the field to be indexed.
- To be indexed, a variable-length field must be stored in the field group tag specified in the rule and must have the MARC tag (and indicators, if listed) specified in the rule.
- Field group tags, MARC tags, MARC tag indicators and subfields are used in index rules.
- A single MARC tag can be assigned to more than one index.
- An index rule can specify subfields to include or exclude.
- Different subfields from the same MARC field can be used in different indexes.
- Each index can include many MARC fields and subfields.
- Non-MARC and MARC fields are indexed together.
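The matching behavior described in the list above can be sketched as follows. This is a minimal model under stated assumptions, not the actual index rule file format: the `IndexRule` and `Field` classes, the `apply_rules` function, and the sample tag values are all hypothetical, and only subfield inclusion (not exclusion) is modeled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class IndexRule:
    group_tag: str             # field group tag the field must be stored in
    marc_tag: str              # MARC tag the field must have
    indicators: Optional[str]  # None means the rule lists no indicators
    subfields: str             # subfield codes to include in the entry
    index: str                 # name of the target index

@dataclass(frozen=True)
class Field:
    group_tag: str
    marc_tag: str
    indicators: str
    subfields: tuple           # (code, value) pairs in field order

def apply_rules(field, rules):
    """Return {index name: [entries]} for every rule the field matches."""
    entries = {}
    for rule in rules:
        if rule.group_tag != field.group_tag or rule.marc_tag != field.marc_tag:
            continue  # field group tag and MARC tag must both match
        if rule.indicators is not None and rule.indicators != field.indicators:
            continue  # indicators are checked only when the rule lists them
        values = [v for code, v in field.subfields if code in rule.subfields]
        if values:    # selected subfields form one entry, in field order
            entries.setdefault(rule.index, []).append(" ".join(values))
    return entries

# Hypothetical rules: one MARC tag feeding two different indexes.
RULES = [
    IndexRule("b", "700", None, "a", "author"),
    IndexRule("b", "700", None, "t", "title"),
]
field = Field("b", "700", "1 ", (("a", "James, Henry"), ("t", "Portrait of a lady")))
result = apply_rules(field, RULES)
```

Here the single 700 field produces one author entry and one title entry, while a field that matches no rule (for example, one stored under a different field group tag) yields an empty result and is not indexed.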
Normalizing Data
Normalization is the process by which data is standardized to improve sorting and retrieval.
The standard normalization process does the following:
- Takes the first 150 characters of the string to be indexed
- Strips non-filing characters from titles as designated by MARC tag indicators
- Strips apostrophes and diacritics
- Converts ampersands to the word for "and" in the primary language of your system
- Retains the punctuation symbols + # $ % @ within the index
- Replaces subfield delimiters and many other punctuation marks with a space
- Collapses multiple spaces to a single space
- Indexes the first 125 characters of the normalized string
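The steps above can be sketched in Python, in the order listed. This is an approximation for illustration only: the function name and parameters are hypothetical, subfield delimiters are assumed to appear as `|` followed by a one-character code, every punctuation mark other than + # $ % @ is replaced (the list above says "many", not all), and case is left unchanged because case folding is not among the listed steps.

```python
import re
import unicodedata

def normalize_phrase(text, nonfiling=0, and_word="and"):
    """Approximate the standard normalization steps for a phrase entry."""
    s = text[:150]                     # take the first 150 characters
    s = s[nonfiling:]                  # drop non-filing characters (count from the indicator)
    s = s.replace("'", "").replace("\u2019", "")  # strip apostrophes
    # Strip diacritics: decompose, then drop the combining marks.
    s = "".join(c for c in unicodedata.normalize("NFD", s)
                if not unicodedata.combining(c))
    s = s.replace("&", f" {and_word} ")   # ampersand becomes the word for "and"
    s = re.sub(r"\|\w", " ", s)           # assumed delimiter form: "|" plus a code
    s = re.sub(r"[^\w\s+#$%@]|_", " ", s) # other punctuation becomes a space; + # $ % @ kept
    s = " ".join(s.split())               # collapse multiple spaces
    return s[:125]                        # index the first 125 characters
```

For example, `normalize_phrase("The Portrait of a Lady & other stories", nonfiling=4)` drops the leading article and spells out the ampersand. The `and_word` parameter stands in for the system's primary language setting.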
The system also uses specialized normalization processes:
- For barcode and standard number indexes, all spaces and punctuation marks are stripped
- Dewey, LC, NLM, SuDoc, and UDC classification schemes have unique normalization rules
- In the AWS index, each word is indexed separately
- Also in the AWS index, most words that include diacritics are indexed with and without the diacritics
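Two of the specialized processes above can be sketched as follows. The function names are hypothetical, lowercasing in the word index is an assumption, and the sketch indexes every word both ways, whereas the list above says only that most words with diacritics are treated this way.

```python
import re
import unicodedata

def normalize_barcode(value):
    """Strip all spaces and punctuation marks, keeping letters and digits."""
    return "".join(c for c in value if c.isalnum())

def strip_diacritics(word):
    """Decompose the word and drop its combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def aws_words(text):
    """Return each word as a separate entry, with and without diacritics."""
    words = set()
    # Compose first so that accented letters stay inside their words.
    composed = unicodedata.normalize("NFC", text).lower()
    for word in re.findall(r"\w+", composed):
        words.add(word)
        words.add(strip_diacritics(word))
    return words
```

With this sketch, a standard number such as "0-14-043223-X" is reduced to "014043223X", and a word containing a diacritic can be found whether or not the searcher types the diacritic.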
Viewing Index Rules
You can view the system indexing rules in the Admin App by clicking AWS (for AWS index rules) or System Files | Indexrules (for phrase index rules).
See also:
- Administering Advanced Word Searching (to view AWS index rules)
- Editing Index Labels
- Call Number Normalization
- Indexrules in System Files (to view phrase index rules)