Call Number Normalization

Before indexing, the data in the call number field is normalized; this helps the computer index and sort the data correctly. The call number field may be indexed using one of the following normalization schemes:

Character-by-Character
LC
Dewey
SUDOCS

The normalization scheme used to index your call numbers depends on which index you set up as part of your implementation. Each normalization scheme implies a separate index. If your library uses more than one normalization scheme, you may have purchased separate indexes for each normalization scheme used. In that case, the system determines which normalization scheme to use for each call number based on the MARC tag of the call number field. (If you have questions about your library's call number index or normalization scheme, contact the Help Desk.)

This section describes the special normalization rules that the system applies to indexing for LC, Dewey, and SUDOCS call numbers. It also describes character-by-character normalization, which may be applied to call numbers.

Character-by-Character Normalization

With this method, indexing starts at the beginning of the field or designated subfield. Each character is indexed according to the following set of rules:

  1. Both upper- and lowercase letters are treated as lowercase.
  2. In certain fields, groups of numbers that are eight digits or fewer at the beginning of a field, or three digits or fewer elsewhere, are right-justified, so that they file in numeric order. (If this were not done, "12" would file before "2" in the index.)
  3. Most punctuation is treated as a space so that it can be ignored. However, the '#' and '+' are indexed, and the '&' symbol (ampersand) is indexed as "and".
  4. Leading articles are ignored (based on MARC tag indicator).
  5. Multiple spaces are treated as a single space.
  6. Diacritics are ignored (e.g., accent marks).
  7. Non-displayable special characters (e.g., digraph Œ) are translated into equivalent letters (e.g., OE).

For example, the heading:

440 0 Series in psychosocial epidemiology ; |vv.7

is indexed as:

series in psychosocial epidemiology vbbb7

where b is a blank.
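The rules above can be sketched in Python. This is a minimal illustration, not the system's implementation. The function name `normalize_chars`, the `LIGATURES` table, the padding widths (eight positions at the start of the field, three elsewhere), and the treatment of a subfield delimiter as `|` plus a one-character code are all assumptions made for the example. Rule 4 (leading articles) is omitted because it depends on the MARC indicator.

```python
import re
import unicodedata

# Illustrative subset only; the real translation table is not documented here.
LIGATURES = {"œ": "oe", "æ": "ae"}


def normalize_chars(text: str) -> str:
    """Hypothetical sketch of character-by-character normalization."""
    # Assumption: a subfield delimiter is "|" plus a one-character code.
    text = re.sub(r"\|[A-Za-z0-9]", " ", text)
    text = text.lower()                                  # rule 1
    for ligature, letters in LIGATURES.items():          # rule 7
        text = text.replace(ligature, letters)
    # Rule 6: decompose accented letters and drop the combining marks.
    text = "".join(
        ch for ch in unicodedata.normalize("NFKD", text)
        if not unicodedata.combining(ch)
    )
    text = text.replace("&", " and ")                    # rule 3: '&' -> "and"
    text = re.sub(r"[^a-z0-9#+ ]", " ", text)            # rule 3: keep '#', '+'
    text = " ".join(text.split())                        # rule 5

    def pad(match: re.Match) -> str:
        # Rule 2, with assumed widths: 8 at the start of the field, 3 elsewhere.
        width = 8 if match.start() == 0 else 3
        return match.group(0).rjust(width)

    return re.sub(r"\d+", pad, text)


print(normalize_chars("Series in psychosocial epidemiology ; |vv.7"))
# series in psychosocial epidemiology v   7
```

With the padding in place, a heading that begins with "2" files before one that begins with "12", because the shorter number is preceded by more blanks.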

LC Call Number Normalization (with prestamps removed)

This rule is designed to find and handle the LC call number in the incoming data field. LC call numbers are defined as 1-3 alphabetic characters (the LC class letters) followed by 1-4 numeric characters (the LC class numbers). Data in the field is skipped until the LC class letters and numbers are encountered, thereby causing "prestamps" to be ignored.

Once the beginning of the LC call number has been identified, the call number is normalized as follows: the class letters are left-justified in positions 1-3, and the class number is right-justified in positions 4-7.

For example:

                     1234567
PS345 normalizes to  ps  345

If a decimal value follows the class number, the decimal point and the number are appended immediately after the class number.

If there is any additional data consisting of letters followed without space by numbers, a space is inserted after the class number or decimal portion, if any, and then the additional data is inserted. When it is inserted, a blank space is added between each number portion and the next letter. For example:

                                       1
                              12345678901234567
QA1765.9B32D45 normalizes to  qa 1765.9 b32 d45

Any remaining characters in the call number are then normalized and appended. The following rules are applied to these remaining characters: punctuation is replaced with a single space; multiple spaces are collapsed to a single space; all letters are translated to lowercase; and numbers preceded by "v." or "no." occupy five positions and are right-justified, ending in the fifth position. For example:

INCOMING DATA        NORMALIZED OUTPUT DATA
                              1         2
                     12345678901234567890
H1 B45 B3 1987       h     1 b45 b3 1987
M452F237Q3           m   452 f237 q3
PS1323 H45 v.1       ps 1323 h45 v    1
PS1323.5 H45 v.1     ps 1323.5 h45 v    1
REF H1 B45           h     1 b45
Fiction shelf        fiction shelf         (Not an LC call number)
Thesis Econ 1994     thesis econ 1994      (Not an LC call number)
Thesis Econ 1994 W4  w     4               (LC call number found!)
Thesis EE 1994 W4    ee 1994 w4            (LC call number found!)

NOTE

If no elements of the call number fit the LC pattern, the entire field is indexed character-by-character.
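The LC rules in this section can be approximated with a short Python sketch. It is an illustration under stated assumptions, not the system's own code. The names `normalize_lc`, `LC_START`, `CUTTER`, and `TOKEN` are hypothetical. The pattern assumes that the class letters must begin a word and that a space may separate them from the class number, as the "Thesis EE 1994 W4" example suggests. The fallback branch is a simplified stand-in for character-by-character indexing.

```python
import re

# Assumed reading of the rule: 1-3 class letters that start a word, an
# optional space, then 1-4 digits that are not followed by another digit.
LC_START = re.compile(r"(?<![A-Za-z])([A-Za-z]{1,3})\s*(\d{1,4})(?!\d)")
# Additional data: letters followed without space by numbers (e.g. "B32").
CUTTER = re.compile(r"\s*\.?([A-Za-z]+\d+)")
# Remaining data: "v."/"no." numbers, or plain runs of letters and digits.
TOKEN = re.compile(r"(v|no)\.\s*(\d+)|[a-z0-9]+")


def normalize_lc(field: str) -> str:
    """Hypothetical sketch of LC normalization with prestamps skipped."""
    start = LC_START.search(field)
    if start is None:
        # No LC pattern: simplified character-by-character fallback.
        return " ".join(re.sub(r"[^a-z0-9#+]", " ", field.lower()).split())

    # Class letters in positions 1-3, class number right-justified in 4-7.
    out = start.group(1).lower().ljust(3) + start.group(2).rjust(4)
    rest = field[start.end():]

    decimal = re.match(r"\.\d+", rest)
    if decimal:
        out += decimal.group(0)
        rest = rest[decimal.end():]

    cutter = CUTTER.match(rest)
    while cutter:
        out += " " + cutter.group(1).lower()
        rest = rest[cutter.end():]
        cutter = CUTTER.match(rest)

    # Punctuation acts as a separator; "v."/"no." numbers fill 5 positions.
    tokens = [
        m.group(1) + m.group(2).rjust(5) if m.group(1) else m.group(0)
        for m in TOKEN.finditer(rest.lower())
    ]
    return " ".join([out] + tokens)


print(normalize_lc("QA1765.9B32D45"))    # qa 1765.9 b32 d45
print(normalize_lc("PS1323 H45 v.1"))    # ps 1323 h45 v    1
print(normalize_lc("REF H1 B45"))        # h     1 b45
```

Searching for the pattern anywhere in the field, rather than only at the start, is what causes a prestamp such as "REF" to be skipped.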

Dewey Call Number Normalization

Dewey normalization indexes the call number character-by-character. It retains all punctuation except the subfield delimiter, which it replaces with a single space, and it collapses multiple spaces to one space. Also, it changes letters to lowercase.

Examples of Dewey normalization:

INCOMING DATA   NORMALIZED OUTPUT DATA
                         1
                123456789012
271.15 h22j     271.15 h22j
271.5 h22j      271.5 h22j
271.52 h22j     271.52 h22j
271.6 h22j      271.6 h22j
372 Av 27       372 av 27
372 Av27        372 av27
971.81 .p86h    971.81 .p86h
971.81 a42      971.81 a42

NOTE

The final two examples above show the filing problems that can occur when a library uses periods before cutters inconsistently.
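Dewey normalization as described here is simple enough to sketch in a few lines of Python. The function name `normalize_dewey` is hypothetical, and the sketch assumes that a subfield delimiter is written as `|` followed by a one-character code.

```python
import re


def normalize_dewey(field: str) -> str:
    """Hypothetical sketch: keep punctuation, lowercase, collapse spaces."""
    # Assumption: a subfield delimiter is "|" plus a one-character code.
    text = re.sub(r"\|[A-Za-z0-9]", " ", field)
    # Collapse runs of spaces (and trim the ends), then lowercase.
    return " ".join(text.split()).lower()


print(normalize_dewey("372 Av 27"))      # 372 av 27
print(normalize_dewey("971.81 .p86h"))   # 971.81 .p86h
```

Because the period is retained, a plain string comparison places "971.81 .p86h" before "971.81 a42", which matches the order shown in the table.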

SUDOCS Call Number Normalization

All numbers in government document call numbers are normalized to five digits, right-justified. All punctuation is retained in the normalization. Multiple spaces are condensed to a single space. Letters are translated to lowercase. Subfield delimiters and letters other than the first are replaced with one space.

SUDOC normalization - example 1:

INCOMING DATA         NORMALIZED OUTPUT DATA
                               1         2
                      12345678901234567890123456
|aGP 1.26:T 98/963    gp    1.   26:t   98/  963

Notes on Example 1 call number:
first element   = gp<space>
second element   = 1. (number preceded by 4 spaces)
third element   = 26: (number preceded by 3 spaces)
fourth element   = t<space>
fifth element   = 98/ (number preceded by 3 spaces)
sixth element   = 963 (number preceded by 2 spaces)
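The five-position padding in example 1 can be reproduced with the following Python sketch. It is a rough approximation, not the system's implementation. The function name `normalize_sudoc` is hypothetical, the subfield delimiter is assumed to be `|` plus a one-character code, and a single space directly before a number is assumed to be absorbed into that number's padding, which is what the 26-position output for example 1 suggests. The real system may treat such spaces differently.

```python
import re


def normalize_sudoc(field: str) -> str:
    """Hypothetical sketch of SUDOCS normalization."""
    # Assumption: a subfield delimiter is "|" plus a one-character code.
    text = re.sub(r"\|[A-Za-z0-9]", " ", field)
    # Condense multiple spaces and translate letters to lowercase.
    text = " ".join(text.split()).lower()
    # Right-justify every number in five positions. Assumption: a space
    # directly before the number becomes part of the padding.
    return re.sub(r" ?(\d+)", lambda m: m.group(1).rjust(5), text)


print(normalize_sudoc("|aGP 1.26:T 98/963"))
# gp    1.   26:t   98/  963
```

All punctuation (".", ":", "/") passes through unchanged, so only the numbers change width.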

SUDOC normalization - example 2:

Notes on Example 2 call number:
first element   = c<space>
second element   = 3. (number preceded by 4 spaces)
third element   = 2:c<space> (number preceded by 4 spaces)
fourth element   = 33/ (number preceded by 3 spaces)
fifth element   = 12/ (number preceded by 3 spaces)
sixth element   = 9189- (number preceded by 1 space)
seventh element   = 39 (number preceded by 3 spaces)

Call Number Normalization Examples

The examples below show the result of applying each rule to the following call number:

Call number = REF GP 1.26:T 98/963