Comparing Imprint Data in Incoming and Existing Bibliographic Records
Innovative can configure your INN-Reach Catalog to compare the MARC 260 (Imprint) fields in incoming and master bibliographic records when matching on Secondary Match Fields. To change whether or not the system compares MARC 260 data when matching, your Central System Administrator must contact Innovative.
Your INN-Reach System can be configured to use either a "strict" or a "lenient" approach when comparing the MARC 260$a and 260$b subfields and when normalizing MARC 260$c data.
If your INN-Reach Catalog is configured to compare MARC 260 data, INN-Reach does the following:
- Determines whether the potential matches contain MARC 260 fields:
- If neither of the records contains a MARC 260 field, or if one record contains a MARC 260 field, but the other does not, the system stops the imprint evaluation. If your INN-Reach System has been configured for additional evaluations (for example, title comparison), the system continues to the next evaluation. If no additional evaluations have been configured for your INN-Reach System, the system identifies the records as a match.
- If both records have MARC 260 fields, the system continues to the next step.
- Compares the MARC 260$c (Date of publication) subfields.
- Compares the MARC 260$a (Place of publication) subfields.
- Compares the MARC 260$b (Name of publisher) subfields.
For a description of the possible outcomes of this comparison, see Possible Outcomes of a MARC 260 Comparison.
Comparing the MARC 260$c (Date of Publication) Subfields
To compare the MARC 260$c subfields, the system:
- Determines whether it must evaluate the 260$c subfields, based on the Bibliographic Level (Leader byte 7) of the records:
Bibliographic Level | Leader byte 7 | System Action |
---|---|---|
serial | 's' | The system skips to the 260$a subfield comparison. The system does not evaluate the 260$c subfields for serials records. |
other than serial | not 's' | The system continues to the next step in this comparison. |
- Determines whether the 260$c subfield is present in both the incoming record and the potential match:
- If the subfield is present in both records, the system continues to the next step in this comparison.
- If the subfield is absent from one or both of the records, the system skips to the 260$a subfield comparison. The absence of this subfield is treated as equivalent to the presence of a subfield containing matching data.
- Normalizes the 260$c data in the first instance of the MARC 260$c subfield in each record. The system can be configured to take either a strict or lenient approach to normalizing this data. For more information, see Normalizing MARC 260$c Subfields below.
This normalization process can result in an empty string. If the normalization process results in a non-empty string, it is considered "usable" for the purposes of comparison.
Based on the normalization results, the system continues as appropriate:
- If data from both records normalizes to usable strings, the system continues to the next step in this comparison.
- If data from one or both records does not normalize to a usable string, the system skips to the 260$a subfield comparison. The absence of a usable string is treated as equivalent to the presence of a usable string containing matching data.
- Compares the 260$c data strings:
- If the strings match, the system continues to the 260$a subfield comparison.
- If the strings do not match, the imprint data does not match.
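The 260$c decision sequence above can be sketched in Python as follows. This is a minimal illustration, not Innovative's implementation: the function name `date_rules_out_match`, its parameters, and the `normalize` callable are hypothetical, and the sketch assumes that the 260$c evaluation is skipped when either record is a serial.

```python
from typing import Callable, Optional


def date_rules_out_match(
    incoming_level: str,
    existing_level: str,
    incoming_c: Optional[str],
    existing_c: Optional[str],
    normalize: Callable[[str], str],
) -> bool:
    """Return True only when the 260$c data shows the imprints do not match.

    A False result means "continue to the 260$a comparison".
    None stands for an absent 260$c subfield (hypothetical convention).
    """
    # Serials (Leader byte 7 == 's') skip the date comparison.
    # Assumption: skipping applies if either record is a serial.
    if incoming_level == "s" or existing_level == "s":
        return False
    # An absent subfield is treated as equivalent to matching data.
    if incoming_c is None or existing_c is None:
        return False
    left, right = normalize(incoming_c), normalize(existing_c)
    # An empty (unusable) normalized string is also treated as matching.
    if not left or not right:
        return False
    return left != right


# Stand-in normalizer for demonstration only; the real rules are
# described under "Normalizing MARC 260$c Subfields".
def demo_normalize(value: str) -> str:
    return value.strip("c.")


print(date_rules_out_match("m", "m", "c1990", "1990.", demo_normalize))
print(date_rules_out_match("m", "m", "1990", "1991", demo_normalize))
```

Only the final string comparison can produce a non-match; every earlier exit hands control to the 260$a comparison.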
Comparing the MARC 260$a (Place of Publication) Subfields
To compare the MARC 260$a subfields, the system:
- Determines whether the 260$a subfield is present in both the incoming record and the potential match record:
- If the subfield is present in both records, the system continues to the next step in this comparison.
- If the subfield is absent from one or both of the records, the absence of the subfield is treated as equivalent to the presence of a subfield containing matching data. The system continues as appropriate based on which comparison type your INN-Reach System uses:
- STRICT: The system continues to the 260$b subfield comparison. This comparison type requires that both the 260$a and 260$b subfields match. Although this case is treated as equivalent to a match on 260$a, the system must evaluate the 260$b subfields.
- LENIENT: The imprint data matches. This comparison type allows for a match on 260$a only. Because this case is treated as equivalent to a match on 260$a, the system does not need to evaluate the 260$b subfields.
- Normalizes the data in the first instance of the 260$a subfield in each record. For more information on the normalizing process, see "Normalizing MARC 260$a and $b Subfields" below.
This normalization process can result in an empty string. If the normalization process results in a non-empty string, it is considered "usable" for the purposes of comparison.
Based on the normalization results, the system continues as appropriate:
- If the data from both records normalizes to usable strings, the system continues to the next step in this comparison.
- If the data from one or both records does not normalize to a usable string, the absence of a usable string is treated as equivalent to the presence of a usable string containing matching data. The system continues as appropriate based on which comparison type your INN-Reach System uses:
- STRICT: The system continues to the 260$b subfield comparison. This comparison type requires that both the 260$a and 260$b subfields match. Although this case is treated as equivalent to a match on 260$a, the system must evaluate the 260$b subfields.
- LENIENT: The imprint data matches. This comparison type allows for a match on 260$a only. Because this case is treated as equivalent to a match on 260$a, the system does not need to evaluate the 260$b subfields.
- Compares the 260$a strings:
Comparison Result | STRICT Comparison Type | LENIENT Comparison Type |
---|---|---|
Strings match | The system continues to the 260$b subfield comparison. This comparison type requires that both the 260$a and 260$b subfields match, and this test has established only that the 260$a subfields match. | The imprint data matches. This comparison type allows either the 260$a subfields or the 260$b subfields to match. Because this test has established that the 260$a fields match, the system does not need to evaluate the 260$b subfields. |
Strings do not match | The imprint data does not match. This comparison type requires that the 260$a subfields match, and this test has established that they do not. | The system continues to the 260$b subfield comparison. This comparison type allows a match on either the 260$a subfields or the 260$b subfields. Even though this test has established that the 260$a subfields do not match, the system continues to evaluate the imprint data in the records. |
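The 260$a outcomes described in this section can be summarized in a short Python sketch. The names `PlaceResult` and `compare_place` are hypothetical, and the function assumes it receives already-normalized strings, with `None` standing for an absent subfield and an empty string for data that did not normalize to a usable string.

```python
from enum import Enum
from typing import Optional


class PlaceResult(Enum):
    IMPRINT_MATCHES = "imprint data matches"
    IMPRINT_DOES_NOT_MATCH = "imprint data does not match"
    COMPARE_PUBLISHER = "continue to the 260$b comparison"


def compare_place(
    incoming_a: Optional[str], existing_a: Optional[str], strict: bool
) -> PlaceResult:
    """Decide the next step after the 260$a comparison."""
    # Absent or unusable data on either side is treated as equivalent
    # to a subfield containing matching data.
    treated_as_match = (
        not incoming_a or not existing_a or incoming_a == existing_a
    )
    if strict:
        # STRICT needs both 260$a and 260$b, so a match only moves on.
        if treated_as_match:
            return PlaceResult.COMPARE_PUBLISHER
        return PlaceResult.IMPRINT_DOES_NOT_MATCH
    # LENIENT accepts a match on 260$a alone; a mismatch defers to 260$b.
    if treated_as_match:
        return PlaceResult.IMPRINT_MATCHES
    return PlaceResult.COMPARE_PUBLISHER


for strict in (True, False):
    print(strict, compare_place("mapl", "newy", strict).value)
```

Note that the two comparison types are mirror images at this step: the branch that ends the evaluation under STRICT is the one that continues under LENIENT, and vice versa.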
Comparing the MARC 260$b (Name of Publisher) Subfields
To compare the MARC 260$b subfields, the system:
- Determines whether the 260$b subfield is present in both the incoming record and the potential match:
- If the subfield is present in both records, the system continues to the next step in this comparison.
- If the subfield is absent from one or both records, the imprint data matches. The absence of this subfield is treated as equivalent to the presence of a subfield containing matching data.
- Normalizes the data in the first instance of the 260$b subfield in each record. For more information, see "Normalizing MARC 260$a and $b Subfields" below.
This normalization process can result in an empty string. If the normalization process results in a non-empty string, it is considered "usable" for the purposes of comparison.
Based on the normalization results, the system continues as appropriate:
- If the data from both records normalizes to usable strings, the system continues to the next step in this evaluation.
- If the data from one or both records does not normalize to a usable string, the imprint data matches. The absence of a usable string is treated as equivalent to the presence of a usable string containing matching data.
- Compares the 260$b strings:
Comparison Result | STRICT Comparison Type | LENIENT Comparison Type |
---|---|---|
Strings match | The imprint data matches. This comparison type requires that both the 260$a and 260$b subfields match, and tests have established that the requirements are met. | The imprint data matches. This comparison type allows either the 260$a subfields or the 260$b subfields to match. This test has established that the 260$b subfields match. |
Strings do not match | The imprint data does not match. This comparison type requires that the 260$b subfields match, and this test has established that they do not. | The imprint data does not match. This comparison type allows a match on either the 260$a subfields or the 260$b subfields. Tests have established that neither requirement is met. |
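Because the 260$b comparison is the last imprint test, its result is the same under both comparison types once the system reaches it. A minimal Python sketch follows; the function name `publisher_matches` is hypothetical, and it assumes already-normalized input with `None` for an absent subfield and an empty string for unusable data.

```python
from typing import Optional


def publisher_matches(
    incoming_b: Optional[str], existing_b: Optional[str]
) -> bool:
    """Return True if the imprint data matches after the 260$b test."""
    # Absent or unusable data is treated as equivalent to matching data.
    if not incoming_b or not existing_b:
        return True
    return incoming_b == existing_b


print(publisher_matches("harp", "harp"))
print(publisher_matches("harp", "rand"))
print(publisher_matches(None, "rand"))
```

The comparison type does not appear as a parameter here: it only determines whether the system reaches this step at all.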
Possible Outcomes of a MARC 260 Comparison
There are two possible outcomes of the MARC 260 comparison:
- Imprint Data Matches
- If the imprint data matches, the system stops the imprint evaluation and proceeds as appropriate:
- If your INN-Reach System has been configured for additional evaluations (for example, title comparison), the system continues to the next evaluation of the potential matches.
- If no additional evaluations have been configured for your INN-Reach System, the system identifies the records as a match.
- Imprint Data Does Not Match
- If the imprint data does not match, the system stops any further evaluation of the potential matches that your INN-Reach System might be configured to perform. It identifies the records as not a match.
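The way the imprint result feeds into the overall match decision can be sketched as a short-circuiting chain. This is an illustration only: the function name `records_match` and the idea of modeling each configured evaluation as a callable are assumptions, and the sketch assumes that other evaluations (such as title comparison) stop the chain in the same way the imprint evaluation does.

```python
from typing import Callable, Iterable


def records_match(evaluations: Iterable[Callable[[], bool]]) -> bool:
    """Run configured evaluations in order; stop at the first non-match."""
    for evaluation in evaluations:
        if not evaluation():
            # A failed evaluation ends all further evaluation.
            return False
    # Every configured evaluation passed (or none were configured).
    return True


# Hypothetical evaluations for demonstration.
imprint_ok = lambda: True
title_ok = lambda: False
print(records_match([imprint_ok]))
print(records_match([imprint_ok, title_ok]))
```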
Choosing a MARC 260 Comparison Type
The MARC 260 comparison type defines what attributes the 260 data must have for the records to be identified as matching. The STRICT and LENIENT comparison types are defined below.
Note that both comparison types use only the first instance of the MARC 260 field from the potential matches when performing the comparison.
STRICT Comparison Type
If your INN-Reach System is configured to use this comparison type, the records match if one of the following is true:
- The records contain non-parallel sets of MARC 260$a, $b, and $c subfields. For example, if Record 1 contains only a 260$a subfield, but Record 2 contains only a 260$b subfield, the absence of a subfield in one record is treated as equivalent to the presence of a subfield containing matching data.
- The records contain parallel sets of MARC 260$a, $b, or $c subfields (for example, both records have 260$a subfields), and all of the following conditions are true:
- Subfield 260$a exists in both and matches AND
- Subfield 260$b exists in both and matches AND
- Subfield 260$c exists in both and matches
LENIENT Comparison Type
If your INN-Reach System is configured to use this comparison type, the records match if one of the following is true:
- The records contain non-parallel sets of MARC 260$a, $b, and $c subfields. For example, if Record 1 contains only a 260$a subfield, but Record 2 contains only a 260$b subfield, the absence of a subfield in one record is treated as equivalent to the presence of a subfield containing matching data.
- The records contain parallel sets of MARC 260$a, $b, or $c subfields, and both of the following conditions are true:
- Subfield 260$a exists in both and matches OR subfield 260$b exists in both and matches AND
- Subfield 260$c exists in both and matches
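The difference between the two comparison types reduces to an AND versus an OR over the 260$a and 260$b results, with 260$c acting as a gate in both cases. The Python sketch below illustrates this under stated assumptions: the function names and the dictionary layout (keys "a", "b", "c" holding already-normalized strings, with a missing key or empty string meaning absent or unusable data) are hypothetical, and serial handling is reduced to a single flag.

```python
from typing import Mapping, Optional


def _ok(left: Optional[str], right: Optional[str]) -> bool:
    # Absent or unusable data on either side counts as matching data.
    return not left or not right or left == right


def imprint_matches(
    incoming: Mapping[str, str],
    existing: Mapping[str, str],
    strict: bool,
    serial: bool = False,
) -> bool:
    """Combine the 260$c, 260$a, and 260$b tests into one decision."""
    # 260$c must not conflict, unless the records are serials.
    if not serial and not _ok(incoming.get("c"), existing.get("c")):
        return False
    a_ok = _ok(incoming.get("a"), existing.get("a"))
    b_ok = _ok(incoming.get("b"), existing.get("b"))
    # STRICT: both 260$a and 260$b; LENIENT: either one is enough.
    return (a_ok and b_ok) if strict else (a_ok or b_ok)


rec1 = {"a": "mapl", "b": "harp", "c": "1990"}
rec2 = {"a": "newy", "b": "harp", "c": "1990"}
print(imprint_matches(rec1, rec2, strict=True))
print(imprint_matches(rec1, rec2, strict=False))
```

In this example the places differ but the publishers agree, so the two comparison types give different answers for the same pair of records.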
Normalizing MARC 260$a and $b Subfields
To normalize data from MARC 260$a and $b subfields, the system:
- Makes all characters in the string lower case.
- Strips punctuation.
- Strips leading English articles (for example, "a", "an", "the").
- Elides spaces.
- Strips data within square brackets ([ ]). If multiple subfields $a and $b are contained within a single set of brackets, the system strips data as if there were brackets around the individual subfields.
- Extracts and concatenates the first continuous sequence of four nonspace characters.
- Strips the entire data string if it normalizes to "sl" or "sn" only.
For example:
Example | Original Subfield Data | Normalized Subfield Data |
---|---|---|
1 | p260\|aMaplewood, N.J. | mapl |
2 | p260\|a[Maplewood, N.J.] New York | newy |
3 | p260\|a[Maplewood, N.J.] | (empty) |
4 | p260\|a[Maplewood, N.J.\|aNew York, N.Y.\|bHarper Collins] | (empty) |
5 | p260\|asn | (empty) |
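The normalization steps above can be approximated in Python as shown below. This is a sketch under assumptions, not the system's actual code: the function name `normalize_place_or_publisher` is hypothetical, the input is the subfield data only (without the `p260|a` prefix), bracketed data is removed before punctuation so that the brackets are still visible, and an unbalanced bracket is treated as extending to the start or end of the subfield.

```python
import re

ARTICLES = ("a", "an", "the")


def normalize_place_or_publisher(data: str) -> str:
    """Reduce 260$a or 260$b data to a comparison key of up to 4 characters."""
    text = data.lower()
    # Strip bracketed data: complete pairs first, then (assumption) an
    # unclosed "[" through the end and an unopened "]" from the start.
    text = re.sub(r"\[[^\[\]]*\]", " ", text)
    text = re.sub(r"\[.*$", " ", text)
    text = re.sub(r"^.*\]", " ", text)
    # Strip punctuation, keeping letters, digits, and whitespace.
    text = re.sub(r"[^\w\s]|_", "", text)
    words = text.split()
    # Strip one leading English article.
    if words and words[0] in ARTICLES:
        words = words[1:]
    # Elide spaces, then keep the first four characters.
    key = "".join(words)[:4]
    # "sl" and "sn" alone are discarded.
    return "" if key in ("sl", "sn") else key


for sample in ("Maplewood, N.J.", "[Maplewood, N.J.] New York", "sn"):
    print(repr(sample), "->", repr(normalize_place_or_publisher(sample)))
```

An empty result corresponds to the "(empty)" entries in the table: the data is not usable for comparison.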
Normalizing MARC 260$c Subfields
To normalize data from MARC 260$c subfields, the system:
- Makes all characters in the string lower case.
- Strips punctuation.
- Strips leading English articles (for example, "a", "an", "the").
- Elides spaces.
- Strips any instance of a 'c' that is followed by a digit. For example, "p260 |cc1990" is normalized as "p260 |c1990".
- Strips or retains data within square brackets ([ ]) based on which normalization type your INN-Reach System is configured to use:
- STRICT: The system retains data in square brackets. For example:
Example | Original Subfield Data | Normalized Subfield Data |
---|---|---|
1 | p260 \|c[1964, c1960] | p260 \|c1964 |
2 | p260 \|c[1964], c1960 | p260 \|c1964 |
- LENIENT: The system strips data in square brackets. For example:
Example | Original Subfield Data | Normalized Subfield Data |
---|---|---|
1 | p260 \|c[1964, c1960] | p260 \|c (empty) |
2 | p260 \|c[1964], c1960 | p260 \|c1960 |
- Extracts the first contiguous sequence of numeric characters in the year format.
To be in the year format, the numeric sequence must contain four characters and must begin with a '1' or a '2'. If it begins with '1', the subsequent number in the sequence must be '6', '7', '8', or '9'. If it begins with a '2', the subsequent number in the sequence must be '0'.
If the system cannot identify a sequence in the year format, the string normalizes as an empty string.
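The 260$c normalization and the year-format rule can be sketched in Python as follows. The function name `normalize_date` and the `strict` flag are hypothetical, the input is the subfield data only (without the `p260 |c` prefix), and the sketch makes two assumptions: under LENIENT, bracketed data is removed before punctuation is stripped, and the year is taken as the first four-character window that satisfies the year format, since eliding spaces can join adjacent dates into one longer digit run.

```python
import re

# Four digits starting with 16-19 or 20, per the year format rule.
YEAR_PATTERN = re.compile(r"(?:1[6-9]|20)\d{2}")
ARTICLES = ("a", "an", "the")


def normalize_date(data: str, strict: bool = True) -> str:
    """Reduce 260$c data to a four-digit year, or "" if none is found."""
    text = data.lower()
    if not strict:
        # LENIENT: strip bracketed data (balanced pairs only in this sketch).
        text = re.sub(r"\[[^\[\]]*\]", " ", text)
    # Strip punctuation (this also drops any remaining brackets).
    text = re.sub(r"[^\w\s]|_", "", text)
    words = text.split()
    if words and words[0] in ARTICLES:
        words = words[1:]
    text = "".join(words)
    # Strip any 'c' that is immediately followed by a digit.
    text = re.sub(r"c(?=\d)", "", text)
    match = YEAR_PATTERN.search(text)
    return match.group(0) if match else ""


for sample in ("[1964, c1960]", "[1964], c1960"):
    print(sample, normalize_date(sample, strict=True),
          repr(normalize_date(sample, strict=False)))
```

Under these assumptions the sketch reproduces the STRICT and LENIENT examples in this section: bracketed dates survive only when the system retains data in square brackets.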