Change Bibliographic or Authority Deduplication Tables

The Bibliographic Deduplication and the Authority Deduplication database tables contain the rules that determine if a bibliographic or authority record is a duplicate of an existing record. The rules are applied in groups to more efficiently determine a duplicate. Two records must meet all the rules in at least one group to be potential duplicates.

When records are checked for duplicates, the rule group name displays in the duplicate detection dialog as the duplicate reason. You can rename the group so that the reason for the duplicate is clear. For example, if the rule group name is 245 $a matches 245 $a, you could rename it Matching Title.
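The group logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Polaris implements it: the record dictionaries, the keys such as "245a", and the helpers same_value and duplicate_reason are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical record shape: a flat dict of already-extracted values.
Record = Dict[str, str]
Rule = Callable[[Record, Record], bool]


def same_value(key: str) -> Rule:
    """Build a rule that is met when both records carry the same value for key."""
    def rule(incoming: Record, existing: Record) -> bool:
        value = incoming.get(key)
        # A missing value never counts as a match.
        return value is not None and value == existing.get(key)
    return rule


def duplicate_reason(incoming: Record, existing: Record,
                     groups: Dict[str, List[Rule]]) -> Optional[str]:
    """Return the name of the first group whose rules are ALL met, else None."""
    for group_name, rules in groups.items():
        if rules and all(rule(incoming, existing) for rule in rules):
            return group_name  # the group name doubles as the duplicate reason
    return None


# Hypothetical rule groups; the names are what an operator would see as the reason.
GROUPS: Dict[str, List[Rule]] = {
    "Matching Title": [same_value("245a"), same_value("ldr06")],
    "Matching ISBN": [same_value("isbn")],
}
```

In this sketch, two records that share a title but differ in record type fail the first group, yet are still reported as potential duplicates if the single rule in the second group is met.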

Note:
These permissions are required to view and change the rule groupings in the deduplication tables: Access administration: Allow; Access tables: Allow; Modify bibliographic deduplication table: Allow; Modify authority deduplication table: Allow

For details on the rules and how they are applied, see Bibliographic Duplicate Detection Rules Reference or Authority Duplicate Detection Rules Reference.

To create a new rule group or modify an existing one

  1. In the Administration Explorer tree view, expand the organization’s folder.
  2. Expand the Database Tables folder.
  3. Select the Bibliographic Deduplication or Authority Deduplication table to display the appropriate table in the details view.

    Note:
    When you first implement Polaris, a default set of rule groups appears. You can add rules to or remove rules from a group, change the rule group name, add new rule groups, or delete a group.

  4. Do one of the following:

    • To create a new rule group, select the new rule group icon (adrlegrpbtn.gif).
    • To modify an existing group, select the group, and then select the Modify icon (btnModify.gif).

    The Create or Modify Deduplication Rules dialog appears.

  5. Do one of the following:

    • To add a rule to the group, select the rules from the Available Rules list, and choose Select.

      Tip:
      If you made changes to the group and you want to return the group to its state when you opened the dialog, select Reset.

    • To remove a rule from the group, select the rule in the Selections for Rule Group list, and select Remove.
    • To name or rename the rule group, type the name in the Rules Group box.
  6. When you finish changing the current group, select OK.
  7. Select File > Save.

Bibliographic Duplicate Detection Rules Reference

Two bibliographic records are identified as potential duplicates based on the rule groups in the Bibliographic Deduplication table. Each rule group contains one or more rules. Two bibliographic records must meet all the rules in at least one group to be potential duplicates.

Note:
When you select custom duplicate detection rules in an import profile, each rule is applied as its own group, so only one rule needs to be met to identify the records as duplicates. See Set import options for bibliographic records.
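The difference between table rule groups and custom import profile rules can be illustrated with a small Python sketch. It is a simplified model under assumptions: rules are represented only by hypothetical labels such as "245a", and the function names are invented for this example.

```python
from typing import Dict, Iterable, List, Set


def is_potential_duplicate(rules_met: Set[str], groups: Dict[str, List[str]]) -> bool:
    """True when every rule of at least one non-empty group is in rules_met."""
    return any(
        bool(rules) and all(rule in rules_met for rule in rules)
        for rules in groups.values()
    )


def as_single_rule_groups(selected_rules: Iterable[str]) -> Dict[str, List[str]]:
    """Model custom import profile rules: each selected rule forms its own group."""
    return {rule: [rule] for rule in selected_rules}
```

With a table group that requires both "245a" and "LDR/06", a pair of records that meets only "245a" is not flagged. The same two rules selected as custom rules in an import profile would flag the pair, because either rule alone is sufficient.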

Rule: The 001 of the imported record matches the 010 $a of an existing record.

Rule: The 001 tag of the imported record matches the 035 $a of an existing record.
Comment: Applies only to imported records. The 001 tag must match the numeric portion of the 035 tag (normalized). The parenthetical information in the 035 tag is ignored.

Rule: The 001/003 tags of the imported record match any 035 $a subfield in the existing record.
Comment: Applies only to imported records. The 001 tag must match the numeric portion of the 035 tag (normalized). If the incoming record contains a 003 tag, the 003 data must match the parenthetical information in the 035 tag. If the incoming record lacks a 003 tag, the parenthetical information in the 035 tag is ignored.

Rule: The 010 $a subfield of the incoming record matches the 010 $a subfield in the existing record.
Comment: For LCCNs, the prefix, year, and serial number (the first 12 characters) must match. Suffixes and revision dates are ignored.

Rule: The ISBN in the incoming record matches any ISBN in the existing record.
Comment: The alphanumeric ISBN must match. Parenthetical and other information is ignored. This rule is applied to any 020 tag and also to any 024 tag where the first indicator value is 3.

Rule: Any 022 $a subfield in the incoming record matches any 022 $a subfield in the existing record.
Comment: The 8-digit alphanumeric ISSN must match. Parenthetical and other information is ignored.

Rule: The LDR/06 value in the incoming record matches the LDR/06 in the existing record.
Comment: The value in the Record Type position of the Leader must match exactly.

Rule: The LDR/07 value in the incoming record matches the LDR/07 in the existing record.
Comment: The value in the Bibliographic Level position of the Leader must match exactly.

Rule: The 1xx $a subfield in the incoming record matches the 1xx $a subfield in the existing record.
Comment: The entire text of the $a subfield of both records must match. Punctuation is ignored. Tag and indicator values need not match.

Rule: The 245 $a subfield in the incoming record matches the 245 $a subfield in the existing record.
Comment: The entire text of the $a subfield must match. Punctuation is ignored. Indicator values need not match.

Rule: The 245 $a subfield in the incoming record matches any 246 $a subfield in the existing record.
Comment: The entire text of the $a subfield must match. Punctuation is ignored. Indicator values need not match.

Rule: Any 246 $a subfield in the incoming record matches the 245 $a subfield of the existing record.
Comment: The entire text of the $a subfield must match. Punctuation is ignored. Indicator values need not match.

Rule: Any 247 $a subfield in the incoming record matches the 245 $a subfield in the existing record.
Comment: The entire text of the $a subfield must match. Punctuation is ignored. Indicator values need not match.

Rule: The 008/07-10 values in the incoming record match the 008/07-10 in the existing record.
Comment: The date in the Beginning Date of Publication positions must match exactly in both records.

Rule: The last 260 $c subfield in the incoming record matches the last 260 $c subfield in the existing record.
Comment: The entire text of the $c subfield must match in both records. Punctuation is ignored.

Rule: The 035 $a of the incoming record matches the 035 $a of an existing record.

Rule: Any 035 $a subfield in the incoming record matches the 001 tag in the existing record.
Comment: This rule should be enabled only when a library is importing records that have been previously exported from the same Polaris database, when the intent is to have the incoming record replace the existing record.

Rule: The Bib record owner value in the incoming record matches the value in the existing record.

Rule: The UPC of the incoming record matches the UPC of the existing record.
Comment: This rule is applied to any 024 tag where the first indicator value is 1.

Rule: The 024 $a (excluding ISBN and UPC) of the incoming record matches the 024 $a (excluding ISBN and UPC) of an existing record.
Comment: This rule is applied to any 024 tag where the first indicator value is other than 1 or 3.

Rule: The 028 $a of the incoming record matches the 028 $a of an existing record.

Rule: The 037 $a of the incoming record matches the 037 $a of an existing record.

Rule: The 001 of the imported record matches the 001 of an existing record.
Comment: Applies only to imported records.
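Several of the comments above describe normalization before comparison. The following Python sketch shows one plausible reading of three of them: the 001/003 to 035 $a rule, the "punctuation is ignored" text rules, and the ISBN rule. The exact normalization Polaris applies is not documented here, so every function name and each normalization choice (stripping non-digits and leading zeros, case-insensitive comparison, collapsing whitespace) is an assumption made for illustration.

```python
import re
import unicodedata
from typing import Optional, Tuple

# Optional "(source)" prefix followed by the control number, e.g. "(OCoLC)12345678".
_PAREN_PREFIX = re.compile(r"^\s*(?:\(([^)]*)\))?\s*(.*)$", re.DOTALL)


def split_035a(value: str) -> Tuple[Optional[str], str]:
    """Split an 035 $a value into (parenthetical source or None, remainder)."""
    match = _PAREN_PREFIX.match(value)
    source = match.group(1)
    return (source.strip() if source is not None else None, match.group(2).strip())


def numeric_portion(value: str) -> str:
    """ASSUMPTION: 'normalized' means digits only, without leading zeros."""
    return re.sub(r"\D", "", value).lstrip("0")


def control_number_matches_035(tag_001: str, tag_003: Optional[str],
                               existing_035a: str) -> bool:
    """Sketch of the 001/003 rule: numbers must agree; 003 is checked only if present."""
    source, number = split_035a(existing_035a)
    incoming = numeric_portion(tag_001)
    if not incoming or incoming != numeric_portion(number):
        return False
    if tag_003 is None:
        return True  # no 003: the parenthetical information is ignored
    # ASSUMPTION: the 003 comparison is case-insensitive.
    return source is not None and source.casefold() == tag_003.strip().casefold()


def text_key(value: str) -> str:
    """Drop punctuation; ASSUMPTION: also collapse whitespace and ignore case."""
    kept = "".join(ch for ch in value if not unicodedata.category(ch).startswith("P"))
    return " ".join(kept.split()).casefold()


def isbn_key(value_020a: str) -> str:
    """Keep the leading ISBN token; parenthetical and other trailing text is ignored."""
    match = re.match(r"\s*([0-9Xx-]+)", value_020a)
    return match.group(1).replace("-", "").upper() if match else ""
```

For example, under these assumptions text_key treats "Dune /" and "dune" as the same title, and isbn_key reduces "0-441-01359-7 (pbk.)" to "0441013597". The sketch does not convert between 10-digit and 13-digit ISBNs.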

Authority Duplicate Detection Rules Reference

Two authority records are identified as potential duplicates based on the rule groups in the Authority Deduplication table.

Rule: The 001 tag of the imported record matches the 010 $a subfield in the existing record.
Comment: Applies only to imported records. For LCCNs, the prefix, year, and serial number (the first 12 characters) must match; suffixes and revision dates are ignored.

Rule: The 001/003 tags of the imported record match any 035 $a subfield in the existing record.
Comment: Applies only to imported records. The 001 tag must match the numeric portion of the 035 tag (normalized). If the incoming record contains a 003 tag, the 003 data must match the parenthetical information in the 035 tag; if it lacks a 003 tag, the parenthetical information in the 035 tag is ignored.

Rule: The 035 $a subfield of the incoming record matches any 035 $a subfield in the existing record.

Rule: The 010 $a subfield of the incoming record matches the 010 $a subfield in the existing record.
Comment: For LCCNs, the prefix, year, and serial number (the first 12 characters) must match. Suffixes and revision dates are ignored.

Rule: Any 035 $a subfield of the incoming record matches the 001 tag in the existing record.
Comment: This rule should be enabled only when a library is importing records that have been previously exported from the same Polaris database, when the intent is to have the incoming record replace the existing record.

Rule: The 008/11 value of the incoming record matches the 008/11 of the existing record.
Comment: The subject heading system code in the 008 tag of both records must match.

Rule: The 1xx tag of the incoming record matches the 1xx of the existing record.
Comment: The entire text of the 1xx tag (the concatenation of all data fields) must match.

Rule: The 010 $z of the incoming record matches the 010 $a of the existing record.
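The LCCN comparison described in the comments above (only the first 12 characters count) can be sketched as follows. This is an illustration under assumptions rather than the Polaris implementation: the function names are hypothetical, and padding short values with trailing blanks is an assumption made so that a value whose final blank was trimmed still compares equal.

```python
from typing import Iterable


def lccn_key(value_010a: str) -> str:
    """First 12 characters (prefix, year, serial number); suffixes are dropped.

    Leading blanks are kept because they are part of the LCCN prefix positions.
    ASSUMPTION: values shorter than 12 characters are padded with blanks.
    """
    return value_010a[:12].ljust(12)


def lccn_matches(incoming: str, existing: str) -> bool:
    """True when both values are non-blank and share the same 12-character key."""
    key = lccn_key(incoming)
    return bool(key.strip()) and key == lccn_key(existing)


def canceled_lccn_matches(incoming_010z: Iterable[str], existing_010a: str) -> bool:
    """Sketch of the 010 $z rule: any canceled/invalid LCCN matches the existing 010 $a."""
    return any(lccn_matches(z, existing_010a) for z in incoming_010z)
```

Under these assumptions, "   85153773 " and "   85153773 //r92" are treated as the same LCCN because the revision suffix falls after the twelfth character.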

 

Related Information

You can specify duplicate detection rules to use when importing records. See Setting Up Import Profiles.