Harvesting XML Metadata
If your organization has acquired XML Harvester and a load table, you can:
- harvest XML metadata from external repositories
- convert the metadata to bibliographic MARC records
- load the records into your Innovative system
Additional load tables can be purchased. Innovative recommends purchasing one load table for every repository you harvest. See Editing the XML Harvester Configuration File.
Metadata are structured elements that describe an information resource or, more generally, any definable entity. Bibliographic record data, for example, can be considered metadata. Harvesting is the process of collecting metadata from a server. Servers designed to provide metadata for harvesting are called repositories.
The standard used by XML Harvester is the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
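To illustrate what a request under this protocol looks like, the following Python sketch builds a ListRecords request URL. The verb, metadataPrefix, and set parameters are defined by the protocol. The base URL and the helper name build_list_records_url are hypothetical and are not part of XML Harvester, which issues its requests based on the configuration file.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix, set_spec=None):
    """Return a ListRecords request URL (hypothetical helper, for illustration)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # "set" is optional and limits the harvest to one set of records.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# The repository address below is a placeholder, not a real repository.
url = build_list_records_url("https://repository.example.org/oai", "oai_dc")
print(url)
```

A repository answers such a request with an XML document that contains the requested records in the requested metadata format.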
The XML Harvester process works as follows:
- Edit the configuration file to specify the:
  - format of the request
  - mapping of the metadata to MARC
  - URL of the repository
  - additional harvesting parameters
- Harvest the repository.
- XML Harvester converts the harvested records into a file that can be loaded into the system by using a standard load table.
- Load the records into the Innovative system.
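The conversion step in the process above can be pictured with a short Python sketch that maps an unqualified Dublin Core record to MARC tag and subfield pairs. This is not XML Harvester's implementation. The DC_TO_MARC table is an illustrative assumption, as are the sample record and the function name dc_to_marc_fields. The actual mapping is whatever you specify in the configuration file.

```python
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Illustrative mapping only (assumption): Dublin Core element -> (MARC tag, subfield).
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "identifier": ("856", "u"),
}

# Made-up sample record for demonstration.
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample Title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:subject>Metadata</dc:subject>
  <dc:identifier>https://repository.example.org/item/1</dc:identifier>
</oai_dc:dc>
""".strip()


def dc_to_marc_fields(dc_xml):
    """Return (tag, subfield, value) tuples for each mapped Dublin Core element."""
    root = ET.fromstring(dc_xml)
    fields = []
    for element in root:
        if not element.tag.startswith(DC_NS):
            continue  # skip anything outside the Dublin Core namespace
        name = element.tag[len(DC_NS):]
        if name in DC_TO_MARC and element.text:
            tag, subfield = DC_TO_MARC[name]
            fields.append((tag, subfield, element.text.strip()))
    return fields


for tag, subfield, value in dc_to_marc_fields(SAMPLE_RECORD):
    print(f"{tag} ${subfield} {value}")
```

Elements without an entry in the mapping table are skipped, which mirrors the idea that only metadata you have mapped to MARC ends up in the output file.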
Metadata and Metadata Request Formats
XML Harvester supports the following formats for metadata and metadata requests:
- MARCXML
- Unqualified Dublin Core
- other XML formats that use MARC encoding analogs
Specify the metadata request format by editing the OAIFORMAT trigger in the configuration file.
Specify the metadata format by editing the XML_TYPE trigger in the configuration file.
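Before choosing a request format, it can help to see which formats a repository offers. OAI-PMH repositories report them in response to a ListMetadataFormats request. The Python sketch below extracts the format prefixes from such a response. The sample response is made up for illustration, and the function name list_metadata_prefixes is hypothetical. The oai_dc prefix denotes unqualified Dublin Core in the protocol, while marc21 is shown only as a prefix that repositories commonly use for MARCXML. The sketch does not show the syntax of the OAIFORMAT or XML_TYPE triggers.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Made-up ListMetadataFormats response; a real one comes from the repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="ListMetadataFormats">https://repository.example.org/oai</request>
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>marc21</metadataPrefix>
      <schema>http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd</schema>
      <metadataNamespace>http://www.loc.gov/MARC21/slim</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def list_metadata_prefixes(response_xml):
    """Return the metadataPrefix values found in a ListMetadataFormats response."""
    root = ET.fromstring(response_xml)
    return [element.text for element in root.iter(OAI_NS + "metadataPrefix")]


prefixes = list_metadata_prefixes(SAMPLE_RESPONSE)
print(prefixes)
```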
Specifying a Repository of XML Records
XML Harvester is OAI-PMH compliant. Innovative recommends using repositories that conform to the OAI-PMH standard.
Specify the repository by editing the URL trigger in the configuration file.
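One way to check that an address points to an OAI-PMH repository is to look at its response to an Identify request, which reports the repository name, base URL, and protocol version. The Python sketch below reads those values from a made-up sample response. The function name describe_repository and the repository details are hypothetical, and this check is not a feature of XML Harvester.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Made-up Identify response; a real one comes from the repository.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="Identify">https://repository.example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://repository.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""


def describe_repository(identify_xml):
    """Return selected fields from an Identify response as a dictionary."""
    root = ET.fromstring(identify_xml)
    return {
        "name": root.findtext("oai:Identify/oai:repositoryName", namespaces=NS),
        "base_url": root.findtext("oai:Identify/oai:baseURL", namespaces=NS),
        "protocol_version": root.findtext(
            "oai:Identify/oai:protocolVersion", namespaces=NS
        ),
    }


info = describe_repository(SAMPLE_IDENTIFY)
# Version 2.0 is the current version of the protocol.
is_oai_pmh_2 = info["protocol_version"] == "2.0"
print(info["name"], info["base_url"], is_oai_pmh_2)
```

The base URL reported by the repository is the kind of address a harvester sends its requests to.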