Reputation: 193
Is it possible to harvest single items from other repositories with DSpace, perhaps from the command line? As far as I can see, XMLUI only supports harvesting complete communities or complete collections, which usually gives me far more items than I need.
Upvotes: 2
Views: 1174
Reputation: 193
As Terry wrote, you can harvest a single item/document from a repository with a GetRecord request. The DSpace menu item 'Batch Import (ZIP)' can then import one or more items, provided the contents of the ZIP follow a specific format.
The following PHP code extracts the metadata from the XML returned by GetRecord. In the next step this metadata is packed into an XML format that DSpace understands. This XML is added as a file (dublin_core.xml) to the newly created ZIP, together with a small file (handle) containing the handle. Finally the ZIP is written to the server.
By the way, importing the ZIP file can also be done from the command line, as Terry mentioned in his first answer.
<?php
// handle and harvest-string
$handle = "1874/1506";
$harvest = "http://dspace.library.uu.nl/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:dspace.library.uu.nl:" . $handle;
// get XML from source repository
$sxe = simplexml_load_file($harvest, "SimpleXMLElement");
if ($sxe === false) {
    exit("Could not load or parse the GetRecord response\n");
}
// add namespace schema-urls
$sxe->registerXPathNamespace('oai_dc', 'http://www.openarchives.org/OAI/2.0/oai_dc/');
$sxe->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');
// get Dublin Core (dc) elements from the XML
$elements = [];
foreach ($sxe->xpath("//oai_dc:dc") as $entry) {
    // add dc-elements (names and string values) to array
    foreach ($entry->children('dc', true) as $elementName => $elementValue) {
        $elements[$elementName][] = (string) $elementValue;
    }
}
// create zip-object and -file
$zip = new ZipArchive();
if ($zip->open("doc/importZip.zip", ZipArchive::CREATE) !== true) {
    exit("Could not create doc/importZip.zip\n");
}
// create a directory in the zip-object
$zip->addEmptyDir("item");
// create Dublin Core XML object
$oXML = new DOMDocument();
$oXML->encoding = "UTF-8";
$oXML->formatOutput = true;
$oXML->xmlStandalone = false;
$oRoot = $oXML->createElement('dublin_core');
$oRoot->setAttribute('schema', 'dc');
$oXML->appendChild($oRoot);
// add elements and their values to XML object
foreach ($elements as $elementName => $elementValues) {
    foreach ($elementValues as $elementValue) {
        $oDcValue = $oXML->createElement('dcvalue');
        $oDcValue->setAttribute('element', $elementName);
        // cast explicitly so a SimpleXMLElement value becomes plain text
        $oText = $oXML->createTextNode((string) $elementValue);
        $oDcValue->appendChild($oText);
        $oRoot->appendChild($oDcValue);
    }
}
// save created XML to string
$dublinCoreXml = $oXML->saveXML();
// add XML-string as file to zip-object
$zip->addFromString("item/1/dublin_core.xml", $dublinCoreXml);
// add handle as file to zip-object
$zip->addFromString("item/1/handle", $handle);
$zip->close();
?>
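One thing the script above does not cover: an OAI-PMH server can answer a GetRecord request with an error element (for example code idDoesNotExist) instead of a record, and the response is still well-formed XML. A minimal sketch of a check you could run on the raw response before building the ZIP. The function name oaiErrorCode and the 'invalidXml' marker are my own for illustration and not part of DSpace or OAI-PMH.

```php
<?php
// Hypothetical helper: return the OAI-PMH error code found in a response,
// or null when the response contains no <error> element.
function oaiErrorCode(string $xml): ?string
{
    $sxe = simplexml_load_string($xml);
    if ($sxe === false) {
        // Not an OAI-PMH code, just a marker used by this sketch.
        return 'invalidXml';
    }
    // OAI-PMH responses use this default namespace.
    $sxe->registerXPathNamespace('oai', 'http://www.openarchives.org/OAI/2.0/');
    $errors = $sxe->xpath('/oai:OAI-PMH/oai:error');
    return $errors ? (string) $errors[0]['code'] : null;
}

// Sample responses, trimmed to the parts the check looks at.
$bad = '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
     . '<error code="idDoesNotExist">No matching identifier</error>'
     . '</OAI-PMH>';
$good = '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
      . '<GetRecord><record/></GetRecord>'
      . '</OAI-PMH>';

var_dump(oaiErrorCode($bad));
var_dump(oaiErrorCode($good));
```

To use it with the script, you would fetch the response as a string first (for example with file_get_contents), run the check, and only then parse the metadata.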
Upvotes: 2
Reputation: 3956
The OAI-PMH standard provides a GetRecord verb for requesting a single record.
https://knb.ecoinformatics.org/knb/docs/oaipmh.html
If you navigate the set containing your item of interest, you should be able to find the item's identifier. You can use that identifier as a parameter to GetRecord.
This would allow you to extract the item metadata. In order to get the item into DSpace, I imagine that you would need to package the item for ingest into the repository.
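As a rough sketch of those two steps in PHP: list the identifiers in a set with ListIdentifiers, then build a GetRecord request for the one you want. The helper names oaiRequestUrl and oaiIdentifiers are made up for this example, and the set name col_1874_1 is only a placeholder. The base URL and identifier are the ones used elsewhere in this thread.

```php
<?php
// Hypothetical helper: build an OAI-PMH request URL for a verb plus arguments.
function oaiRequestUrl(string $baseUrl, string $verb, array $params = []): string
{
    // http_build_query URL-encodes the values (':' and '/' included).
    return $baseUrl . '?' . http_build_query(['verb' => $verb] + $params);
}

// Hypothetical helper: extract record identifiers from a ListIdentifiers response.
function oaiIdentifiers(string $xml): array
{
    $sxe = simplexml_load_string($xml);
    if ($sxe === false) {
        return [];
    }
    $sxe->registerXPathNamespace('oai', 'http://www.openarchives.org/OAI/2.0/');
    $ids = [];
    foreach ($sxe->xpath('//oai:header/oai:identifier') as $node) {
        $ids[] = (string) $node;
    }
    return $ids;
}

$base = 'http://dspace.library.uu.nl/oai/request';

// Step 1: URL that lists the identifiers in one set (set name is a placeholder).
$listUrl = oaiRequestUrl($base, 'ListIdentifiers', [
    'metadataPrefix' => 'oai_dc',
    'set'            => 'col_1874_1',
]);

// Step 2: URL that fetches a single record by its identifier.
$getUrl = oaiRequestUrl($base, 'GetRecord', [
    'metadataPrefix' => 'oai_dc',
    'identifier'     => 'oai:dspace.library.uu.nl:1874/1506',
]);

// Trimmed sample of a ListIdentifiers response, to show the parsing.
$sample = '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
        . '<header><identifier>oai:dspace.library.uu.nl:1874/1506</identifier></header>'
        . '</ListIdentifiers></OAI-PMH>';
$ids = oaiIdentifiers($sample);

echo $listUrl, "\n", $getUrl, "\n";
print_r($ids);
```

Large sets are returned in pages with a resumptionToken, which this sketch does not handle.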
Upvotes: 2
Reputation: 3956
If you are looking to pull a single item via the command line, consider the packager command.
https://wiki.duraspace.org/display/DSDOC5x/Importing+and+Exporting+Content+via+Packages
Upvotes: 1