Reputation: 23
I have a CSV file in my local folder. I want to load that file into MarkLogic DB using CoRB into a specified Collection. Can you please help?
Upvotes: 1
Views: 219
Reputation: 66714
You would probably want to configure your job to use the URIS-FILE option pointing to your CSV. CoRB will read the file and send each line of the CSV to the configured PROCESS-MODULE as the $URI value to be processed.
The properties file would look something like this:
# how to connect to the XCC server
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
# path to the CSV file to be processed
URIS-FILE=input-uris.csv
# the module that will process and save each CSV row
PROCESS-MODULE=save-row.xqy|ADHOC
# how many threads to use to execute process modules
THREAD-COUNT=10
In your process module, you would need to declare an external variable called $URI, then tokenize the CSV row on its delimiter and process the columns of data. Invoke xdmp:document-insert() to insert the document and specify the collection(s) that you want the document to be in:
xquery version "1.0-ml";
declare variable $URI as xs:string external;
let $columns := fn:tokenize($URI, ",\s?")
(: assuming that the first column has a unique value to be used for the URI :)
let $uri := $columns[1]
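(: do whatever processing you need to generate the document from the CSV columns;
   xdmp:document-insert() expects a node, so the value is wrapped in a placeholder element :)
let $content := <row>{$columns[2]}</row>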
return
xdmp:document-insert($uri,
$content,
map:map() => map:with("collections", "mySpecialCollection")
)
Note: the signature for xdmp:document-insert() changed in MarkLogic 9. You now specify xdmp:document-insert() options, such as permissions and collections, in either a map or an options element in the third parameter. In prior MarkLogic versions, permissions and collections were the third and fourth parameters. Adjust the call to xdmp:document-insert() according to the version of MarkLogic that you are using (there is a dropdown on the top left side of the documentation to select your version).
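For an older MarkLogic version, the same process module might look roughly like the sketch below, which uses the positional form described in the note above. It assumes that the default permissions of the connecting user are acceptable (xdmp:default-permissions()). The <row> wrapper element is only a placeholder for whatever document structure you actually need.

```xquery
xquery version "1.0-ml";
declare variable $URI as xs:string external;

let $columns := fn:tokenize($URI, ",\s?")
(: assuming that the first column has a unique value to be used for the URI :)
let $uri := $columns[1]
(: placeholder document: wrap the second column in a hypothetical <row> element :)
let $content := <row>{$columns[2]}</row>
return
  (: older positional signature: permissions third, collections fourth :)
  xdmp:document-insert($uri,
    $content,
    xdmp:default-permissions(),
    "mySpecialCollection"
  )
```

Note that fn:tokenize() splits on every comma, so this simple approach will not handle quoted CSV fields that contain commas.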
Upvotes: 1