Reputation: 321
I am working through this MarkLogic Spark connector tutorial: https://developer.marklogic.com/blog/marklogic-spark-example I was able to run it, and I found that it reads from the Documents database by default.
The code from the tutorial looks like this:

    JavaPairRDD<DocumentURI, MarkLogicNode> mlRDD = context.newAPIHadoopRDD(
        hdConf,                     // Configuration
        DocumentInputFormat.class,  // InputFormat
        DocumentURI.class,          // Key class
        MarkLogicNode.class         // Value class
    );
How can I pass a specific document URI and database name so that only one particular document is read from a given database? For example, my Documents database contains XML files that were created by importing a CSV file, as described in my other question: "Marklogic : Multiple XML files created on document on importing a csv. How to get root Document URI path?" Can someone share sample code showing how to pass the document URI and database name as parameters?
Upvotes: 2
Views: 219
Reputation: 11
If you refer to the documentation for the MarkLogic Connector for Hadoop, specifically the Input Configuration Properties section, you will find the property mapreduce.marklogic.input.documentselector, which takes an XQuery path expression that lets you select specific documents from the database.
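As a rough illustration of the above, here is a minimal sketch that builds the property values for selecting a single document by URI. It assumes that an fn:doc("...") call is accepted as the path expression for the document selector, and that the input database can be chosen with a property named mapreduce.marklogic.input.databasename; both should be checked against the connector documentation for your version. The class and method names (MarkLogicInputProps, selectorForUri, inputProperties) and the sample URI are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MarkLogicInputProps {
    // Property named in the answer above (Input Configuration Properties).
    static final String DOCUMENT_SELECTOR = "mapreduce.marklogic.input.documentselector";
    // ASSUMPTION: name of the input-side database property; verify for your connector version.
    static final String INPUT_DATABASE = "mapreduce.marklogic.input.databasename";

    /** Builds an XQuery expression that selects exactly one document by its URI. */
    static String selectorForUri(String uri) {
        // In an XQuery string literal, '&' must be written as an entity
        // and a double quote is escaped by doubling it.
        String escaped = uri.replace("&", "&amp;").replace("\"", "\"\"");
        return "fn:doc(\"" + escaped + "\")";
    }

    /** Collects the input properties; database may be null to keep the default. */
    static Map<String, String> inputProperties(String uri, String database) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(DOCUMENT_SELECTOR, selectorForUri(uri));
        if (database != null) {
            props.put(INPUT_DATABASE, database);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical URI and database name, for illustration only.
        Map<String, String> props = inputProperties("/data/row-1.xml", "Documents");
        // In the Spark job, each entry would be copied into the Hadoop
        // Configuration before calling newAPIHadoopRDD, e.g.
        //   hdConf.set(entry.getKey(), entry.getValue());
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Keeping the values in a plain map makes the sketch compile without Hadoop on the classpath; in the actual job the same keys and values would go straight into hdConf.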
Upvotes: 1
Reputation: 7770
The sample uses the Hadoop Connector.
With MarkLogic 8, I believe you can set the database through the com.marklogic.output.databasename property in the job configuration.
http://docs.marklogic.com/guide/mapreduce/quickstart#id_38329
Upvotes: 0