Reputation: 130
Is there a way to copy or move data from MarkLogic Server to Amazon S3? I don't want to move all the data, only certain documents belonging to a particular collection or matching some other logic. I can use xdmp:save(), and that works for a few thousand documents, but I have a few million records, so that method won't scale. Is there a better, more robust way to copy the data over? Can I use MLCP for this, or spawn a module on the task server to get the work done? I am running MarkLogic 8 hosted on AWS.
Any suggestion would help immensely.
Regards Amit
Upvotes: 0
Views: 703
Reputation: 3732
You can use the backup feature and set the target directory to s3://bucket/path.
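A minimal XQuery sketch of kicking off such a backup, assuming a database named "Documents" and a hypothetical bucket (S3 credentials must already be configured for the group in the Admin UI):

```xquery
xquery version "1.0-ml";
(: Back up every forest in the database to an S3 path.
   "Documents" and "your-bucket" are placeholders. :)
xdmp:database-backup(
  xdmp:database-forests(xdmp:database("Documents")),
  "s3://your-bucket/backups/"
)
```

The call returns a backup job ID, which you can poll with xdmp:database-backup-status(). Note this copies whole forests, not a filtered subset, so it fits the "move everything" case rather than a per-collection export.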
Upvotes: 1
Reputation: 130
I used mlcp export to do this, and it works quite well with the collection filter; it does the trick for me. I have not tried CoRB2 yet, but will give it a try as well when time permits.
mlcp export -host {host} -port {port} -username {username} -password {password} -output_file_path {S3 path} -collection_filter {collection name to be moved}
Upvotes: 0
Reputation: 7770
I would use CoRB2 to drive the xdmp:save() calls, since s3:// is a built-in file system in MarkLogic. Any solution with MLCP would incur more data transfer, and I am not sure of the value unless you also want an archive (which is a valid point if you want to preserve properties, permissions, collections, etc.).
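A minimal sketch of a CoRB2 process module along those lines — CoRB2 supplies each selected URI through the external variable $URI; the bucket path is a placeholder:

```xquery
xquery version "1.0-ml";
(: CoRB2 PROCESS-MODULE: save one document to S3.
   $URI is bound by CoRB2 for each result of the URIS-MODULE. :)
declare variable $URI as xs:string external;
xdmp:save("s3://your-bucket/export" || $URI, fn:doc($URI))
```

The companion URIS-MODULE would select the documents to move, e.g. with cts:uris() over a collection query, and CoRB2 then runs the process module for each URI in parallel batches.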
Second to that: I have never done it, but I understand that you can use S3 as the location of a forest. In that case, you could rebalance certain documents to a forest located on S3.
Upvotes: 1
Reputation: 437
Retrieve the documents from MarkLogic using the REST API and pipe the output to the aws command to upload to an AWS S3 bucket:
curl --anyauth --user user:password -X GET -H "Content-type: application/xml" "http://localhost:8052/LATEST/documents?uri=/docs/test.xml" | aws s3 cp - s3://yourbucket/test.xml
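To extend that one-liner to a whole collection, here is a hedged sketch (host, port, credentials, collection, and bucket are all placeholders; assumes jq and a configured AWS CLI) that lists matching URIs via the REST search API and streams each document into S3:

```shell
#!/bin/sh
# Placeholders: adjust host, credentials, collection, and bucket.
HOST="http://localhost:8052"
AUTH="user:password"
COLLECTION="mycollection"
BUCKET="yourbucket"

# List URIs in the collection via the REST search API (JSON output),
# then stream each document straight into S3 without touching local disk.
curl -s --anyauth --user "$AUTH" \
  "$HOST/LATEST/search?collection=$COLLECTION&format=json&pageLength=1000" |
jq -r '.results[].uri' |
while read -r uri; do
  curl -s --anyauth --user "$AUTH" "$HOST/LATEST/documents?uri=$uri" |
    aws s3 cp - "s3://$BUCKET$uri"
done
```

This only fetches the first page of results; for millions of documents you would loop with the &start= parameter (or prefer MLCP/CoRB2, which handle batching for you).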
Upvotes: 0