Reputation: 178
I have millions of documents in different collections in my database. I need to export them to a csv onto my local storage when I specify the collection name.
I tried mlcp export but didn't work. We cannot use corb for this because of some issues.
I want the csv to be in such a format that if I try a mlcp import then I should be able to restore all docs just the way they were.
Upvotes: 0
Views: 580
Reputation: 2236
ml-gradle has support for exporting documents and referencing a transform, which can convert each document to CSV - https://github.com/marklogic-community/ml-gradle/wiki/Exporting-data#exporting-data-to-csv .
Unless all of your documents are flat, you likely need some custom code to determine how to map a hierarchical document into a flat row. So a REST transform is a reasonable solution there.
You can also use a TDE template to project your documents into rows, and the /v1/rows endpoint can return results as CSV. That of course requires creating and loading a TDE template, and then waiting for the matching documents to be re-indexed.
Upvotes: 1
Reputation: 20414
My first thought would be to use MLCP archive feature, and to not export to a CSV at all.
If you really want CSV, Corb2 would be my first thought. It provides CSV export functionality out of the box. It might be worth digging into why that didn't work for you.
DMSDK might work too, but involves writing code that handles the writing of CSV, which sounds cumbersome to me.
Last option that comes to mind would be Apache NiFi for which there are various MarkLogic Processors. It allows orchestration of data flow very generically. It could be rather overkill for your purpose though.
HTH!
Upvotes: 3