Reputation: 21
I have a requirement to insert 10,000 documents into MarkLogic in less than 10 seconds.
I tested on a single-node MarkLogic server in the following way:
- xdmp:spawn to pass the doc insertion task to the task server;
- xdmp:document-insert without specifying a forest explicitly;
- CPF.
The performance is very bad: it took 2 minutes to finish creating the 10,000 docs. I'm sure the performance would be better if I tested it in a cluster environment, but I'm not sure whether it can finish in less than 10 seconds.
Please advise on how to improve the performance.
Upvotes: 2
Views: 343
Reputation: 61
Assuming a 2-socket server, 128GB-256GB of RAM, and fast IO (400-800MB/sec sustained)
Turn on perf history, look in metrics, and you will see where the bottleneck is.
SSD is not required - just IO throughput...which multiple spinning disks provide without issue.
Upvotes: 1
Reputation: 20414
If you need a fast load, I wouldn't use xdmp:spawn for each individual document, nor would I use CPF. But 2 minutes for 10k docs doesn't necessarily sound slow. On the other hand, I have reached up to 3k docs/sec, but without any range indexes or transforms, and with a very fast disk (e.g. SSD).
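One way to read "don't spawn per document" is to spawn one task per batch, so that many inserts share a single transaction. The following is only a rough sketch under several assumptions: MarkLogic 9 or later (for the `update` option), a batch size of 500, a total that is a multiple of the batch size, and a made-up URI scheme and document body. None of these values come from the question, and the batch size would need tuning against real metrics.

```xquery
xquery version "1.0-ml";

(: Hypothetical values for illustration; tune the batch size for your hardware. :)
declare variable $total as xs:integer := 10000;
declare variable $batch-size as xs:integer := 500;

(: Spawn one task per batch instead of one task per document.
   Assumes $total is an exact multiple of $batch-size. :)
for $batch in (0 to ($total idiv $batch-size) - 1)
let $first := $batch * $batch-size + 1
let $last := $first + $batch-size - 1
return
  xdmp:spawn-function(
    function() {
      (: All inserts in one batch run in the same transaction and commit once. :)
      for $i in ($first to $last)
      return
        (: The URI pattern and the document body are placeholders. :)
        xdmp:document-insert(
          fn:concat("/example/doc-", $i, ".xml"),
          <doc><id>{$i}</id></doc>
        )
    },
    <options xmlns="xdmp:eval">
      <update>true</update>
    </options>
  )
```

With these numbers the task server receives 20 tasks rather than 10,000, which cuts per-task and per-commit overhead. Whether that is enough to reach 10 seconds still depends on the indexes, forests, and disk involved.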
HTH!
Upvotes: 1
Reputation: 7840
I would start by gathering more information. What version of MarkLogic is this? What OS is it running on? What's the CPU? RAM? What's the storage subsystem? How many forests are attached to the database?
Then gather OS-level metrics, to see if one of the subsystems is an obvious bottleneck. For now I won't speculate beyond that.
Upvotes: 1