Reputation: 1133
I am exporting huge PDF files, some of them over 1 GB, and I have already reduced thread_count to 4. What else do I need to do to avoid the timeout? Thanks.
ERROR contentpump.DatabaseContentReader: RuntimeException reading /pdf/docIns/docIns-222581.pdf :com.marklogic.xcc.exceptions.StreamingResultException: RequestException instantiating ResultItem 301805: Time limit exceeded
22/01/24 17:48:09 INFO contentpump.DatabaseContentReader: host name: xxx.us-central.compute.internal
22/01/24 17:48:09 INFO contentpump.DatabaseContentReader: Retrying connect
22/01/24 17:53:16 INFO contentpump.LocalJobRunner: completed 3%
Upvotes: 0
Views: 165
Reputation: 3609
Thread count won't make a difference, as each document is read by a single thread. The limiting factor is either the network transfer time or the time needed to read the file off MarkLogic's disk and into available memory (or some combination of the two).
You could try fetching the document over REST (the /v1/documents endpoint) and see if that is quicker. You could also use xdmp:zip-create to compress it within MarkLogic and see whether downloading the compressed file is fast enough.
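As a rough way to time the REST route, here is a minimal Python sketch that streams one document from /v1/documents straight to disk, so the PDF is never held fully in memory. The host, port 8000, digest authentication, the credentials and the one-hour socket timeout are all assumptions about your setup, and the helper names are made up for illustration.

```python
import shutil
import urllib.parse
import urllib.request


def document_url(host: str, port: int, doc_uri: str) -> str:
    """Build the /v1/documents URL for a single document URI."""
    query = urllib.parse.urlencode({"uri": doc_uri})
    return f"http://{host}:{port}/v1/documents?{query}"


def download_document(host, port, doc_uri, user, password, dest_path,
                      timeout_s=3600):
    """Stream one document to dest_path in 1 MiB chunks.

    Assumes the REST app server uses digest authentication; swap in
    HTTPBasicAuthHandler if yours is configured for basic auth.
    """
    url = document_url(host, port, doc_uri)
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_mgr)
    )
    # timeout_s is the client-side socket timeout, not a MarkLogic setting.
    with opener.open(url, timeout=timeout_s) as response, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, length=1024 * 1024)
    return dest_path


# Hypothetical usage (placeholder host, URI and credentials):
# download_document("localhost", 8000, "/pdf/example.pdf",
#                   "rest-reader-user", "secret", "example.pdf")
```

Timing this call for one of the 1 GB files should tell you whether the bottleneck is the export tool or the raw transfer itself.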
Alternatively, consider using MarkLogic to store a URL alongside the searchable (meta)data to grab the document from something else (like a CDN or S3 for example).
Upvotes: 1
Reputation: 7770
You could consider increasing the request time limit of the HTTP server. This page explains the settings: https://docs.marklogic.com/admin-help/http-server
If you are managing your cluster via the REST API, you can look here: https://docs.marklogic.com/REST/POST/manage/v2/servers
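For the REST route, a hedged sketch of what the update could look like: it builds a PUT request against /manage/v2/servers/{name}/properties that sets default-time-limit and max-time-limit. The server name "my-app-server", the group "Default", the Manage port 8002 and the limit values are assumptions for illustration; you would also need to send the request with admin credentials (digest auth by default), which is left out here.

```python
import json
import urllib.parse
import urllib.request


def build_time_limit_request(server_name, group_id, default_limit_s,
                             max_limit_s, host="localhost", port=8002):
    """Build (but do not send) a PUT that raises an app server's time limits."""
    if default_limit_s > max_limit_s:
        raise ValueError("default-time-limit cannot exceed max-time-limit")
    query = urllib.parse.urlencode({"group-id": group_id})
    url = (f"http://{host}:{port}/manage/v2/servers/"
           f"{urllib.parse.quote(server_name)}/properties?{query}")
    body = json.dumps({
        "default-time-limit": default_limit_s,  # seconds
        "max-time-limit": max_limit_s,          # seconds
    }).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical values: one hour default, two hours maximum.
request = build_time_limit_request("my-app-server", "Default", 3600, 7200)
print(request.get_method(), request.full_url)
```

Make sure you change the app server your export actually connects to; raising the limits on a different server will have no effect on the "Time limit exceeded" error.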
There are also other options for large binary content. You could store the PDF as a registered binary in a location that clients can access externally, such as S3, then return just the reference so that clients fetch the file directly, assuming they have credentials to read from that storage. On previous projects, I have served large binaries from S3, and at other times from a different type of server acting as a proxy, using a one-time token.
Upvotes: 1