Puneet Pant

Reputation: 1048

How to improve xdmp:document-filter() performance in MarkLogic?

I am using xdmp:document-filter(doc()) to extract metadata from documents (doc, docx, pdf, etc.). We use it because it works for every document format and produces XHTML output for each one. The major drawback is that it slows the query down: with one or two documents in the database the query runs fine, but with more documents (e.g. 10 or 15) it becomes slow. We want to extract and display the metadata of all the documents in the database.

We are using this query:

for $d in fn:doc()
return xdmp:document-filter(doc(fn:base-uri($d)))

Is there any way to make this query run faster, or is there an alternative to xdmp:document-filter()?

Upvotes: 1

Views: 289

Answers (1)

grtjn

Reputation: 20414

xdmp:document-filter() is typically used at ETL (load) time rather than at query time. If you use Information Studio to load your content, you can add a 'Filter documents' transform. You can choose between storing the extracted metadata as separate XHTML documents or as document properties. That way the metadata doesn't need to be recalculated on the fly for each request.
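
If you are not loading through Information Studio, the same idea can be sketched in plain XQuery: run the filter once per document and keep the result as a document property. This is only a rough sketch under a few assumptions: the property element name filtered-meta is made up for illustration, the metadata of interest is assumed to sit in the XHTML meta elements of the filter output, and all documents are processed in a single update, which is only reasonable for a small database.

```xquery
xquery version "1.0-ml";

(: Namespace of the XHTML that xdmp:document-filter() returns. :)
declare namespace xh = "http://www.w3.org/1999/xhtml";

(: One-time pass: filter each binary document once and store the
   extracted meta elements as a document property.
   "filtered-meta" is a hypothetical element name chosen for this sketch. :)
for $d in fn:doc()
let $uri := xdmp:node-uri($d)
where fn:exists($d/binary())
return
  xdmp:document-set-property(
    $uri,
    <filtered-meta>{ xdmp:document-filter($d)//xh:meta }</filtered-meta>
  )
```

Later requests could then read the stored values with xdmp:document-properties($uri)//filtered-meta instead of calling xdmp:document-filter() again.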

HTH!

Upvotes: 2
