Gatt

Reputation: 21

MarkLogic - Expanded tree cache error while inserting documents

In our application, we receive batches of files that need to be preprocessed and loaded into MarkLogic.

To do this, we need to:

  1. Load the files into a temporary MarkLogic working directory
  2. Preprocess (node operations on the XML file)
  3. Move documents to their intended destination MarkLogic directory using xdmp:document-insert

While doing (3), we get an expanded tree cache error for a batch of 1500 docs. Up to 400 docs it works fine; with any higher number, the error appears.

The algorithm steps for our code:

    Get total number of docs in working directory = totalRec
    for Ctr = 1 to totalRec
        Get specific node values for current doc
        Frame the target URI where doc is to be loaded
        Insert document using xdmp:document-insert

We even tried using transaction begin/commit calls within the for loop, but nothing seems to work. Any thoughts on how to fix this issue?

Upvotes: 2

Views: 368

Answers (1)

mblakele

Reputation: 7842

An expanded tree cache error simply means that you're trying to work with too many fragments at once. The preferred solution is to reduce the size of the working set, so that each transaction touches fewer documents. Alternatively, if you have enough spare memory, you can increase the size of the expanded tree cache in the group settings, but that is usually the less desirable option.
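As a rough illustration of reducing the working set, here is a minimal sketch that moves the documents in small batches, each in its own transaction. It assumes a MarkLogic release that provides xdmp:invoke-function and inline functions (older releases would need xdmp:eval or xdmp:invoke instead), and it assumes the URI lexicon is enabled so that cts:uris can list the working directory without loading any documents. The directory names, the batch size, and the way the target URI is built are hypothetical placeholders, not taken from the question.

```xquery
xquery version "1.0-ml";

(: Hypothetical values: adjust to your own directories and batch size. :)
declare variable $WORK-DIR := "/working/";
declare variable $TARGET-DIR := "/final/";
declare variable $BATCH-SIZE := 100;

(: List URIs only (requires the URI lexicon); no documents are loaded here. :)
let $uris := cts:uris((), (), cts:directory-query($WORK-DIR, "infinity"))
let $total := fn:count($uris)
let $batches := xs:integer(fn:ceiling($total div $BATCH-SIZE))
for $i in 0 to $batches - 1
let $batch := fn:subsequence($uris, $i * $BATCH-SIZE + 1, $BATCH-SIZE)
return
  (: Each batch runs and commits in a separate transaction, so only
     $BATCH-SIZE documents are read and written per transaction. :)
  xdmp:invoke-function(
    function() {
      for $uri in $batch
      let $doc := fn:doc($uri)
      (: Placeholder: the real target URI would be framed from node values. :)
      let $target := fn:concat($TARGET-DIR, fn:tokenize($uri, "/")[fn:last()])
      return (
        xdmp:document-insert($target, $doc),
        xdmp:document-delete($uri)
      )
    },
    <options xmlns="xdmp:eval">
      <transaction-mode>update-auto-commit</transaction-mode>
      <isolation>different-transaction</isolation>
    </options>
  )
```

The point is only that the outer query holds a list of URI strings, while the documents themselves are read and written inside the small inner transactions. If a batch of 100 still hits the cache limit, a smaller batch size would be the first thing to try.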

This particular use-case sounds like a content processing workflow. So you might be better off using a built-in product feature, the Content Processing Framework (CPF): http://docs.marklogic.com/guide/cpf has more about CPF.

Or an InfoStudio flow might be appropriate: http://docs.marklogic.com/guide/infostudio

Using an existing tool means that you don't have to reinvent the wheel.

Upvotes: 3
