Reputation:
I have an Alfresco model type with an additional property of type d:content. This property causes Solr exceptions when I try to store content larger than 32 KB in it. The current definition of this property is:
<property name="acme:secondContent">
    <type>d:content</type>
    <mandatory>false</mandatory>
    <index enabled="true">
        <atomic>true</atomic>
        <stored>true</stored>
        <tokenised>both</tokenised>
    </index>
</property>
If I put content larger than 32 KB into this property, Solr throws this exception when it tries to index it:
java.lang.IllegalArgumentException: Document contains at least one immense term in field="content@s____@{http://acme.com/model/custom/1.0}secondContent" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms.
Changing the index configuration does not help; the error is thrown with all variants of index and its sub-elements that I've tried.
An answer to another question states:
The maximum size for a single term in the underlying Lucene index is 32766 bytes, which is, I believe, hard-coded.
How do I configure the index of a d:content property so that I can save and index content larger than 32 KB?
Edit:
In contentModel.xml, cm:content is configured like this:
<index enabled="true">
    <atomic>true</atomic>
    <stored>false</stored>
    <tokenised>true</tokenised>
</index>
Adding a simple text/plain file with content larger than 32 KB works without problems. The same index configuration for my custom property still fails.
Update:
Under Alfresco 4.2f CE, the problem does not occur. So this is a bug in Alfresco 5.0c in combination with Solr 4.1.9.
Update 2:
I've filed a bug in the Alfresco JIRA.
Upvotes: 7
Views: 1069
Reputation: 2737
The solution is not to store the full document/property value in the index, so avoid stored=true and tokenised=both (or false) on large properties whose content can exceed 32 KB. Indexing should work if your model declaration looks like this:
<property name="acme:secondContent">
    <type>d:content</type>
    <mandatory>false</mandatory>
    <index enabled="true">
        <atomic>true</atomic>
        <stored>false</stored>
        <tokenised>true</tokenised>
    </index>
</property>
Drawback: in my test I had to drop the whole index; it was not sufficient to delete the cached models in Solr.
Upvotes: 2
Reputation: 466
Hypothesis 1
If your content really contains such very long terms (single words up to 32 KB in length), you have to implement your own Lucene analyzer to support that specific type of text, as sketched below. This limit comes from the default Lucene implementation, where the maximum term length is hard-coded.
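As an illustration, here is a minimal sketch of such an analyzer that simply drops oversized tokens instead of letting Lucene reject the whole document. It is written against the Lucene 5+ analyzer API (constructor signatures differ slightly in the Lucene 4.x bundled with Solr 4), and the class name and the 8000-character cut-off are assumptions for this example:
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.miscellaneous.LengthFilter;

// Sketch: skip "immense" terms so the rest of the document still gets indexed.
public class MaxTermLengthAnalyzer extends Analyzer {
    // Stay well below Lucene's hard limit of 32766 bytes per term;
    // a UTF-8 character can take up to 4 bytes, so 8000 characters
    // is a comfortable margin (assumption for this sketch).
    private static final int MAX_TERM_CHARS = 8000;

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer source = new WhitespaceTokenizer();
        // LengthFilter silently discards tokens outside [min, max],
        // so oversized terms are skipped rather than rejected.
        TokenStream filtered = new LengthFilter(source, 1, MAX_TERM_CHARS);
        return new TokenStreamComponents(source, filtered);
    }
}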
Hypothesis 2
Otherwise, if your content is not structured in the way described above, it sounds strange to me and could well be a bug. If tokenised=true does not solve it, a potential workaround could be to change the content model: add an association between the parent node and a dedicated node that holds the involved text in the default cm:content property (see the sketch below). I mean, using associations you should solve it ;)
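A minimal sketch of such a model change, assuming hypothetical type and association names (acme:parentType, acme:secondContentAssoc), could look like this:
<type name="acme:parentType">
    <parent>cm:content</parent>
    <associations>
        <!-- Hypothetical child association pointing to a plain cm:content
             node that carries the large text; as noted in the question,
             cm:content indexes content larger than 32 KB without problems. -->
        <child-association name="acme:secondContentAssoc">
            <source>
                <mandatory>false</mandatory>
                <many>false</many>
            </source>
            <target>
                <class>cm:content</class>
                <mandatory>false</mandatory>
                <many>false</many>
            </target>
        </child-association>
    </associations>
</type>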
Hope this helps.
Upvotes: 5