artemis_clyde

Reputation: 373

Solr doesn't index blob files

I am using collective.solr 4.1.0 for search on our Plone 4.2.6 system.

My issue is the following. On our Plone server, blob files are stored through the ZODB. When I try to build an index in Solr, I get this error:

INFO UniCMSData MISSING BLOB FILE: /opt/plone/data/blobstorage/0x31/0x37/0x32/0x36/0x39/0xa2/0xce/0x3e/0x03b3d7af6465c4cc.blob

The path looks correct to me, and I know that all blob files are stored under /opt/plone/data/blobstorage. That makes me wonder: did Solr fail to find the file, or is something badly wrong with my Solr configuration? Or does Solr not handle ZODB blobs properly?
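
One way to rule out a plain filesystem problem is to check the path from the error message as the user that runs the indexing. The following is a minimal sketch, assuming ZODB's default "bushy" blob layout, in which the eight 0xNN directories are the bytes of the object id and the file name is the transaction id. describe_blob is a hypothetical helper written for this check, not part of ZODB or collective.solr.

```python
import os
import re

# Assumed "bushy" layout: eight 0xNN directories (the object id bytes),
# followed by <transaction id>.blob as the file name.
BLOB_PATH = re.compile(r"((?:0x[0-9a-f]{2}/){8})(0x[0-9a-f]{16})\.blob$")


def describe_blob(path):
    """Return oid, tid and on-disk status for a blob path, or None if the
    path does not look like a bushy-layout blob file."""
    match = BLOB_PATH.search(path.replace(os.sep, "/"))
    if match is None:
        return None
    # Join the per-byte directory names back into one hex object id.
    parts = match.group(1).strip("/").split("/")
    oid = "".join(part[2:] for part in parts)
    return {
        "oid": "0x" + oid,
        "tid": match.group(2),
        "exists": os.path.isfile(path),
        "readable": os.access(path, os.R_OK),
    }


if __name__ == "__main__":
    print(describe_blob(
        "/opt/plone/data/blobstorage/0x31/0x37/0x32/0x36/0x39/0xa2/0xce/0x3e/"
        "0x03b3d7af6465c4cc.blob"
    ))
```

If "exists" comes back False, the file is really missing on disk; if it exists but "readable" is False, the indexing process probably lacks permission to read it.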

Grateful for any help :)

Upvotes: 3

Views: 311

Answers (1)

Mathias

Reputation: 6839

It's me again :-)

No, nothing is wrong with your Solr configuration.

Solr ships with Apache Tika by default, which can convert nearly everything to text/plain.

But you need at least collective.solr 5.0.1, because from this version on it's possible to extract the searchable text directly from the blob using the collective.solr BinaryIndexer.

If you're unable to upgrade your Plone site / collective.solr, you may install ftw.tika: https://pypi.python.org/pypi/ftw.tika/2.7.0

ftw.tika registers a Plone portal_transforms transform, which uses Tika to convert many file types to text/plain. You can also run Tika as a service.
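
To illustrate what running Tika as a service means, here is a minimal sketch that sends a file straight to a Tika server over HTTP and asks for plain text. It assumes a Tika server listening on its default port 9998 on localhost. It talks to Tika directly rather than through ftw.tika, whose own configuration is not shown here, and it uses the Python 3 standard library, so it is meant to be run as a standalone script outside the Plone 4.2 process. build_tika_request and extract_text are hypothetical helper names.

```python
import urllib.request

# Assumption: a Tika server running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data, url=TIKA_URL):
    """Build a PUT request asking Tika for a plain-text rendering of data."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path, url=TIKA_URL):
    """Send the file at path to the Tika server and return the extracted text."""
    with open(path, "rb") as handle:
        request = build_tika_request(handle.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling extract_text on a blob file from the blobstorage directory is a quick way to see whether Tika can read that file at all, independent of Plone and Solr.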

ftw.tika is Plone 4.2 compatible.

Upvotes: 3
