Reputation: 1185
I am trying to index Wikipedia's dump. In order to provide abstract for the articles (or, maybe, enable highlighting feature in future) I'd like to store their text without WikiMarkup. For the first try, it would be enough for me to leave just alphanumeric symbols. So the question is it possible to store the field, that is filtered at character level, not the original one?
Upvotes: 3
Views: 1072
Reputation: 9964
There is no way to do this out of the box. If you want Solr to do this, you can create your own UpdateHandler, but this might be a little tricky. The easiest way to do this would be to pre-process the document before sending it to Solr.
Upvotes: 2
Reputation: 22555
Solr by default stores original field values before the filters are been applied by the index time analyzers for your fieldType. So by default it is not storing the filtered value. However you have two options for getting the result that you want.
Upvotes: 1