Reputation: 31
I am unable to search for content containing special characters (?, *) or mathematical symbols (÷) in MarkLogic.
If I search for content with ÷, I do not get any results.
localhost:9000/v1/search?q=divide÷&collection=Math&options=searchmath&format=xml
Content: divide÷
I am using an index search on the element, and it looks like MarkLogic is not indexing the ÷ symbol.
Any ideas why MarkLogic is not returning a proper response for elements containing special characters?
Upvotes: 3
Views: 652
Reputation: 4912
The ÷ character is indexed as punctuation, which is to say, it is not indexed at all. If you look in tokenizer.xml you can see the classification of characters in various character ranges for the purposes of tokenization. You can define a tokenizer override on your field if you need to have this character be indexed.
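To illustrate what "indexed as punctuation" means in practice, here is a toy sketch in Python. It is not MarkLogic's tokenizer and does not read tokenizer.xml; the functions `word_tokens` and `unfiltered_match` are hypothetical names made up for this example, and the rule "letters and digits are word characters, everything else is dropped" is a simplifying assumption.

```python
def word_tokens(text: str) -> list[str]:
    """Toy tokenizer (an assumption, not MarkLogic's rules):
    keep runs of letters/digits as lowercased word tokens and
    drop every other character, including ÷, ? and *."""
    tokens, current = [], []
    for ch in text:
        if ch.isalnum():
            current.append(ch.lower())
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def unfiltered_match(query: str, content: str) -> bool:
    """Index-only style match: every word token of the query
    must appear among the word tokens of the content."""
    indexed = set(word_tokens(content))
    return all(tok in indexed for tok in word_tokens(query))


print(word_tokens("divide÷"))                        # ['divide']
print(unfiltered_match("divide÷", "divide by two"))  # True
```

In this toy model the ÷ never reaches the index, so a search for divide÷ behaves like a search for divide. A tokenizer override would correspond to changing the classification so that ÷ is kept as part of a token.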
However, I would expect false positives rather than false negatives in this case, since a term whose punctuation is dropped should still match on the remaining word. It might be useful to get the query plan and confirm that the character is making it through the layers of URL encoding, REST, etc. intact.
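One quick way to check the URL-encoding layer is to build the request URL yourself and look at the bytes that go over the wire. The sketch below uses only Python's standard library; the `http://` scheme is an assumption (the question shows the host and path without one), and the parameter values are copied from the question.

```python
from urllib.parse import quote, urlencode

# Parameters taken from the question's request.
params = {
    "q": "divide÷",
    "collection": "Math",
    "options": "searchmath",
    "format": "xml",
}

# urlencode percent-encodes non-ASCII characters as UTF-8 by default.
query_string = urlencode(params)
url = "http://localhost:9000/v1/search?" + query_string  # scheme is assumed
print(url)

# For comparison: what ÷ looks like when a client encodes it as Latin-1.
utf8_form = quote("÷")                        # UTF-8 bytes C3 B7
latin1_form = quote("÷", encoding="latin-1")  # single byte F7
print(utf8_form, latin1_form)
```

If the request that actually reaches the server carries `%F7` (or a raw, unencoded ÷) instead of `%C3%B7`, the character may be getting mangled before the search runs at all.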
As for ? and *: these are wildcard characters, so you have to make sure your query is unwildcarded. Again, in a non-wildcard query these are punctuation marks and are not indexed; you can only get accurate searches by doing filtered searches or exact value queries.
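As a rough sketch of the exact value query approach, the Python below builds a structured query and passes it to the same search endpoint. Several things here are assumptions: the element name `expression` is hypothetical (the question does not name the element), the `http://` scheme is assumed, and the JSON shape (`value-query`, `term-option`, the `structuredQuery` request parameter) reflects my understanding of the REST structured query syntax, so check it against the documentation for your MarkLogic version.

```python
import json
from urllib.parse import urlencode


def value_query(element_name: str, text: str, namespace: str = "") -> dict:
    """Build a structured query that matches the whole element value.
    The JSON layout is assumed from the structured query docs."""
    return {
        "query": {
            "queries": [
                {
                    "value-query": {
                        "element": {"name": element_name, "ns": namespace},
                        "text": [text],
                        # "exact" is assumed to imply unwildcarded and
                        # punctuation-sensitive matching.
                        "term-option": ["exact"],
                    }
                }
            ]
        }
    }


# "expression" is a hypothetical element name for illustration only.
structured = value_query("expression", "divide÷")

params = {
    "structuredQuery": json.dumps(structured, ensure_ascii=False),
    "collection": "Math",
    "format": "xml",
}
url = "http://localhost:9000/v1/search?" + urlencode(params)  # scheme is assumed
print(url)
```

Because the text is sent as a value rather than parsed as a query string, the ? and * characters are not interpreted by the string query grammar, and the ÷ is still percent-encoded as UTF-8 in the URL.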
Upvotes: 6