Solr regex query global flag

Question

I want to run the following regex query in Solr: name:/.+\.m+d$/. I have documents in my index with the following names:

readme.md
2013.02.26.md
test.mmd

None of them match. If I remove the $, the query matches the readme.md entry. I believe I need to specify a global pattern modifier, but I can't find the syntax for it.

arun · Accepted Answer

These are my observations based on experimenting with Solr regex matches:

Percent-encode (URL-encode) all the special characters in your regex. This site has been helpful for doing the percent encoding manually.
Make sure you do regex matching on string fields if you want to match the entire value. Regex matching on text fields will involve tokenization and will work according to which tokens got produced during indexing.
In Solr regexes, don't specify the start anchor ^ or the end anchor $: the pattern is always matched against the entire value, as if it were wrapped in ^ and $. To match only part of the value, put .* or .+ (or a similar sub-pattern) at the beginning or the end. As far as I can tell, the underlying Lucene regex syntax has no anchor operators at all, so a trailing $ is taken as a literal character.
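The whole-value matching described in the last point can be approximated outside Solr. The sketch below uses Python's re.fullmatch as a stand-in for that behaviour; it is not Lucene's regex engine, and the helper name whole_value_matches is made up for illustration.

```python
import re

# The three names from the question, as they would be stored in a string field.
names = ["readme.md", "2013.02.26.md", "test.mmd"]

def whole_value_matches(pattern, values):
    """Return the values that the pattern matches in full (implicit ^...$).

    Hypothetical helper: re.fullmatch only imitates the always-anchored
    behaviour described above, it does not run Lucene's regex syntax.
    """
    compiled = re.compile(pattern)
    return [v for v in values if compiled.fullmatch(v)]

# A leading .+ lets the pattern cover everything before the extension.
with_prefix = whole_value_matches(r".+\.m+d", names)

# Without it, the pattern would have to describe the entire value on its own.
without_prefix = whole_value_matches(r"\.m+d", names)

print(with_prefix)
print(without_prefix)
```

With the leading .+ all three names match; without it none do, because nothing in the pattern accounts for the part of the value before the dot.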

I just indexed the three values from your question in a string field and issued this query, and it matches all three documents:

q=id:/.%2B%5C.m%2Bd/

Decoded, .%2B%5C.m%2Bd is .+\.m+d, which behaves like the PCRE ^.+\.m+d$ because Solr matches against the whole value.
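If you would rather not encode by hand, the query string above can be reproduced with Python's standard library. This is only a sketch: the field name id comes from the query above, and leaving : and / unencoded is a choice made here to reproduce that query exactly.

```python
from urllib.parse import quote, unquote

field = "id"          # field used in the query above
regex = r".+\.m+d"    # no ^ or $, the whole value is matched anyway

# Wrap the regex in slashes, then percent-encode it for use in a URL.
# ":" and "/" are kept as-is so the result matches the query shown above.
raw_query = f"{field}:/{regex}/"
encoded_query = quote(raw_query, safe=":/")
print("q=" + encoded_query)

# Decoding the encoded pattern gives back the original regex.
decoded_regex = unquote(".%2B%5C.m%2Bd")
print(decoded_regex)
```

Encoding the + matters in particular: an unencoded + in a URL query string is normally decoded as a space, so the regex would reach Solr changed.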
