Chris

Reputation: 29742

Solr search with AND operator with strict ordering

Let's say I have a query like:

search_field:A AND search_field:B

that looks for a target that contains both A and B

so the result would be:

AcccccB
BcccccA
...

However, is there a way to preserve the order of the terms in the query, so that A must appear before B?

For example with pseudo query

search_field:A AND THEN search_field:B

which would yield

AcccccB
...

The idea is that the query matches A and B only in that order. So although BcccccA contains both A and B, it is filtered out because B comes before A.

Tried

AcccccB
Acc ccB
Bcc Acc < can't filter out

Thank you in advance. Let me know if I can make the question clearer.

Upvotes: 1

Views: 120

Answers (2)

MatsLindh

Reputation: 52822

If the distance between A and B is 99 positions or fewer, and each is a token by itself rather than part of another token, you can use the surround query parser:

q={!surround}99w(A,B)

99 is the maximum distance, w means ordered search (n would mean unordered).
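
As a rough illustration of how such a query might be sent from client code, here is a minimal Python sketch that assembles the request URL. The host, the core name mycore, the helper name surround_query_url, and the use of the df request parameter to select search_field are all assumptions for illustration, not something stated in this answer; check how your Solr version resolves the default field for the surround parser.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def surround_query_url(base_url, first, second, field, max_distance=99):
    """Build a /select URL for an ordered surround query: <n>w(first,second).

    Hypothetical helper: base_url and field are placeholders, and passing
    the field through `df` is an assumption about the target setup.
    """
    # The answer above treats 99 as the upper bound for the distance.
    if not 1 <= max_distance <= 99:
        raise ValueError("max_distance must be between 1 and 99")
    params = {
        # Doubled braces emit literal { and } inside the f-string.
        "q": f"{{!surround}}{max_distance}w({first},{second})",
        "df": field,
    }
    # urlencode percent-encodes the braces, parentheses, and comma.
    return f"{base_url}/select?{urlencode(params)}"


url = surround_query_url(
    "http://localhost:8983/solr/mycore", "A", "B", "search_field"
)
# Decode the query string again to check what Solr would receive.
print(parse_qs(urlsplit(url).query)["q"][0])  # {!surround}99w(A,B)
```

Swapping w for n in the template would give the unordered variant mentioned above.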

Upvotes: 1

user9712582

Reputation: 1683

1. string: IF the value of search_field is stored as one token (string field type), THEN you may be able to use a wildcard pattern or a regular expression to match the value. To match single-token string type fields, where an A appears before a B:

q=search_field:*A*B*

or

q=search_field:/.*A.*B.*/

For more, see this Solr Regex Tutorial. In the tutorial example the same value is stored twice, once in a string field and once in a text field.

An example of this in the Solr "techproducts" example data is the field pair: author (text) and author_s (string). Order is significant: The query q=author_s:*g*t* returns books by George R.R. Martin, and the query q=author_s:*t*g* returns a book by Grant Ingersoll.
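
To see why these two patterns enforce the order, the matching can be mimicked outside Solr. This is only a sketch in Python, not Solr itself: it assumes the whole field value is one token, and it uses re.fullmatch because a Solr regular expression has to match the entire term. The sample values are the ones from the question.

```python
import re
from fnmatch import fnmatchcase

# Sample values from the question, each treated as one single-token string.
values = ["AcccccB", "BcccccA", "Acc ccB", "Bcc Acc"]

# Equivalent of q=search_field:/.*A.*B.*/ (whole-value match).
ordered = re.compile(r".*A.*B.*")
regex_hits = [v for v in values if ordered.fullmatch(v)]

# Equivalent of q=search_field:*A*B* (case-sensitive wildcard match).
wildcard_hits = [v for v in values if fnmatchcase(v, "*A*B*")]

print(regex_hits)     # ['AcccccB', 'Acc ccB']
print(wildcard_hits)  # ['AcccccB', 'Acc ccB']
```

Both BcccccA and "Bcc Acc" are rejected because no B follows the A. On a tokenized text field, "Acc ccB" would be split into two tokens, so this single-token match would not apply there.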

2. text: IF the value of search_field is indexed as multiple tokens (such as when each word is a token), and A is in a separate token from B, THEN you may be able to use the Complex Phrase Query Parser with inOrder=true (default).

2a. text, adjacent tokens: IF A and B must appear in adjacent tokens in the field value, THEN a complex phrase query with no ~ proximity can be used:

{!complexphrase}search_field:"*A* *B*"

{!complexphrase}search_field:"/.*A.*/ /.*B.*/"

Adjacency example: In the "techproducts" sample data, {!complexphrase}author:"*t* *g*" does return the book by Grant Ingersoll, but {!complexphrase}author:"*g* *t*" does not return the books by George R.R. Martin.

2b. text, nearby: IF the tokens are not necessarily adjacent but are nearby, THEN use a complex phrase query, suffixed with a ~ proximity token count. For example, within 10 words or tokens:

{!complexphrase}search_field:"*A* *B*"~10

{!complexphrase}search_field:"/.*A.*/ /.*B.*/"~10

In the "techproducts" sample data, {!complexphrase}author:"*g* *t*"~10 does return the books by George R.R. Martin and not the book by Grant Ingersoll.
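
The difference between 2a and 2b can be sketched with a simplified model in Python. This is an approximation for intuition only: it assumes whitespace tokenization plus lowercasing (the real analyzer chain may differ), and it treats the ~ value as a plain "at most this many tokens in between" window, which is not how Lucene computes slop. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def tokens(value):
    # Assumption: whitespace tokenization and lowercasing only.
    return value.lower().split()


def ordered_token_match(value, first, second, max_gap=0):
    """True if a token matching `first` is followed, within max_gap
    intervening tokens, by a different token matching `second`."""
    toks = tokens(value)
    for i, tok in enumerate(toks):
        if not fnmatchcase(tok, first):
            continue
        # Only later tokens are inspected, which is what enforces order.
        window = toks[i + 1 : i + 2 + max_gap]
        if any(fnmatchcase(t, second) for t in window):
            return True
    return False


# 2a: adjacent tokens only.
print(ordered_token_match("Grant Ingersoll", "*t*", "*g*"))      # True
print(ordered_token_match("George R.R. Martin", "*g*", "*t*"))   # False

# 2b: allow up to 10 tokens in between.
print(ordered_token_match("George R.R. Martin", "*g*", "*t*", max_gap=10))  # True
print(ordered_token_match("Grant Ingersoll", "*g*", "*t*", max_gap=10))     # False
```

Because the second pattern is only checked against later tokens, a single token that contains both letters (such as "grant") does not count as a match on its own.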

Note: neither 2a nor 2b will match single token values where A is followed by B. To also include single-token value matches, specify the OR of a single token pattern and a multiple token pattern:

{!complexphrase} search_field:*A*B* OR search_field:"*A* *B*"~10
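
If the combined query is built in client code, a small helper keeps the two clauses in sync. A minimal Python sketch follows; the function name ordered_match_query is hypothetical, and it does no escaping of Solr special characters, so it is only suitable for simple terms like the A and B used here.

```python
def ordered_match_query(field, first, second, slop=10):
    """Combine the single-token and multi-token patterns with OR."""
    # Single-token case: A followed by B inside one token.
    single = f"{field}:*{first}*{second}*"
    # Multi-token case: a token containing A, then a token containing B.
    multi = f'{field}:"*{first}* *{second}*"~{slop}'
    # Doubled braces emit the literal {!complexphrase} prefix.
    return f"{{!complexphrase}} {single} OR {multi}"


print(ordered_match_query("search_field", "A", "B"))
# {!complexphrase} search_field:*A*B* OR search_field:"*A* *B*"~10
```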

Upvotes: 2
