Reputation: 561
I just found out that a Solr server can find words that are within a given distance of another word, like this:
text_original : "word1 word2"~10
So Solr searches for word1 with a word2 at most 10 words away from it.
great, YAY
But now I want to do the same with some undefined numbers: I want to look for numbers that occur within a given range of some keywords. As a regex I would write something like this:
myWord(\s)+(([A-Za-z]+)\s){0,10}([0-9]{3,12}(\.|\,)[0-9]{1,4})
or something like that.
So I thought it would be easy to do in Solr, similar to searching for words within a range:
text_original: Word1 /[0-9]{3,12}/~10
But the two terms are now combined with OR, so I find either numbers or my given word. And I can't use quotation marks, because then the regex won't work.
Can anyone give me a hint on how these search terms have to be arranged so that the search works as described?
Upvotes: 0
Views: 414
Reputation: 33341
You can do this through the ComplexPhraseQueryParser (exposed in Solr as the complexphrase query parser), with a query like:
text_original:"Word1 /[0-9]{3,12}/"~10
Keep in mind that a regex query in Lucene must match the whole term, so this would not match "word1 word2", but it would match "word1 extra stuff 200". Slop also seemed a bit odd in my testing.
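The whole-term rule can be illustrated outside Lucene. The sketch below uses java.util.regex as a stand-in, which is an assumption: Lucene has its own regexp syntax, although this simple pattern reads the same in both. The class and method names are hypothetical. Matcher.matches() requires the entire string to match, which is the behaviour described above, whereas Matcher.find() would also accept a partial hit.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Lucene or Solr.
final class WholeTermRegexSketch {
    // Same pattern as in the query above; java.util.regex stands in for Lucene's regexp engine.
    static final Pattern NUMBER = Pattern.compile("[0-9]{3,12}");

    // Whole-term semantics: the entire term must match the pattern.
    static boolean matchesWholeTerm(String term) {
        return NUMBER.matcher(term).matches();
    }

    // Substring semantics, shown only for contrast; this is NOT how the regexp query behaves.
    static boolean matchesAnywhere(String term) {
        return NUMBER.matcher(term).find();
    }

    public static void main(String[] args) {
        for (String term : new String[] {"200", "99", "word2", "12345.67", "abc123456"}) {
            System.out.println(term + " whole=" + matchesWholeTerm(term)
                    + " anywhere=" + matchesAnywhere(term));
        }
    }
}
```

Whether a value such as 12345.67 is indexed as one term or split into several depends on the analyzer configured for the field, so it is worth checking the indexed terms before settling on the pattern.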
If you are willing to fall back on writing a raw Lucene query, you can also accomplish it using the SpanQuery API, such as:
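// Span classes: org.apache.lucene.search.spans (org.apache.lucene.queries.spans since Lucene 9)
// SpanTermQuery is not analyzed, so the term must equal the indexed form (e.g. lowercased).
SpanQuery wordQuery = new SpanTermQuery(new Term("text_original", "Word1"));
// RegexpQuery takes a Term; the wrapper lets the multi-term query take part in span matching.
SpanQuery numQuery = new SpanMultiTermQueryWrapper<RegexpQuery>(new RegexpQuery(new Term("text_original", "[0-9]{3,12}")));
// Slop of 10, inOrder = false: the number may appear before or after the word.
Query proxQuery = new SpanNearQuery(new SpanQuery[] {wordQuery, numQuery}, 10, false);
searcher.search(proxQuery, numHits);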
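What the unordered span query is asking for can be sketched in plain Java, without Lucene. This is only a rough model under stated assumptions: text is tokenized on whitespace, the keyword must equal a token exactly (as with an unanalyzed term), and slop is taken to be the number of tokens allowed between the keyword and the number. Lucene's real slop accounting may differ in edge cases. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class; a toy model of "keyword near number", not Lucene code.
final class ProximitySketch {
    static final Pattern NUMBER = Pattern.compile("[0-9]{3,12}");

    // True if a token that is entirely a 3-12 digit number sits within `slop`
    // intervening tokens of `keyword`, on either side (unordered).
    static boolean numberNearKeyword(String text, String keyword, int slop) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(keyword)) {
                continue; // exact match only, mirroring an unanalyzed term query
            }
            for (int j = 0; j < tokens.length; j++) {
                if (j == i) {
                    continue;
                }
                int tokensBetween = Math.abs(i - j) - 1; // assumed meaning of slop
                if (tokensBetween <= slop && NUMBER.matcher(tokens[j]).matches()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(numberNearKeyword("Word1 extra stuff 200", "Word1", 10));
        System.out.println(numberNearKeyword("Word1 a b c 200", "Word1", 2));
        System.out.println(numberNearKeyword("200 Word1", "Word1", 0));
    }
}
```

A model like this can be handy for writing expected results for a handful of test documents before comparing them against what the index actually returns.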
Upvotes: 1