Ashishkumar Chougule
Ashishkumar Chougule

Reputation: 121

lucene wildcard query with space

I have Lucene index which has city names. Consider I want to search for 'New Delhi'. I have string 'New Del' which I want to pass to Lucene searcher and I am expecting output as 'New Delhi'. If I generate query like Name:New Del* It will give me all cities with 'New and Del'in it. Is there any way by which I can create Lucene query wildcard query with spaces in it? I referred and tried few solutions given @ http://www.gossamer-threads.com/lists/lucene/java-user/5487

Upvotes: 8

Views: 10764

Answers (2)

Nrj
Nrj

Reputation: 6831

As I learnt from the thread mentioned by you (http://www.gossamer-threads.com/lists/lucene/java-user/5487), you can either do an exact match with space or treat either parts w/ wild card.

So something like this should work - [New* Del*]

Upvotes: 0

femtoRgon
femtoRgon

Reputation: 33341

It sounds like you have indexed your city names with analysis. That will tend to make this more difficult. With analysis, "new" and "delhi" are separate terms, and must be treated as such. Searching over multiple terms with wildcards like this tends to be a bit more difficult.

The easiest solution would be to index your city names without tokenization (lowercasing might not be a bad idea though). Then you would be able to search with the query parser simply by escaping the space:

QueryParser parser = new QueryParser("defaultField", analyzer);
Query query = parser.parse("cityname:new\\ del*");

Or you could use a simple WildcardQuery:

Query query = new WildcardQuery(new Term("cityname", "new del*"));

With the field analyzed by standard analyzer:

You will need to rely on SpanQueries, something like this:

SpanQuery queryPart1 = new SpanTermQuery(new Term("cityname", "new"));
SpanQuery queryPart2 = new SpanMultiTermQueryWrapper(new WildcardQuery(new Term("cityname", "del*")));
Query query = new SpanNearQuery(new SpanQuery[] {query1, query2}, 0, true);

Or, you can use the surround query parser (which provides query syntax intended to provide more robust support of span queries), using a query like W(new, del*):

org.apache.lucene.queryparser.surround.parser.QueryParser surroundparser = new org.apache.lucene.queryparser.surround.parser.QueryParser();
SrndQuery srndquery = surroundparser.parse("W(new, del*)");
query = srndquery.makeLuceneQueryField("cityname", new BasicQueryFactory());

Upvotes: 9

Related Questions