Reputation: 4465
I try to understand how the lucene query syntax works so I wrote this small program. When using a NumericRangeQuery I can find the documents I want but when trying to parse a search condition, it can't find any hits, although I'm using the same conditions. i understand the difference can be explained by the analyzer but the StandardAnalyzer is used which does not remove numeric values.
Can someone tell me what I'm doing wrong ? Thanks.
package org.burre.lucene.matching;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.*;
import org.apache.lucene.util.Version;
public class SmallestEngine {
private static final Version VERSION=Version.LUCENE_48;
private StandardAnalyzer analyzer = new StandardAnalyzer(VERSION);
private Directory index = new RAMDirectory();
private Document buildDoc(String name, int beds) {
Document doc = new Document();
doc.add(new StringField("name", name, Field.Store.YES));
doc.add(new IntField("beds", beds, Field.Store.YES));
return doc;
}
public void buildSearchEngine() throws IOException {
IndexWriterConfig config = new IndexWriterConfig(VERSION,
analyzer);
IndexWriter w = new IndexWriter(index, config);
// Generate 10 houses with 0 to 3 beds
for (int i=0;i<10;i++)
w.addDocument(buildDoc("house"+(100+i),i % 4));
w.close();
}
/**
* Execute the query and show the result
*/
public void search(Query q) throws IOException {
System.out.println("executing query\""+q+"\"");
IndexReader reader = DirectoryReader.open(index);
try {
IndexSearcher searcher = new IndexSearcher(reader);
ScoreDoc[] hits = searcher.search(q, 10).scoreDocs;
System.out.println("Found " + hits.length + " hits.");
for (int i = 0; i < hits.length; ++i) {
int docId = hits[i].doc;
Document d = searcher.doc(docId);
System.out.println(""+(i+1)+". " + d.get("name") + ", beds:"
+ d.get("beds"));
}
} finally {
if (reader != null)
reader.close();
}
}
public static void main(String[] args) throws IOException, ParseException {
SmallestEngine me = new SmallestEngine();
me.buildSearchEngine();
System.out.println("SearchByRange");
me.search(NumericRangeQuery.newIntRange("beds", 3, 3,true,true));
System.out.println("-----------------");
System.out.println("SearchName");
me.search(new QueryParser(VERSION,"name",me.analyzer).parse("house107"));
System.out.println("-----------------");
System.out.println("Search3Beds");
me.search(new QueryParser(VERSION,"beds",me.analyzer).parse("3"));
System.out.println("-----------------");
System.out.println("Search3BedsInRange");
me.search(new QueryParser(VERSION,"name",me.analyzer).parse("beds:[3 TO 3]"));
}
}
The output of this program is:
SearchByRange
executing query"beds:[3 TO 3]"
Found 2 hits.
1. house103, beds:3
2. house107, beds:3
-----------------
SearchName
executing query"name:house107"
Found 1 hits.
1. house107, beds:3
-----------------
Search3Beds
executing query"beds:3"
Found 0 hits.
-----------------
Search3BedsInRange
executing query"beds:[3 TO 3]"
Found 0 hits.
Upvotes: 0
Views: 5464
Reputation: 4465
It seems like you always have to use the NumericRangeQuery for numeric conditions. (thanks to Mindas) so as he suggested I created My own more intelligent QueryParser. Using the Apache commons-lang function StringUtils.isNumeric() I can create a more generic QueryParser:
public class IntelligentQueryParser extends QueryParser {
// take over super constructors
@Override
protected org.apache.lucene.search.Query newRangeQuery(String field,
String part1, String part2, boolean part1Inclusive, boolean part2Inclusive) {
if(StringUtils.isNumeric(part1))
{
return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1),Integer.parseInt(part2),part1Inclusive,part2Inclusive);
}
return super.newRangeQuery(field, part1, part2, part1Inclusive, part2Inclusive);
}
@Override
protected org.apache.lucene.search.Query newTermQuery(
org.apache.lucene.index.Term term) {
if(StringUtils.isNumeric(term.text()))
{
return NumericRangeQuery.newIntRange(term.field(), Integer.parseInt(term.text()),Integer.parseInt(term.text()),true,true);
}
return super.newTermQuery(term);
}
}
Just wanted to share this.
Upvotes: 0
Reputation: 26703
What you need to do is to write your own QueryParser
:
public class CustomQueryParser extends QueryParser {
// ctor omitted
@Override
public Query newTermQuery(Term term) {
if (term.field().equals("beds")) {
// manually construct and return non-range query for numeric value
} else {
return super.newTermQuery(term);
}
}
@Override
public Query newRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive) {
if (field.equals("beds")) {
// manually construct and return range query for numeric value
} else {
return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
}
}
}
Upvotes: 0
Reputation: 77
You need to use NumericRangeQuery to perform a search on the numeric field.
The answer here could give you some insight.
Also the answer here says
for numeric values (longs, dates, floats, etc.) you need to have NumericRangeQuery. Otherwise Lucene has no idea how do you want to define similarity.
Upvotes: 1