Nageswaran

Reputation: 7651

How do I make the QueryParser in Lucene handle numeric ranges?

new QueryParser(.... ).parse (somequery);

It works only for fields indexed as strings. Say I have a field called count, which is an integer field (I took the data type into account when indexing it).

new QueryParser(....).parse("count:[1 TO 10]");

The above does not work, whereas NumericRangeQuery.newIntRange does. However, I need the QueryParser approach to work.

Upvotes: 7

Views: 6549

Answers (5)

huha

Reputation: 4245

I adapted Jeremie's answer for C# and Lucene.Net 3.0.3. I also needed type double instead of int. This is my code:

using System.Globalization;
using Lucene.Net.Analysis;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Util;
using Version = Lucene.Net.Util.Version;

namespace SearchServer.SearchEngine
{
    internal class SearchQueryParser : QueryParser
    {
        public SearchQueryParser(Analyzer analyzer)
            : base(Version.LUCENE_30, null, analyzer)
        {
        }

        private const NumberStyles DblNumberStyles = NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite | NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint;

        protected override Query NewRangeQuery(string field, string part1, string part2, bool inclusive)
        {
            if (field == "p")
            {
                double part1Dbl;
                if (!double.TryParse(part1, DblNumberStyles, CultureInfo.InvariantCulture, out part1Dbl))
                    throw new ParseException($"Error parsing value {part1} for field {field} as double.");
                double part2Dbl;
                if (!double.TryParse(part2, DblNumberStyles, CultureInfo.InvariantCulture, out part2Dbl))
                    throw new ParseException($"Error parsing value {part2} for field {field} as double.");
                return NumericRangeQuery.NewDoubleRange(field, part1Dbl, part2Dbl, inclusive, inclusive);
            }
            return base.NewRangeQuery(field, part1, part2, inclusive);
        }

        protected override Query NewTermQuery(Term term)
        {
            if (term.Field == "p")
            {
                double dblParsed;
                if (!double.TryParse(term.Text, DblNumberStyles, CultureInfo.InvariantCulture, out dblParsed))
                    throw new ParseException($"Error parsing value {term.Text} for field {term.Field} as double.");
                return new TermQuery(new Term(term.Field, NumericUtils.DoubleToPrefixCoded(dblParsed)));
            }
            return base.NewTermQuery(term);
        }
    }
}

I improved my code to also support open-ended (greater-than and less-than) queries when an asterisk is passed, e.g. p:[* TO 5]

...
    double? part1Dbl = null;
    double tmpDbl;
    if (part1 != "*")
    {
        if (!double.TryParse(part1, DblNumberStyles, CultureInfo.InvariantCulture, out tmpDbl))
            throw new ParseException($"Error parsing value {part1} for field {field} as double.");
        part1Dbl = tmpDbl;
    }
    double? part2Dbl = null;
    if (part2 != "*")
    {
        if (!double.TryParse(part2, DblNumberStyles, CultureInfo.InvariantCulture, out tmpDbl))
            throw new ParseException($"Error parsing value {part2} for field {field} as double.");
        part2Dbl = tmpDbl;
    }
    return NumericRangeQuery.NewDoubleRange(field, part1Dbl, part2Dbl, inclusive, inclusive);
...
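For readers on the Java side, the same open-bound idea can be sketched without any Lucene dependency. This is only an illustration under assumptions: the class name OpenRangeBounds and both helper methods are hypothetical, and Double.valueOf accepts a somewhat different set of inputs than the NumberStyles flags used in the C# code above. A null bound stands for "unbounded", which is also how Lucene's Java NumericRangeQuery.newDoubleRange treats a null Double argument.

```java
final class OpenRangeBounds {

    /** Returns null for "*" (an open bound), otherwise the parsed double. */
    static Double parseBound(String text) {
        if ("*".equals(text)) {
            return null;
        }
        try {
            return Double.valueOf(text.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Error parsing value " + text + " as double.", e);
        }
    }

    /** Checks a value against optional bounds; a null bound means "unbounded". */
    static boolean inRange(double value, Double low, Double high, boolean inclusive) {
        boolean aboveLow = low == null || (inclusive ? value >= low : value > low);
        boolean belowHigh = high == null || (inclusive ? value <= high : value < high);
        return aboveLow && belowHigh;
    }

    public static void main(String[] args) {
        // Mirrors the query p:[* TO 5]
        Double low = parseBound("*");
        Double high = parseBound("5");
        System.out.println(inRange(3.0, low, high, true));
        System.out.println(inRange(7.0, low, high, true));
    }
}
```

In a real parser override, the two nullable values would be passed straight to the range query factory instead of to inRange.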

Upvotes: 0

E_net4

Reputation: 29972

In Lucene 6, the protected method QueryParser#getRangeQuery still exists with the argument list (String fieldName, String low, String high, boolean startInclusive, boolean endInclusive), and overriding it to interpret the range as a numeric range is indeed possible, as long as that information is indexed using one of the new Point fields.

When indexing your field:

document.add(new FloatPoint("_point_count", value)); // index for efficient range based retrieval
document.add(new StoredField("count", value)); // if you need to store the value itself

In your custom query parser (extending org.apache.lucene.queryparser.classic.QueryParser), override the method with something like this:

@Override
protected Query getRangeQuery(String field, String low, String high, boolean startInclusive, boolean endInclusive) throws ParseException
{
    if («isNumericField»(field)) // context dependent
    {
        final String pointField = "_point_" + field;
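        // Note: FloatPoint.newRangeQuery treats both bounds as inclusive, so
        // startInclusive and endInclusive are not honored here.
        return FloatPoint.newRangeQuery(pointField,
                Float.parseFloat(low),
                Float.parseFloat(high));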
    }

    return super.getRangeQuery(field, low, high, startInclusive, endInclusive);
}
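Since point range queries are inclusive on both ends, an exclusive bound such as {1 TO 10} needs to be converted before it is passed to FloatPoint.newRangeQuery. A minimal sketch of that conversion, assuming the usual approach of stepping to the adjacent float value with Math.nextUp and Math.nextDown; the class name PointRangeBounds and its two helpers are hypothetical and not part of Lucene:

```java
final class PointRangeBounds {

    /** Lower bound to pass to an inclusive range query. */
    static float lowerBound(float low, boolean startInclusive) {
        // For an exclusive start, move to the next representable float above low.
        return startInclusive ? low : Math.nextUp(low);
    }

    /** Upper bound to pass to an inclusive range query. */
    static float upperBound(float high, boolean endInclusive) {
        // For an exclusive end, move to the next representable float below high.
        return endInclusive ? high : Math.nextDown(high);
    }

    public static void main(String[] args) {
        // Bounds for the query count:{1 TO 10]
        System.out.println(lowerBound(1.0f, false));
        System.out.println(upperBound(10.0f, true));
    }
}
```

In the override above, the parsed low and high values would go through these helpers before being handed to the range query.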

Upvotes: 1

Jeremie

Reputation: 1307

Had the same issue and solved it, so here I share my solution:

To create a custom query parser that parses queries such as "INTFIELD_NAME:1203" or "INTFIELD_NAME:[1 TO 10]" and treats the field INTFIELD_NAME as an IntField, I overrode newRangeQuery and newTermQuery as follows:

public class CustomQueryParser extends QueryParser {

    public CustomQueryParser(String f, Analyzer a) {
        super(f, a);
    }

    @Override
    protected Query newRangeQuery(String field, String part1, String part2, boolean startInclusive,
            boolean endInclusive) {
        // INTFIELD_NAME is the name of the field that was indexed as an IntField.
        if (INTFIELD_NAME.equals(field)) {
            return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1), Integer.parseInt(part2),
                    startInclusive, endInclusive);
        }
        return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
    }

    @Override
    protected Query newTermQuery(Term term) {
        if (INTFIELD_NAME.equals(term.field())) {
            // Encode the integer as a prefix-coded term with shift 0 (full precision).
            BytesRefBuilder byteRefBuilder = new BytesRefBuilder();
            NumericUtils.intToPrefixCoded(Integer.parseInt(term.text()), 0, byteRefBuilder);
            TermQuery tq = new TermQuery(new Term(term.field(), byteRefBuilder.get()));
            return tq;
        }
        return super.newTermQuery(term);
    }
}

I took the code quoted in the thread at http://www.mail-archive.com/[email protected]&q=subject:%22Re%3A+How+do+you+properly+use+NumericField%22&o=newest&f=1 and made three modifications:

  • rewrote newRangeQuery a little more cleanly

  • in newTermQuery, replaced

    NumericUtils.intToPrefixCoded(Integer.parseInt(term.text()),NumericUtils.PRECISION_STEP_DEFAULT)));

    with

    NumericUtils.intToPrefixCoded(Integer.parseInt(term.text()), 0, byteRefBuilder);

    When I first used this method in a filter on the same numeric field, I passed 0 because I found it used as a default value in a Lucene class, and it just worked.

  • in newTermQuery, replaced

    TermQuery tq = new TermQuery(new Term(field,

    with

    TermQuery tq = new TermQuery(new Term(term.field(),

    Using "field" is wrong: if your query has several clauses (FIELD:text OR INTFIELD:100), it picks up the field of the first or previous clause.

Upvotes: 9

Nathan Ridley

Reputation: 34396

You need to inherit from QueryParser and override GetRangeQuery(string field, ...). If field is one of your numeric field names, return an instance of NumericRangeQuery, otherwise return base.GetRangeQuery(...).

There is an example of such an implementation in this thread: http://www.mail-archive.com/[email protected]/msg29062.html

Upvotes: 2

chiku

Reputation: 258

QueryParser won't create a NumericRangeQuery as it has no way to know whether a field was indexed with NumericField. Just extend the QueryParser to handle this case.
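To see why a plain text range cannot stand in for a numeric one, here is a simplified illustration in plain Java. It is not Lucene's actual code: the class RangeOrderingDemo and its methods are hypothetical, and they only mimic the lexicographic comparison that a term range over text performs, next to a true numeric comparison.

```java
final class RangeOrderingDemo {

    /** Lexicographic range check, comparable in spirit to a range over text terms. */
    static boolean inStringRange(String value, String low, String high) {
        return value.compareTo(low) >= 0 && value.compareTo(high) <= 0;
    }

    /** Numeric range check, which is what count:[1 TO 10] is meant to express. */
    static boolean inIntRange(int value, int low, int high) {
        return value >= low && value <= high;
    }

    public static void main(String[] args) {
        // As text, "5" sorts after "10", so it falls outside ["1", "10"].
        System.out.println(inStringRange("5", "1", "10")); // prints false
        System.out.println(inIntRange(5, 1, 10));          // prints true
    }
}
```

The parser therefore has to be told which fields are numeric, which is what the overrides in the other answers do.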

Upvotes: 1
