Reputation: 61
I need some help doing a search. Say I have a really simple document structure, just 1 field, labeled name. I need to retrieve all the names whose length is more or less than a specified value. By length I mean String.length(). A range filter seems close in concept, but I couldn't find a good example to write my specific case. Thanks for the help.
Upvotes: 4
Views: 3262
Reputation: 61
Add a NumericField that stores the length, then use a NumericRangeQuery on it. See the NumericField javadocs for an example.
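The idea can be sketched without Lucene at all: compute the length once when the name is added, keep it in a sorted structure, and answer "shorter than" or "longer than" as a range over that stored length. The sketch below is a plain-Java analogy, not Lucene API. `LengthIndexSketch`, `add` and `lengthBetween` are hypothetical names, and the `TreeMap` only stands in for the extra numeric field and the numeric range query over it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class LengthIndexSketch {
    // Maps a precomputed length to the names of that length.
    // This plays the role of the extra numeric field (assumption: analogy only).
    private final NavigableMap<Integer, List<String>> byLength = new TreeMap<>();

    // "Index time": store the name under its length.
    void add(String name) {
        byLength.computeIfAbsent(name.length(), k -> new ArrayList<>()).add(name);
    }

    // "Query time": inclusive range over the stored lengths (requires min <= max).
    List<String> lengthBetween(int min, int max) {
        List<String> hits = new ArrayList<>();
        for (List<String> names : byLength.subMap(min, true, max, true).values()) {
            hits.addAll(names);
        }
        return hits;
    }

    public static void main(String[] args) {
        LengthIndexSketch index = new LengthIndexSketch();
        for (String n : new String[] {"al", "bo", "christine", "dave"}) {
            index.add(n);
        }
        System.out.println(index.lengthBetween(0, 3));                 // names shorter than 4
        System.out.println(index.lengthBetween(5, Integer.MAX_VALUE)); // names of length 5 or more
    }
}
```

"Less than N" becomes the range 0 to N-1 and "more than N" becomes N+1 to the maximum, which is why a single range query covers both cases from the question.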
Upvotes: 2
Reputation: 27923
This is a classic use case for a MultiTermQuery. It's not available out of the box, but it's easy to implement. Take a look at WildcardQuery, which extends MultiTermQuery and does something very similar. Just use a different FilteredTermEnum, like the one below, which filters terms by the length of the term text rather than by the text itself.
The magic happens here (this code is in the custom term enumerator at the bottom of my post):
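protected internal override bool TermCompare(Term term)
{
    if (field == term.Field())
    {
        // Accept any term at least as long as the search term's text.
        // A shorter term is skipped; it must not end the enumeration,
        // because terms are sorted by text, not by length.
        return term.Text().Length >= text.Length;
    }
    // We have walked past the last term of this field: stop enumerating.
    endEnum = true;
    return false;
}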
The above code looks through all the terms for the field and compares each term's length with the length of the text of the term passed to the constructor. It returns true for any term that is at least that long.
public class MinLengthQuery : MultiTermQuery
{
    public MinLengthQuery(Term term) : base(term)
    {
    }

    protected internal override FilteredTermEnum GetEnum(IndexReader reader)
    {
        return new MinLengthTermEnum(reader, GetTerm());
    }
}
This class does all the work:
public class MinLengthTermEnum : FilteredTermEnum
{
    internal Term searchTerm;
    internal System.String field = "";
    internal System.String text = "";
    internal bool endEnum = false;

    public MinLengthTermEnum(IndexReader reader, Term term) : base()
    {
        searchTerm = term;
        field = searchTerm.Field();
        text = searchTerm.Text();
        // Start at the first term of the field.
        SetEnum(reader.Terms(new Term(searchTerm.Field(), "")));
    }

    protected internal override bool TermCompare(Term term)
    {
        if (field == term.Field())
        {
            // Accept any term at least as long as the search term's text.
            // A shorter term is skipped; it must not end the enumeration,
            // because terms are sorted by text, not by length.
            return term.Text().Length >= text.Length;
        }
        // We have walked past the last term of this field: stop enumerating.
        endEnum = true;
        return false;
    }

    public override float Difference()
    {
        // Every matching term gets the same weight.
        return 1.0f;
    }

    public override bool EndEnum()
    {
        return endEnum;
    }

    public override void Close()
    {
        base.Close();
        searchTerm = null;
        field = null;
        text = null;
    }
}
(I'm a Lucene.Net guy, but the translation ought to be easy enough. It would probably be easier to start with your version of Lucene's source code for WildcardQuery and its term enum, WildcardTermEnum, and work from that.)
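The filtering rule itself does not depend on Lucene, so it can be tried out in plain Java before porting the classes. The following is only a sketch of the walk the term enumerator performs: `TermEntry` and `minLengthTerms` are hypothetical names, and the list stands in for a term dictionary sorted by field and then by text. Short terms are skipped rather than treated as the end of the field, since terms are ordered by text and not by length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MinLengthSketch {
    // Hypothetical stand-in for a Lucene term: a (field, text) pair.
    static final class TermEntry {
        final String field;
        final String text;

        TermEntry(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Walks terms sorted by field, then text, and keeps the texts in
    // `field` whose length is at least minLength.
    static List<String> minLengthTerms(List<TermEntry> sortedTerms, String field, int minLength) {
        List<String> hits = new ArrayList<>();
        boolean inField = false;
        for (TermEntry t : sortedTerms) {
            if (!t.field.equals(field)) {
                if (inField) {
                    break;    // walked past the field: safe to stop
                }
                continue;     // have not reached the field yet
            }
            inField = true;
            if (t.text.length() >= minLength) {
                hits.add(t.text);
            }
            // A shorter term is simply skipped; longer ones may still follow.
        }
        return hits;
    }

    public static void main(String[] args) {
        List<TermEntry> terms = Arrays.asList(
            new TermEntry("name", "al"),
            new TermEntry("name", "alexander"),
            new TermEntry("name", "bo"),
            new TermEntry("name", "christine"),
            new TermEntry("title", "administrator"));
        System.out.println(minLengthTerms(terms, "name", 5));
    }
}
```

With the sample terms above, the short entries "al" and "bo" are skipped and the walk stops only when it reaches the "title" field.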
Upvotes: 0