Reputation: 54561
Given an instance of Query, is it possible to somehow check whether that instance happens to represent a query that always matches all the documents in the index?
For example, a MatchAllDocsQuery or a BooleanQuery that contains a MatchAllDocs clause are such queries that always return all the documents. Another example is a BooleanQuery that has a SHOULD-match clause that has a nested SHOULD-match clause that has a MatchAllDocs inside it.
Note that a query that happens to return everything because it has all the possible terms in it, or because the index is empty, doesn't count as a query that always returns all the documents. In other words, I would like to check whether a given query always returns everything no matter what the index contains.
Is it possible, or at least approximately possible? I'll accept an answer with a solution that doesn't work for every conceivable case, as long as it works for any query that can be returned from Solr's Extended Dismax Query Parser.
Upvotes: 2
Views: 641
Reputation: 33351
A BooleanQuery that contains a MatchAllDocsQuery as one of its clauses doesn't necessarily return all documents, as the BooleanQuery may also contain other MUST or MUST_NOT clauses which would restrict the result set. I don't believe there is anything that does this out of the box, and trying to handle any sort of query that Solr might spit out would be difficult. You would need to move through the queries recursively to ensure that everything effectively reduces to a MatchAllDocsQuery, ignoring scores.
Something like (this is entirely untested at this point):
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.ConstantScoreQuery;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;

boolean willMatchAll(Query query) {
    if (query instanceof MatchAllDocsQuery) {
        return true;
    }
    else if (query instanceof BooleanQuery) {
        boolean foundMatchAll = false;
        for (BooleanClause clause : ((BooleanQuery) query).getClauses()) {
            if (clause.isProhibited()) {
                return false; //A reasonable assumption, that the MUST_NOT clauses won't be empty
            }
            else if (clause.isRequired()) {
                if (willMatchAll(clause.getQuery())) {
                    foundMatchAll = true;
                } else {
                    return false; //any MUST clause that is not a matchall means the boolean query will not match all
                }
            }
            else {
                //SHOULD clause: it can only widen the result set
                if (willMatchAll(clause.getQuery())) {
                    foundMatchAll = true;
                }
            }
        }
        //If a matchall has been found, and we haven't returned false yet, this boolean query matches all documents
        return foundMatchAll;
    }
    else if (query instanceof DisjunctionMaxQuery) {
        boolean isMatchAll = false;
        //If any disjunct is a matchall, the query will match all documents
        for (Query subquery : ((DisjunctionMaxQuery) query).getDisjuncts()) {
            isMatchAll = isMatchAll || willMatchAll(subquery);
        }
        return isMatchAll;
    }
    else if (query instanceof ConstantScoreQuery) {
        //Traverse right through ConstantScoreQuery. The wrapper isn't of interest here.
        Query subquery = ((ConstantScoreQuery) query).getQuery();
        if (subquery == null) {
            return false; //It wraps a filter, not a query, and I don't believe a filter can be a matchall
        }
        return willMatchAll(subquery);
    }
    else {
        //No other standard queries may be or contain MatchAllDocsQueries, I don't believe.
        //Even a double open-ended range query restricts the results to those with a value in the specified field.
        return false;
    }
}
And if you also wanted to handle the stuff in org.apache.lucene.queries, there would be more query types to handle, like BoostingQuery and CustomScoreQuery, among others. But hopefully that gives some sort of idea on it.
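To sanity-check the recursion without a Lucene dependency, here is a minimal sketch that runs the same rules over a tiny stand-in query tree. Everything in it (MatchAllSketch, Q, MatchAll, Term, Bool, DisMax, Clause, Occur) is a hypothetical stand-in invented for illustration, not Lucene's API; only the decision logic mirrors the method above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in query tree (NOT Lucene classes) used only to exercise the recursive check.
final class MatchAllSketch {

    enum Occur { MUST, SHOULD, MUST_NOT }

    interface Q {}

    // Stand-in for MatchAllDocsQuery.
    static final class MatchAll implements Q {}

    // Stand-in for any query whose result depends on index contents (e.g. a term query).
    static final class Term implements Q {
        final String text;
        Term(String text) { this.text = text; }
    }

    static final class Clause {
        final Occur occur;
        final Q query;
        Clause(Occur occur, Q query) { this.occur = occur; this.query = query; }
    }

    // Stand-in for BooleanQuery.
    static final class Bool implements Q {
        final List<Clause> clauses = new ArrayList<>();
        Bool add(Occur occur, Q query) { clauses.add(new Clause(occur, query)); return this; }
    }

    // Stand-in for DisjunctionMaxQuery.
    static final class DisMax implements Q {
        final List<Q> disjuncts;
        DisMax(Q... disjuncts) { this.disjuncts = Arrays.asList(disjuncts); }
    }

    static boolean willMatchAll(Q query) {
        if (query instanceof MatchAll) {
            return true;
        }
        if (query instanceof Bool) {
            boolean foundMatchAll = false;
            for (Clause c : ((Bool) query).clauses) {
                if (c.occur == Occur.MUST_NOT) {
                    return false; // assume a prohibited clause excludes something
                }
                boolean sub = willMatchAll(c.query);
                if (c.occur == Occur.MUST && !sub) {
                    return false; // a required clause that is not match-all restricts the results
                }
                foundMatchAll |= sub;
            }
            return foundMatchAll;
        }
        if (query instanceof DisMax) {
            for (Q d : ((DisMax) query).disjuncts) {
                if (willMatchAll(d)) {
                    return true; // one match-all disjunct is enough
                }
            }
            return false;
        }
        return false; // anything else depends on what the index contains
    }

    public static void main(String[] args) {
        Q nested = new Bool().add(Occur.SHOULD, new Bool().add(Occur.SHOULD, new MatchAll()));
        Q restricted = new Bool().add(Occur.SHOULD, new MatchAll()).add(Occur.MUST, new Term("foo"));
        Q excluded = new Bool().add(Occur.MUST, new MatchAll()).add(Occur.MUST_NOT, new Term("bar"));
        System.out.println(willMatchAll(nested));     // true
        System.out.println(willMatchAll(restricted)); // false
        System.out.println(willMatchAll(excluded));   // false
    }
}
```

The first case is the nested SHOULD example from the question; the other two show a match-all clause being overridden by a MUST or MUST_NOT sibling. Like the method above, the sketch ignores any minimum-should-match setting on the boolean query, so treat a true result as approximate if that setting is in play.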
Upvotes: 1
Reputation: 1797
Good question. I am wondering whether you could run a match-all search, get its numFound, and then check whether your actual query returns the same numFound value. Am I missing something?
Upvotes: 0