Leo

Reputation: 1046

Parse search query with Lucene and build Hibernate criteria based on that

The requirement is to build simplified search functionality over a limited number of fields that are kept in a single, separate table. Using Solr or the like is not an option at the moment; everything has to work within one webapp. The database is MSSQL. What I am trying to do is use the Lucene query parser and build Hibernate criteria from its output. Despite my initial impression that it wouldn't be too hard, I can't figure out how to build criteria for a complex query.

Here is a quick test I created to parse the query string with Lucene (4.7.2):

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
QueryParser luceneParser = new QueryParser(Version.LUCENE_47, "", analyzer);
String queryString = "(name:\"Luke Skywalker\" AND father:unknown OR fname:Luke) or (name:yoda)";
Query luceneQuery = luceneParser.parse(queryString);

....

public class QueryInterpreter {
    public void parse(Query query) {
        if (query instanceof TermQuery) {
            termQuery((TermQuery) query);
        } else if (query instanceof BooleanQuery) {
            booleanQuery((BooleanQuery) query);
        } else if (query instanceof PhraseQuery) {
            phraseQuery((PhraseQuery) query);
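        } else {
            // Name the offending type so the failure is diagnosable.
            throw new IllegalArgumentException("Unsupported query type: " + query.getClass().getName());
        }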
    }
    public void booleanQuery(BooleanQuery query) {
        for (BooleanClause clause : query.getClauses()) {
            parse(clause.getQuery());
        }
    }
    public void phraseQuery(PhraseQuery query) {
        StringBuilder sb = new StringBuilder();
        for (Term term : query.getTerms()) {
            sb.append(term.text());
            sb.append(" ");
        }

    }
    public void termQuery(TermQuery query) {
        Term term = query.getTerm();
    }
}

The first thing Lucene does is convert the search string into (+name:"Luke Skywalker" +father:unknown fname:Luke) name:yoda. It then exposes the terms as a list of clauses, each with its own isRequired() flag. Hibernate works differently: you create a criteria object and keep adding Criterion objects built from name/value pairs. I cannot figure out how to convert one into the other. What I think I need is a general-purpose Junction object to attach Criterions to.

Upvotes: 0

Views: 1306

Answers (1)

Leo

Reputation: 1046

I finally figured it out, and I'm sharing my solution here in case someone else faces the same problem.

The first step in the right direction was realising that the classic QueryParser doesn't apply boolean operator precedence. It turns the operators into required/optional flags on individual clauses, so (+name:"Luke Skywalker" +father:unknown fname:Luke) name:yoda is a different search from (name:"Luke Skywalker" AND father:unknown OR fname:Luke) or (name:yoda). I don't know why QueryParser even accepts boolean operators; it's just plain confusing.
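To make the difference concrete, here is a minimal sketch in plain Java that evaluates both readings of the example query against one record. The class and method names are hypothetical, and the flag semantics are an assumption based on how Lucene treats a group that mixes required and optional clauses: the required clauses decide the match, the optional ones only influence scoring.

```java
final class PrecedenceDemo {
    private PrecedenceDemo() {}

    // Reading 1: usual boolean precedence, AND binds tighter than OR.
    // ((name AND father) OR fname) OR yoda
    static boolean withPrecedence(boolean name, boolean father, boolean fname, boolean yoda) {
        return ((name && father) || fname) || yoda;
    }

    // Reading 2: the flag form printed by the classic parser,
    // (+name +father fname) yoda.
    // Inside the group the two required clauses decide the match;
    // the optional fname clause does not make a record match on its own.
    static boolean withFlags(boolean name, boolean father, boolean fname, boolean yoda) {
        boolean group = name && father;
        return group || yoda;
    }
}
```

A record where only fname:Luke holds matches under the first reading and not under the second: withPrecedence(false, false, true, false) is true, while withFlags(false, false, true, false) is false.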

The solution is to use PrecedenceQueryParser.

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
PrecedenceQueryParser luceneParser = new PrecedenceQueryParser(analyzer);
luceneParser.setAllowLeadingWildcard(true);
Query luceneQuery = luceneParser.parse(searchQuery, "name");
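Once precedence is resolved, the parsed query is a tree in which each level carries clause flags (required, optional, prohibited). The following is a rough sketch of how those flags could map onto AND/OR/NOT while walking the tree. Occur, Node, Leaf, Clause and Group are hypothetical stand-in types, not Lucene or Hibernate classes, and the output is a display string rather than real SQL. It assumes every group has at least one required or optional clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the clause flags and query tree.
enum Occur { MUST, SHOULD, MUST_NOT }

abstract class Node {
    abstract String render();
}

final class Leaf extends Node {
    final String field;
    final String value;

    Leaf(String field, String value) {
        this.field = field;
        this.value = value;
    }

    @Override
    String render() {
        return field + "='" + value + "'";
    }
}

final class Clause {
    final Occur occur;
    final Node node;

    Clause(Occur occur, Node node) {
        this.occur = occur;
        this.node = node;
    }
}

final class Group extends Node {
    final List<Clause> clauses;

    Group(Clause... clauses) {
        this.clauses = Arrays.asList(clauses);
    }

    @Override
    String render() {
        List<String> required = new ArrayList<>();
        List<String> optional = new ArrayList<>();
        List<String> prohibited = new ArrayList<>();
        for (Clause c : clauses) {
            String s = c.node.render();
            switch (c.occur) {
                case MUST: required.add(s); break;
                case SHOULD: optional.add(s); break;
                default: prohibited.add("NOT " + s); break;
            }
        }
        List<String> parts = new ArrayList<>();
        if (!required.isEmpty()) {
            // With required clauses present, optional ones do not filter.
            parts.addAll(required);
        } else if (!optional.isEmpty()) {
            parts.add(optional.size() == 1
                    ? optional.get(0)
                    : "(" + String.join(" OR ", optional) + ")");
        }
        parts.addAll(prohibited);
        return parts.size() == 1
                ? parts.get(0)
                : "(" + String.join(" AND ", parts) + ")";
    }
}
```

Assuming precedence groups the example query as ((name AND father) OR fname) OR yoda, building that tree from two required clauses nested inside two levels of optional clauses renders as (((name='Luke Skywalker' AND father='unknown') OR fname='Luke') OR name='yoda'). In the real code each string would be a Criterion, with Restrictions.conjunction(), Restrictions.disjunction() and Restrictions.not() playing the roles of AND, OR and NOT.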

The next step is to create a Hibernate Criterion from the parsed query. A relational database obviously can't support the full range of Lucene search capabilities, but that was never a requirement.

public Criterion buildHibernateQuery(Query luceneQuery) {
    return parse(luceneQuery);
}

private Criterion parse(Query query) {
    if (query instanceof TermQuery) {
        return parse((TermQuery) query);
    } else if (query instanceof BooleanQuery) {
        return parse((BooleanQuery) query);
    } else if (query instanceof PhraseQuery) {
        return parse((PhraseQuery) query);
    } else if (query instanceof PrefixQuery) {
        return parse((PrefixQuery) query);
    } else if (query instanceof WildcardQuery) {
        return parse((WildcardQuery) query);
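    } else {
        // Every branch must return or throw; fail loudly on query types with no mapping.
        throw new IllegalArgumentException(String.format("%s unsupported", query.getClass()));
    }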
}

private Criterion parse(TermQuery query) {
    Term term = query.getTerm();
    return createNameValueRestriction(term.field(), term.text());
}

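private Criterion parse(BooleanQuery query) {
    BooleanClause[] clauses = query.getClauses();
    if (clauses.length == 1) {
        return parse(clauses[0].getQuery());
    }
    // The first clause decides the junction type, which assumes all clauses
    // at this level carry the same flag. Prohibited (NOT) clauses are not handled.
    Junction junction = createJunction(clauses[0]);

    for (BooleanClause clause : clauses) {
        junction.add(parse(clause.getQuery()));
    }
    return junction;
}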

private Junction createJunction(BooleanClause booleanClause) {
    if (booleanClause.isRequired()) {
        return Restrictions.conjunction();
    } else {
        return Restrictions.disjunction();
    }
}

private Criterion parse(PhraseQuery query) {
    String field = query.getTerms()[0].field();
    StringBuilder phraseBuilder = new StringBuilder();
    for (Term term : query.getTerms()) {
        phraseBuilder.append(term.text());
        phraseBuilder.append(" ");
    }

    return createNameValueRestriction(field, phraseBuilder.toString().trim());
}

private Criterion createNameValueRestriction(String field, String value) {
    return Restrictions.and(
            Restrictions.eq("jsonPath", field),
            Restrictions.eq("answer", value)
            );
}

private Criterion parse(PrefixQuery query) {
    Term term = query.getPrefix();
    return parseLikeQuery(term.field(), term.text(), MatchMode.START);
}

private Criterion parse(WildcardQuery query) {
    Term term = query.getTerm();
    // Approximation: drop every '*' and match the remaining text anywhere in the value.
    // The single-character wildcard '?' stays in the text as a literal.
    String termText = term.text().replace(String.valueOf(WildcardQuery.WILDCARD_STRING), "");
    return parseLikeQuery(term.field(), termText, MatchMode.ANYWHERE);
}


private Criterion parseLikeQuery(String field, String value, MatchMode matchMode) {
    return Restrictions.and(
            Restrictions.eq("jsonPath", field),
            Restrictions.like("answer", value, matchMode)
            );
}
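
The wildcard handling above strips the asterisks and matches anywhere, so a term like Lu*e ends up matching values containing "Lue". If closer wildcard semantics are needed, one option is to translate the Lucene pattern into a LIKE pattern character by character. The sketch below is plain Java with a hypothetical helper class LikePatterns; it assumes the term text may contain Lucene's backslash escape, and it escapes literal %, _ and [ with square brackets, which is how T-SQL's LIKE treats them on MSSQL.

```java
final class LikePatterns {
    private LikePatterns() {}

    // Translates a Lucene wildcard term into a T-SQL LIKE pattern:
    // '*' becomes '%', '?' becomes '_', everything else is kept literal.
    static String fromLuceneWildcard(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\\':
                    // Backslash escapes the next character; a trailing lone backslash is dropped.
                    if (i + 1 < text.length()) {
                        appendLiteral(sb, text.charAt(++i));
                    }
                    break;
                case '*':
                    sb.append('%');
                    break;
                case '?':
                    sb.append('_');
                    break;
                default:
                    appendLiteral(sb, c);
            }
        }
        return sb.toString();
    }

    // Wraps characters that are special to T-SQL LIKE in brackets so they match literally.
    private static void appendLiteral(StringBuilder sb, char c) {
        if (c == '%' || c == '_' || c == '[') {
            sb.append('[').append(c).append(']');
        } else {
            sb.append(c);
        }
    }
}
```

With this helper, Lu*e becomes Lu%e and L?ke becomes L_ke. The result could then be passed to the two-argument Restrictions.like("answer", pattern) instead of going through a MatchMode.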

Hope someone will find this useful.

Upvotes: 2
