spierepf

Reputation: 2934

`SELECT COUNT(*) FROM (SELECT DISTINCT ...)` in Hibernate or JPA

This feels like a trivial use case for Hibernate or JPA, but I've been struggling for a couple of days to get this to work.

I have a Position entity class that has latitude, longitude and updateTime fields (among others). I would like to count the number of distinct combinations of those three fields while ignoring the others. In SQL, this is trivial:

SELECT COUNT(*) FROM (SELECT DISTINCT LONGITUDE, LATITUDE, UPDATE_TIME FROM POSITION) AS TEMP;

It is important that I abstract my database implementation from the rest of my application, because different users may wish to use different database engines. (Heck, I use H2 for testing and MariaDB for local production...)

I have been trying to translate this SQL into Java code using either Hibernate or JPA syntax, but I cannot figure out how.

EDIT - Here is as close as I have been able to get using JPA (ref: https://en.wikibooks.org/wiki/Java_Persistence/Criteria)

public long getCountDistinctInFlightPositions() {
    Session session = sessionFactory.openSession();

    CriteriaBuilder criteriaBuilder = session.getCriteriaBuilder();

    CriteriaQuery<Tuple> innerQuery = criteriaBuilder.createTupleQuery();
    Root<Position> position = innerQuery.from(Position.class);
    innerQuery.multiselect(
        position.get("longitude"),
        position.get("latitude"),
        position.get("updateTime")
    );

    // The method countDistinct(Expression<?>) in the type CriteriaBuilder is not applicable for the arguments (CriteriaQuery<Tuple>)

    criteriaBuilder.countDistinct(innerQuery);

    return 1;
}

Upvotes: 0

Views: 2717

Answers (1)

Naros

Reputation: 21153

You can do it this way:

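// Assumes an injected EntityManager named entityManager and an Integer "id" primary key on Position.
CriteriaBuilder cb = entityManager.getCriteriaBuilder();

// Outer query: count the rows whose id survives the subquery filter.
CriteriaQuery<Long> countQuery = cb.createQuery( Long.class );
Root<Position> root = countQuery.from( Position.class );

countQuery.select( cb.count( root.get( "id" ) ) );

// Subquery: pick one representative id (the smallest) per distinct tuple.
Subquery<Integer> subQuery = countQuery.subquery( Integer.class );
Root<Position> subRoot = subQuery.from( Position.class );
subQuery.select( cb.min( subRoot.get( "id" ) ) );
subQuery.groupBy( subRoot.get( "longitude" ), 
  subRoot.get( "latitude" ), 
  subRoot.get( "updateTime" ) );

countQuery.where( root.get( "id" ).in( subQuery ) );

Long count = entityManager.createQuery( countQuery ).getSingleResult();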

This effectively generates the following SQL:

SELECT COUNT( p0.id ) FROM Position p0
 WHERE p0.id IN (
   SELECT MIN( p1.id )
     FROM Position p1
    GROUP BY p1.longitude, p1.latitude, p1.updateTime )

In a scenario where I have 3 rows and 2 of them have the same tuple of longitude, latitude, and update time, the query will return a result of 2.
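To see why counting the MIN(id) of each group gives the same number as counting distinct tuples, here is a small plain-Java sketch of that 3-row scenario. It does not touch JPA at all. The `Row` and `Key` records and the sample values are made up for illustration, and it assumes Java 16+ for records.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class DistinctCountSketch {

    // Hypothetical stand-in for the Position entity (only the fields the query uses).
    record Row(int id, double longitude, double latitude, long updateTime) {}

    // The grouping key: the three columns whose distinct combinations are counted.
    record Key(double longitude, double latitude, long updateTime) {}

    // Mirrors SELECT COUNT(*) FROM (SELECT DISTINCT longitude, latitude, updateTime ...).
    static long countDistinctTuples(List<Row> rows) {
        return rows.stream()
                .map(r -> new Key(r.longitude(), r.latitude(), r.updateTime()))
                .distinct()
                .count();
    }

    // Mirrors the criteria query: keep MIN(id) per group, then count rows whose id is in that set.
    static long countViaMinIdPerGroup(List<Row> rows) {
        Map<Key, Integer> minIdPerGroup = rows.stream()
                .collect(Collectors.toMap(
                        r -> new Key(r.longitude(), r.latitude(), r.updateTime()),
                        Row::id,
                        Integer::min));
        Set<Integer> survivors = new HashSet<>(minIdPerGroup.values());
        return rows.stream().filter(r -> survivors.contains(r.id())).count();
    }

    public static void main(String[] args) {
        // Rows 1 and 2 share the same tuple; row 3 differs in longitude.
        List<Row> rows = List.of(
                new Row(1, 10.0, 20.0, 1000L),
                new Row(2, 10.0, 20.0, 1000L),
                new Row(3, 11.0, 20.0, 1000L));
        System.out.println(countDistinctTuples(rows));   // prints 2
        System.out.println(countViaMinIdPerGroup(rows)); // prints 2
    }
}
```

Each group contributes exactly one id to the subquery result, and ids are unique, so the outer count equals the number of groups.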

Make sure you maintain a good index on [Longitude, Latitude, UpdateTime] here so that you can take advantage of faster GROUP BY execution. The PK is already b-tree indexed, so the other operations with respect to COUNT/MIN should already be covered by that index.
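If you let JPA generate the schema, one way to declare such an index is on the entity mapping itself. This is only a sketch: the table and column names are taken from the SQL in the question, while the index name, the field types and the `@Column` mapping are assumptions. It uses the `javax.persistence` package (newer versions use `jakarta.persistence` instead), and the annotation only has an effect when the provider creates or updates the schema.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Index;
import javax.persistence.Table;

@Entity
@Table(
    name = "POSITION",
    // Composite index covering the GROUP BY columns; the index name is hypothetical.
    indexes = @Index(
        name = "IDX_POSITION_LON_LAT_TIME",
        columnList = "LONGITUDE, LATITUDE, UPDATE_TIME"))
public class Position {

    @Id
    private Integer id;            // assumed Integer, matching Subquery<Integer>

    private double longitude;      // field types are assumptions
    private double latitude;

    @Column(name = "UPDATE_TIME")
    private long updateTime;
}
```

If the schema is managed by hand or by a migration tool, the equivalent is a plain CREATE INDEX on the same three columns.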

Upvotes: 1
