How to fetch multiple lists with EntityGraph

I've been too much time trying to do something that seems simple at first look:

Let's say I have a Store entity with two @OneToMany relationships:

@Entity
public class Store {

    private String name;

    @OneToMany(mappedBy = "store")
    private List<Foo> foos;

    @OneToMany(mappedBy = "store")
    private List<Bar> bars;

}

I want to fetch both lists with the same query. I want to use Entity Graphs, because I need to decide in execution time which children I want to fetch.

@EntityGraph(attributePaths = {"foos", "bars"})
Store findByName(String name);

This causes:

MultipleBagFetchException: cannot simultaneously fetch multiple bags.

I don't want to use Set instead of List, because the purpose of this approach is to avoid the N+1 query problem and the Cartesian product problem (the entity hierarchy is much bigger than the example), so no lazy fetching or changing List into Set do help in this case.

Also, the first answer proposed here does not work with Entity Graphs.

Any ideas?

Upvotes: 8

Views: 9241

Answers (1)

Eugene
Eugene

Reputation: 5985

The main goal of the Entity Graph is to improve the runtime performance when loading the entity's related associations and basic fields. It is solving N+1 problem. Hibernate loads all the graph in one select query and then avoids fetching association with more SELECT queries.

It needs to be understood that by the current implementation, Entity Graph generating for execution only one SELECT query, because there is no possibility to specify other Fetch Mode than defined by default FetchMode.JOIN for all attribute paths in the graph.

It is perfect for relations like ManyToOne, OneToOne, and for single OneToMany because all of them can be loaded by one query.

But if an entity has multiple OneToMany relations it is a problem to load full data by one query, because of Cartesian join. If we have 50 Store rows associated with 20 Foo and 10 Bar rows, the final result set will contain 10_000 records (e.g., 50 x 20 x 10). It is terrible for performance!

This is the main reason for MultipleBagFetchException. Hibernate protects you from Cartesian join.

So we can load two collections only thru separated select queries.

For now, Entity Graph does not support such loading.

There is a pending not resolved issue and discussion.

The same thing is also with Fetch Profiles. Only FetchMode.JOIN is currently supported.

See good explanation The best way to fix the Hibernate MultipleBagFetchException

Solution 1: two JPQL queries

Instead of executing a single query that fetches two associations, we can execute two JPQL queries:

public interface StoreRepository extends JpaRepository<Store, Long> {

    default List<Store> findStoresByNameWithCollections(String name) {
        List<Store> stores = findByName(name);
        if (stores != null && !stores.isEmpty()) {
            stores = loadStoreBars(stores);
        }
        return stores;
    }

    @Query(value = "select DISTINCT st from Store st LEFT JOIN FETCH st.foos where st.name like :name")
    @QueryHints(value = { @QueryHint(name = org.hibernate.jpa.QueryHints.HINT_PASS_DISTINCT_THROUGH, value = "false" )}, forCounting = false)
    List<Store> findByName(String name);

    @Query(value = "select DISTINCT st from Store st LEFT JOIN FETCH st.bars where st in (:stores)")
    @QueryHints(value = { @QueryHint(name = org.hibernate.jpa.QueryHints.HINT_PASS_DISTINCT_THROUGH, value = "false" )}, forCounting = false)
    List<Store> loadStoreBars(List<Store> stores);
}

hibernate.query.passDistinctThrough JPA query hint we used to instruct Hibernate to prevent passing the JPQL DISTINCT keyword to the underlying SQL query. Unique of parent entities will be performed on the server-side.

Generated queries:

--Parent entity with Foos collection
    select
        store0_.id as id1_36_0_,
        foos1_.id as id1_15_1_,
        store0_.name as name2_36_0_,
        foos1_.store_id as store_id2_15_1_,
        foos1_.store_id as store_id2_15_0__,
        foos1_.id as id1_15_0__ 
    from
        store store0_ 
    left outer join
        foo foos1_ 
            on store0_.id=foos1_.store_id 
    where
        store0_.name like ?

--Bars collection
    select
        store0_.id as id1_36_0_,
        bars1_.id as id1_5_1_,
        store0_.name as name2_36_0_,
        bars1_.store_id as store_id2_5_1_,
        bars1_.store_id as store_id2_5_0__,
        bars1_.id as id1_5_0__ 
    from
        store store0_ 
    left outer join
        bar bars1_ 
            on store0_.id=bars1_.store_id 
    where
        store0_.id in (
            ? , ?
        )

Solution 2: FetchMode.SUBSELECT

@Entity
public class Store {

    @Id
    private Long id;

    private String name;

    @OneToMany(mappedBy = "store", fetch = FetchType.EAGER)
    @Fetch(FetchMode.SUBSELECT)
    private List<Foo> foos;

    @OneToMany(mappedBy = "store", fetch = FetchType.EAGER)
    @Fetch(FetchMode.SUBSELECT)
    private List<Bar> bars;
}

Hibernate is going to avoid the N + 1 query issue by generating a single SQL statement to initialize all collections for all Store entities that were previously fetched. Instead of passing all entity identifiers, Hibernate simply reruns the previous query that fetched the Store entities.

Generated queries:

--Parent entities
    select
        store0_.id as id1_36_,
        store0_.name as name2_36_ 
    from
        store store0_ 
    where
        store0_.name like ?

--Foos collection
    select
        foos0_.store_id as store_id2_15_1_,
        foos0_.id as id1_15_1_,
        foos0_.id as id1_15_0_,
        foos0_.store_id as store_id2_15_0_ 
    from
        foo foos0_ 
    where
        foos0_.store_id in (
            select
                store0_.id 
            from
                store store0_ 
            where
                store0_.name like ?
        )

--Bars collection
    select
        bars0_.store_id as store_id2_5_1_,
        bars0_.id as id1_5_1_,
        bars0_.id as id1_5_0_,
        bars0_.store_id as store_id2_5_0_ 
    from
        bar bars0_ 
    where
        bars0_.store_id in (
            select
                store0_.id 
            from
                store store0_ 
            where
                store0_.name like ?
        )

Solution 3: FetchMode.SUBSELECT + EntityGraph.EntityGraphType.LOAD

When the javax.persistence.loadgraph property is used to specify an entity graph, attributes that are specified by attribute nodes of the entity graph are treated as FetchType.EAGER and attributes that are not specified are treated according to their specified or default FetchType.

Collections that are not specified in graph will use their own Fetch Mode.

public interface StoreRepository extends JpaRepository<Store, Long> {
    @EntityGraph(attributePaths = {"foos"}, type = EntityGraph.EntityGraphType.LOAD)
    List<Store> findByName(String name);
}

public class Store {

    @Id
    private Long id;

    private String name;

    @OneToMany(mappedBy = "store")
    private List<Foo> foos;

    @OneToMany(mappedBy = "store", fetch = FetchType.EAGER)
    @Fetch(FetchMode.SUBSELECT)
    private List<Bar> bars;
}

Generated queries:

    select
        store0_.id as id1_36_0_,
        foos1_.id as id1_15_1_,
        store0_.name as name2_36_0_,
        foos1_.store_id as store_id2_15_1_,
        foos1_.store_id as store_id2_15_0__,
        foos1_.id as id1_15_0__ 
    from
        store store0_ 
    left outer join
        foo foos1_ 
            on store0_.id=foos1_.store_id 
    where
        store0_.name=?

    select
        bars0_.store_id as store_id2_5_1_,
        bars0_.id as id1_5_1_,
        bars0_.id as id1_5_0_,
        bars0_.store_id as store_id2_5_0_ 
    from
        bar bars0_ 
    where
        bars0_.store_id in (
            select
                store0_.id 
            from
                store store0_ 
            left outer join
                foo foos1_ 
                    on store0_.id=foos1_.store_id 
            where
                store0_.name=?
        )

Solution 4, BatchSize

However, although @BatchSize is better than running into an N+1 query issue, most of the time, Solution 1 and Solution 2 is a much better alternative since it allows you to fetch all the required data with minimum queries count.

Batch size is tricky. It is a coefficient to determine the count of collections for loading in one query, it is not an exact collections count.

For example if you have 20 parent records and collection with batch size 50, hibernate will generate 1 query for loading parent entities and then 2 queries for initializing collections(First: 12 collections and Second:8 collections).

public class Store {

    @Id
    private Long id;

    private String name;

    @BatchSize(size = 50)
    @OneToMany(mappedBy = "store")
    private List<Foo> foos;

    @BatchSize(size = 50)
    @OneToMany(mappedBy = "store")
    private List<Bar> bars;
}

Generated queries:

--Parent entities
    select
        store0_.id as id1_36_,
        store0_.name as name2_36_ 
    from
        store store0_ 
    where
        store0_.name=?

--Bar collections
    select
        bars0_.store_id as store_id2_5_1_,
        bars0_.id as id1_5_1_,
        bars0_.id as id1_5_0_,
        bars0_.store_id as store_id2_5_0_ 
    from
        bar bars0_ 
    where
        bars0_.store_id in (
            ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?
        )

    select
        bars0_.store_id as store_id2_5_1_,
        bars0_.id as id1_5_1_,
        bars0_.id as id1_5_0_,
        bars0_.store_id as store_id2_5_0_ 
    from
        bar bars0_ 
    where
        bars0_.store_id in (
            ?, ?, ?, ?, ?, ?, ?, ?
        )

--Foo collections
    select
        foos0_.store_id as store_id2_15_1_,
        foos0_.id as id1_15_1_,
        foos0_.id as id1_15_0_,
        foos0_.store_id as store_id2_15_0_ 
    from
        foo foos0_ 
    where
        foos0_.store_id in (
            ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?
        )


    select
        foos0_.store_id as store_id2_15_1_,
        foos0_.id as id1_15_1_,
        foos0_.id as id1_15_0_,
        foos0_.store_id as store_id2_15_0_ 
    from
        foo foos0_ 
    where
        foos0_.store_id in (
            ?, ?, ?, ?, ?, ?, ?, ?
        )

Upvotes: 19

Related Questions