AlanObject

Reputation: 9973

How far can I scale a JPA collection (Map or List)?

I am implementing what amounts to an online file system backed by a SQL database via JPA. For the directory hierarchy I have created an entity that looks like this:

@Entity
@Table(name = "PATH")
public class Path extends BaseObject implements Serializable {

    @Column(nullable = false)
    private String name;

    @ManyToOne
    private BaseObject reference;

    @OneToMany(fetch = FetchType.LAZY, orphanRemoval = true)
    @MapKey(name = "name")
    private Map<String, Path> members;

    @ManyToOne
    private Path parent;

    //etc

This works amazingly well. A root node is a Path instance where parent is null. With this scheme I can do tree traversals up and down and lookups as painlessly as is possible.

The problem is how far this can be scaled. For example, one of my root Path instances will have one member for each user of the system. That's fine for a few dozen or even a few hundred users, but what happens if my site attracts tens or even hundreds of thousands of users?

If I had to I could give up the members column and find member nodes with SQL SELECT operations but I hate to do that. Is there any guideline to determine how big this Map structure has to get before it becomes impractical?

Upvotes: 0

Views: 27

Answers (1)

JB Nizet

Reputation: 692023

The members association is lazy, so nothing bad will happen when loading a Path until you start using the members field (i.e. call any method on it). At that point all of the thousands of children will be loaded into memory, and that won't scale well at all. A dedicated query that fetches every child won't be any better: the thousands of children will be loaded into memory as well.

The real fix is to avoid loading all the children of a path into memory in the first place. You don't need them to create a new Path with an existing parent. You might need them to display all the children of a path on the screen, but with thousands of them that isn't realistic, so you'd better use a paginated query to show them in slices of 20 or 50.
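As a rough sketch of what such a paginated lookup could look like for the Path entity in the question: the class name ChildSlices, the two JPQL constants and the firstResult helper are hypothetical names chosen for illustration, and the JPA calls in the trailing comment assume an EntityManager called em is available. The second query is one possible replacement for a lookup by key in the members Map.

```java
/**
 * Sketch of slice-based child lookup for the Path entity from the question.
 * ChildSlices and its members are hypothetical names, not part of JPA.
 */
final class ChildSlices {

    // Direct children of one Path, in a stable order so that consecutive
    // slices neither overlap nor skip rows.
    static final String CHILDREN_JPQL =
        "select p from Path p where p.parent = :parent order by p.name";

    // A single child by name, instead of members.get(name) on the Map.
    static final String CHILD_BY_NAME_JPQL =
        "select p from Path p where p.parent = :parent and p.name = :name";

    // Zero-based row offset of the first entry in a zero-based page.
    static int firstResult(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException(
                "pageIndex must be >= 0 and pageSize must be > 0");
        }
        return Math.multiplyExact(pageIndex, pageSize);
    }

    // With an EntityManager `em` in scope (an assumption), one slice would be
    // fetched roughly like this:
    //
    //   List<Path> slice = em.createQuery(CHILDREN_JPQL, Path.class)
    //       .setParameter("parent", parent)
    //       .setFirstResult(firstResult(pageIndex, pageSize))
    //       .setMaxResults(pageSize)
    //       .getResultList();
}
```

Only pageSize rows are materialized per call, so memory use stays bounded no matter how many members the parent has.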

So I would indeed ditch this OneToMany association, which is too dangerous, and use ad-hoc queries to fetch the children you need, in slices. What I would be more concerned about is the link to the parent. Since you've defined it as eagerly loaded (that's the default for toOne associations), every time you load a path, JPA will load its parent, its grandparent, its great-grandparent, and so on up to the root. That can be problematic if your tree is deep. You'd better define the association as lazy (I do that for basically all associations).

Finally, note that your mapping is incorrect. You have a bidirectional association here, and the OneToMany members is thus the inverse side of the ManyToOne parent, so it should be annotated with

@OneToMany(mappedBy = "parent", orphanRemoval = true)
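Putting the mapping advice together, the entity might end up looking roughly like the sketch below. It assumes the jakarta.persistence package (older JPA versions use javax.persistence instead) and relies on the BaseObject superclass from the question, which is not shown here. The members map is kept only to illustrate mappedBy; if the children are fetched with queries, it can be removed entirely.

```java
import jakarta.persistence.*; // assumption: javax.persistence.* on older JPA versions
import java.io.Serializable;
import java.util.Map;

@Entity
@Table(name = "PATH")
public class Path extends BaseObject implements Serializable {

    @Column(nullable = false)
    private String name;

    // Owning side of the association. Lazy, so loading a Path does not
    // pull in the whole chain of ancestors up to the root.
    @ManyToOne(fetch = FetchType.LAZY)
    private Path parent;

    // Inverse side, mapped by the parent field above. Optional: drop it
    // if children are always fetched with dedicated queries.
    @OneToMany(mappedBy = "parent", fetch = FetchType.LAZY, orphanRemoval = true)
    @MapKey(name = "name")
    private Map<String, Path> members;
}
```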

Upvotes: 1
