Reputation: 4233

Any reason to initialize Entity properties with synchronized Collections?

For JPA-Entities in a project I work on, properties of type List or Map are always initialized to the synchronized implementations Vector and Hashtable.
(Unsynchronized ArrayList and HashMap are the standard implementations in Java, except if synchronization is really needed.)

Does anyone know a reason why synchronized Collections would be needed? We use EclipseLink.

When I asked about it, nobody knew why it was done like that. It seems it was always done like this. Maybe this was needed for an old version of EclipseLink?

I'm asking for two reasons:

1. I would prefer to use the standard implementations ArrayList and HashMap like anywhere else. If that's safe.
2. There's no matching synchronized Set implementation in the JDK. At least not a serializable one as EclipseLink expects.

Example Entity:

@Entity
public class Person {
    ...

    @ManyToMany(cascade=CascadeType.ALL)
    @JoinTable( ... )
    private List<Role> accessRoles;


    @ElementCollection
    @CollectionTable( ... )
    @MapKeyColumn(name="KEY")
    @Column(name="VALUE")
    private Map<String, String> attrs;

    public Person() {
        // Why Vector/Hashtable instead of ArrayList/HashMap?
        accessRoles = new Vector<Role>();
        attrs = new Hashtable<String, String>();
    }

    public List<Role> getAccessRoles() {
        return accessRoles;
    }

    public void setAccessRoles(List<Role> accessRoles) {
        this.accessRoles = accessRoles;
    }

    public Map<String, String> getAttrs() {
        return attrs;
    }

    public void setAttrs(Map<String, String> attrs) {
        this.attrs = attrs;
    }
}

Upvotes: 1

Answers (5)

V G

Reputation: 19002

@flup already answered with some interesting references, so I can only add a couple of guesses:

The team that wrote the code (and/or the specification) was simply unaware of the Collections API.
The team wanted to use the code in a highly concurrent environment (either in your web application, for example by passing entities to other threads, or in a desktop application, as JPA is not limited to web applications). Also note that IndirectSet is not thread-safe, so if the team wanted to write thread-safe code, they would have had to take additional measures wherever they use Sets!

Upvotes: 0

Shailendra

Reputation: 9102

Going through the EclipseLink code base, it looks like the usage of Vector is inherited from an older code base and is, much like the Vector class itself, legacy. The intent apparently was to use Vector so that multiple threads can act safely on relationships that are loaded lazily ("indirection" in EclipseLink parlance; the types of indirection include ValueHolder indirection, transparent indirection, proxy indirection, etc.). However, entities and their relationships are typically not shared among multiple threads. Each thread gets its own copy of an entity and its relationships if it accesses them in its own unit of work.

In the case of ValueHolder indirection, one of the implementations of ValueHolderInterface is ValueHolder, which is typically initialized with a Vector. The relevant part of the code is below, with the code comments as is. The comments are interesting as well.

IndirectList.java
..........................
.........................

    /**
     * INTERNAL:
     * Return the valueHolder.
     * This method used to be synchronized, which caused deadlock.
     */
    public ValueHolderInterface getValueHolder() {
        // PERF: lazy initialize value holder and vector as are normally set after creation.
        if (valueHolder == null) {
            synchronized(this) {
                if (valueHolder == null) {
                    valueHolder = new ValueHolder(new Vector(this.initialCapacity, this.capacityIncrement));
                }
            }
        }
        return valueHolder;
    }

...................
..................
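
The lazy initialization in the excerpt has the shape of the double-checked locking idiom. As a rough standalone sketch of that idiom (not EclipseLink code): the class `LazyHolder`, its field, and the use of `ArrayList` are made up for illustration, and the `volatile` modifier is an assumption about how the idiom is usually made safe in Java, since the excerpt does not show how the real field is declared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only; not part of EclipseLink.
final class LazyHolder {
    // volatile (an assumption here) ensures other threads never see a
    // partially published list once the reference becomes non-null.
    private volatile List<String> delegate;

    List<String> getDelegate() {
        List<String> result = delegate;      // first check, without locking
        if (result == null) {
            synchronized (this) {
                result = delegate;           // second check, under the lock
                if (result == null) {
                    result = new ArrayList<>();
                    delegate = result;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LazyHolder holder = new LazyHolder();
        // Both calls return the same lazily created instance.
        System.out.println(holder.getDelegate() == holder.getDelegate());
    }
}
```

Only the first call pays for the lock; later calls take the unsynchronized fast path, which is presumably what the "PERF" comment in the excerpt is about.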

Also, a few issues were reported due to the usage of Vector, as mentioned here and here.

Upvotes: 1

Mirko Klemm

Reputation: 2068

Much of the internals of EclipseLink date back to a time when Vector and Hashtable were the standard collection types in Java. EclipseLink was TopLink back then, which originated from a persistence framework for Smalltalk, so much of EclipseLink's code is actually older than Java itself, so to speak. I worked with TopLink for many years, and its standard mappings for collection properties always used Vector and Hashtable. To me, the only reasonable explanation for Vector and Hashtable still appearing in EclipseLink is that it has been working like this for a long time and, because it is working, no one has gotten around to changing it.

For myself, I wouldn't ever use Vector or Hashtable again. If I need a synchronized collection, I'd rather use the wrapper methods Collections.synchronizedList, Collections.synchronizedMap, etc. Just my 2 ct.
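
A minimal sketch of what using those wrappers could look like. The class and method names (`SynchronizedWrappers`, `newSynchronizedList`, `totalLength`) are made up for illustration; only the `java.util.Collections` calls are real JDK API. Note that the wrappers synchronize individual calls only, so iteration still needs a manual lock on the wrapper, as the JDK documentation for `Collections.synchronizedList` requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class SynchronizedWrappers {

    // Wraps a plain ArrayList so that each individual call is synchronized.
    static List<String> newSynchronizedList() {
        return Collections.synchronizedList(new ArrayList<>());
    }

    // Same idea for maps, instead of using Hashtable.
    static Map<String, String> newSynchronizedMap() {
        return Collections.synchronizedMap(new HashMap<>());
    }

    // Iteration spans several calls, so the caller must hold the
    // wrapper's lock for the whole loop.
    static int totalLength(List<String> list) {
        int total = 0;
        synchronized (list) {
            for (String s : list) {
                total += s.length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> roles = newSynchronizedList();
        roles.add("admin");
        roles.add("user");

        Map<String, String> attrs = newSynchronizedMap();
        attrs.put("lang", "en");

        System.out.println(totalLength(roles) + " " + attrs.get("lang"));
    }
}
```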

Upvotes: 1

flup

Reputation: 27104

There's usually no need for a Vector and an ArrayList is more commonly used. So if your current codebase is full of Vectors, this is a bit of a code smell and it is wise to make sure your team members know what the difference is. See also What are the differences between ArrayList and Vector? and Why is Java Vector class considered obsolete or deprecated?

That does not mean you should do the Big Cleanup and replace all the Vectors in your existing code with ArrayLists.

Your code uses Lists and you won't notice a single difference when programming.
The only advantage to be expected is increased performance.
It is hard to tell whether any of your code depends on the synchronization provided by the Vectors.

So, unless you are currently suffering performance issues, or are explicitly (re)designing the synchronization of your entire codebase, you risk introducing hard-to-fix concurrency bugs without any benefit.

Also, be aware that performance suffers most significantly from the use of Vectors when multiple threads access your collections concurrently. So if you are suffering from performance loss and decide to replace the Vectors for that reason, you'll need to be very careful to keep access sufficiently synchronized.
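
To make "keep access sufficiently synchronized" a bit more concrete, here is a rough sketch of one way to do it, assuming a collection really is shared between threads. The class `GuardedRoles` and its methods are hypothetical and not taken from the question's code base; the idea is simply that once the Vector becomes an ArrayList, every access has to go through an explicit lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, for illustration only.
final class GuardedRoles {
    // Was a Vector in this imagined code base; now a plain ArrayList
    // that is only ever touched while holding its own monitor.
    private final List<String> roles = new ArrayList<>();

    void add(String role) {
        synchronized (roles) {
            roles.add(role);
        }
    }

    int size() {
        synchronized (roles) {
            return roles.size();
        }
    }

    // Starts several threads that all add to the same list and returns
    // the final size once every thread has finished.
    static int addConcurrently(int threads, int perThread) {
        GuardedRoles guarded = new GuardedRoles();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    guarded.add("role");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return guarded.size();
    }

    public static void main(String[] args) {
        System.out.println(addConcurrently(4, 1000));
    }
}
```

Removing the `synchronized` blocks from this sketch would make the result unpredictable, which is the kind of bug that is easy to introduce when Vectors are replaced mechanically.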

EDIT: You ask about EclipseLink JPA specifically.

It'd be rather surprising if they demanded you use Vectors and Hashtables since that would mean they ask you to rely on obsolete data structures. In their examples, they use ArrayLists and HashMaps so from that we may conclude that this is indeed not the case.

Diving a bit more specifically into the source code, we can see that their CollectionContainerPolicy uses the Collection interface and does not care about the implementation of your collections. It does, however, surprisingly have special cases for when your internal collection class is Vector. See for instance buildContainerFromVector. And its default container class is Vector, though you can alter that.

The most intrusive moment where EclipseLink and your Lists meet is when you're lazy loading collections. EclipseLink will replace the collection with its own IndirectList which internally uses a Vector. See What collections does jpa return? So in those cases, EclipseLink will give you a Vector anyways(!) and it does not even matter what collection you specify in the collection's initialization.

So EclipseLink indeed has a preference for using Vectors and using Vectors with EclipseLink means less copying of object references from one collection to the other.
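
The point that the initial collection class may not matter can be illustrated without EclipseLink at all. The following is only a simulation: `Person` here is a stripped-down stand-in, and `simulateProviderSwap` is a made-up method that mimics a persistence provider installing its own collection through the setter. It is not how EclipseLink is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical simulation, for illustration only; no JPA involved.
final class DeclaredVersusRuntimeType {

    // Stand-in entity: the field is declared as List, so any List
    // implementation can be assigned to it later.
    static final class Person {
        private List<String> accessRoles = new ArrayList<>();

        List<String> getAccessRoles() {
            return accessRoles;
        }

        void setAccessRoles(List<String> accessRoles) {
            this.accessRoles = accessRoles;
        }
    }

    // Made-up stand-in for a provider replacing the collection with
    // one of its own choosing (a Vector in this simulation).
    static void simulateProviderSwap(Person person) {
        person.setAccessRoles(new Vector<>(person.getAccessRoles()));
    }

    static String runtimeTypeAfterSwap() {
        Person person = new Person();
        person.getAccessRoles().add("admin");
        simulateProviderSwap(person);
        return person.getAccessRoles().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // The constructor chose ArrayList, but that is no longer the runtime type.
        System.out.println(runtimeTypeAfterSwap());
    }
}
```

Code that only relies on the List interface keeps working either way, which is why the choice made in the entity's constructor is less important than it looks.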

Upvotes: 5

Aviad

Reputation: 1549

You don't need synchronized collections for JPA itself. Whether you need them depends only on your business logic, which I suppose doesn't need them, because otherwise you would know.

So basically I'd suggest using the unsynchronized collections, which will also improve performance.

Upvotes: 0
