St.Antario

Reputation: 27385

Do we need to synchronize access to an array when the array variable is volatile?

I have a class containing a volatile reference to an array:

private volatile Object[] objects = new Object[100];

Now, I can guarantee that only one thread (call it the writer) writes to the array. For example:

objects[10] = new Object();

All other threads will only read values written by the writer thread.

Question: Do I need to synchronize such reads and writes in order to ensure memory consistency?

I presume that yes, I should, because it would not be useful from a performance standpoint for the JVM to provide memory consistency guarantees on every write to an array. But I'm not sure about that; I didn't find anything helpful in the documentation.

Upvotes: 18

Views: 2553

Answers (4)

Alex Salauyou

Reputation: 14338

You may use AtomicReferenceArray:

final AtomicReferenceArray<Object> objects = new AtomicReferenceArray<>(100);

// writer
objects.set(10, new Object());

// reader
Object obj = objects.get(10);

This ensures atomic updates and happens-before consistency of read/write operations, the same as if each element of the array were volatile.
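As a rough sketch of what that guarantee buys you, the reader below spins until it sees the element the writer stored, and it can then rely on the plain field of that object as well. The Payload class, the AtomicArrayDemo class and the readWhenPublished method are made up for this illustration; only AtomicReferenceArray, get and set come from the JDK.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class AtomicArrayDemo {
    // Hypothetical payload with a plain (non-final, non-volatile) field.
    static final class Payload {
        int value;
        Payload(int value) { this.value = value; }
    }

    static final AtomicReferenceArray<Payload> objects = new AtomicReferenceArray<>(100);

    // Starts a writer thread, waits until slot 10 is visible, returns its field.
    static int readWhenPublished() {
        Thread writer = new Thread(() -> objects.set(10, new Payload(42)));
        writer.start();

        Payload seen;
        // get() has volatile-read semantics, so once it returns non-null,
        // everything the writer did before set() is visible here too.
        while ((seen = objects.get(10)) == null) {
            Thread.yield();
        }
        return seen.value;
    }

    public static void main(String[] args) {
        System.out.println(readWhenPublished());
    }
}
```

The spin loop is only there to keep the example short; a real reader would normally have something better to do than busy-wait.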

Upvotes: 14

Boann

Reputation: 50041

Per JLS § 17.4.5 – Happens-before Order:

Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.

[...]

A write to a volatile field happens-before every subsequent read of that field.

The happens-before relation is quite strong. It means that if thread A writes to a volatile variable, and any thread B later reads the variable, then thread B is guaranteed to see the change to the volatile variable itself, as well as every other change thread A made before setting the volatile variable, including to any other objects whether or not they were otherwise volatile.

However, this is not enough!

The element assignment objects[10] = new Object(); is not a write of the variable objects. It's only a read of the variable to determine the array it points to, followed by a write to a different variable that is contained within the array object located somewhere else in memory. No happens-before relation is established by mere reads of volatile variables, so that code is not safe.

As @DimitarDimitrov points out, you can kludge around this by doing a dummy write to the objects variable. Each pair of operations – the objects = objects; reassignment by the writer thread coupled with a foo = objects[x]; lookup by a reader thread – defines an updated happens-before relation, and thus will "publish" all of the latest changes made by the writer thread to the reader thread. That can work, but it requires discipline, and it's not elegant.

But there is a more subtle problem with that: even if the reader thread sees the updated value of the array element, that still doesn't guarantee that it sees the fields of the object referred to by that element correctly, because the following order is possible:

  1. Writer creates some object foo.
  2. Writer sets objects[x] = foo;
  3. Reader checks objects[x] and sees the reference to the new object foo (which it can do, although it is not guaranteed to do so since there is no happens-before relationship yet).
  4. Writer does objects = objects;

Unfortunately, this doesn't define the formal happens-before relationship, because the volatile variable read (3) came before the volatile variable write (4). Although the reader can see that objects[x] is the object foo by chance, this doesn't mean that the fields of foo are safely published, so the reader may theoretically see the new object, but with the wrong values! To solve that, the objects you're sharing between threads using this technique would need to have all fields final or volatile or otherwise synchronized. If the objects are all Strings for example, you'll be fine, but otherwise, it is too easy to make mistakes with this. (Thank you @Holger for pointing this out.)
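To make the kludge and its restriction concrete, here is a minimal sketch, assuming a single writer thread. The names DummyWritePublication, Point, put, get and sumAfterPublish are invented for the example. The shared object has only final fields, which is what makes it tolerable for a reader to pick up the reference before the dummy write has happened.

```java
class DummyWritePublication {
    // All fields final: readers see them correctly even if they obtain
    // the reference through a data race (steps 1-4 above).
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    private volatile Object[] objects = new Object[100];

    // Writer side (single writer assumed): plain element store,
    // then the dummy volatile write, i.e. objects = objects.
    void put(int index, Object value) {
        Object[] array = objects;
        array[index] = value;
        objects = array;
    }

    // Reader side: volatile read of the reference, then a plain element read.
    Object get(int index) {
        return objects[index];
    }

    static int sumAfterPublish() {
        DummyWritePublication shared = new DummyWritePublication();
        Thread writer = new Thread(() -> shared.put(10, new Point(3, 4)));
        writer.start();

        Object seen;
        while ((seen = shared.get(10)) == null) {
            Thread.yield();
        }
        Point p = (Point) seen;
        return p.x + p.y;
    }

    public static void main(String[] args) {
        System.out.println(sumAfterPublish());
    }
}
```

If Point had ordinary mutable fields, the same code would compile and usually appear to work, which is exactly why this approach is easy to get wrong.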


Here are some less flaky alternatives:

  • The concurrent array classes like AtomicReferenceArray exist to provide arrays in which every element behaves as if volatile. This is much easier to use correctly, because it ensures that if a reader sees the updated array element value, it also correctly sees the object referred to by that element.

  • You can wrap all accesses to the array in synchronized blocks, synchronizing on some shared object:

    // writer
    synchronized (aSharedObject) {
        objects[x] = foo;
    }
    
    // reader
    synchronized (aSharedObject) {
        bar = objects[x];
    }
    

    Like volatile, using synchronized creates a happens-before relationship. (Everything a thread does before releasing the synchronization lock of an object happens-before any other thread acquires the synchronization lock of the same object.) If you do this, your array does not need to be volatile.

  • Consider if an array is really what you need here. You haven't said what these writer and reader threads are for, but if you want some kind of producer-consumer queue, then the class you really need is a BlockingQueue or an Executor. You should look around the Java concurrency classes to see if one of them already does what you need, because if one does, it will certainly be easier to use correctly than volatile.
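If the array is really being used to hand items from one thread to another, the last option might look roughly like the sketch below. The QueueHandoff class and sumHandedOff method are hypothetical, and the queue capacity of 2 is an arbitrary choice; BlockingQueue, ArrayBlockingQueue, put and take are the real JDK API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueHandoff {
    // The producer hands the numbers 1..count to the consumer (the calling thread).
    static int sumHandedOff(int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    queue.put(i);          // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                sum += queue.take();       // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumHandedOff(5));
    }
}
```

The queue takes care of both the blocking and the memory visibility, so neither thread needs volatile or synchronized of its own.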

Upvotes: 8

Cootri

Reputation: 3836

private volatile Object[] objects = new Object[100];

This makes only the objects reference volatile, not the contents of the array it refers to.

Question: Do I need to synchronize such reads and writes in order to ensure memory consistency?

Yes.

it would not be useful from performance standpoint if JVM provides some kind of memory consistency guarantees when writing to an array

Consider using collections like CopyOnWriteArrayList (or your own array wrapper that takes a Lock inside its mutator and read methods).
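A small sketch of how CopyOnWriteArrayList behaves, in case it helps: every mutation copies the backing array, so an iterator obtained earlier keeps walking the old snapshot while get sees the new value. The CopyOnWriteDemo class and snapshotThenLatest method are names made up for this example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {
    static String snapshotThenLatest() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));

        // The iterator captures the backing array as it is right now.
        Iterator<String> snapshot = list.iterator();

        // set() installs a fresh copy of the array; the snapshot is untouched.
        list.set(1, "B");

        StringBuilder old = new StringBuilder();
        while (snapshot.hasNext()) {
            old.append(snapshot.next());
        }
        return old + "|" + list.get(1);
    }

    public static void main(String[] args) {
        System.out.println(snapshotThenLatest()); // prints "abc|B"
    }
}
```

The copying makes writes expensive, so this tends to suit arrays that are read far more often than they are written.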

The Java platform also has Vector (obsolete, with a flawed design) and synchronized List wrappers (slow in many scenarios), but I do not recommend using them.

PS: One more good idea from @SashaSalauyou

Upvotes: 13

Dimitar Dimitrov

Reputation: 16357

Yes, you need to synchronize accesses to the elements of a volatile array.

Other folks have already addressed how you can probably use CopyOnWriteArrayList or AtomicReferenceArray instead, so I'm going to veer off into a slightly different direction. I'd also recommend reading Volatile Arrays in Java by one of the big JMM contributors, Jeremy Manson.

Now, I can guarantee that only one thread (call it writer) can write to the array, e.g. as follows:

Whether you can give single writer guarantees or not is not in any way related to the volatile keyword. I think you didn't have that in mind, but I'm just clarifying, so that other readers don't get the wrong impression (I think there's a data race pun that can be made with that sentence).

All other threads will only read values written by the writer thread.

Yes, but as your intuition correctly led you to suspect, this holds only for the value of the reference to the array. This means that unless you are writing array references to the volatile variable, you won't get the write part of the volatile write-read contract.

What this means is that either you want to do something like

objects[i] = newObj;
objects = objects;

which is ugly and awful in many different ways. Or you want to publish a brand new array each time your writer makes an update, e.g.

Object[] newObjects = new Object[100];

// populate values in newObjects, make sure that newObjects IS NOT published yet

// publish newObjects through the volatile variable
objects = newObjects;

which is not a very common use-case.
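For what it's worth, the second variant can be wrapped up in a small holder class. This is only a sketch under the single-writer assumption from the question; VolatileArrayHolder and its set and get methods are hypothetical names.

```java
import java.util.Arrays;

class VolatileArrayHolder {
    private volatile Object[] objects = new Object[100];

    // Single writer assumed: copy the current array, change the private copy,
    // then publish it with one volatile write. Published arrays are never mutated.
    void set(int index, Object value) {
        Object[] current = objects;
        Object[] copy = Arrays.copyOf(current, current.length);
        copy[index] = value;
        objects = copy;
    }

    // Volatile read of the reference, then a plain read from an array
    // that is no longer modified.
    Object get(int index) {
        return objects[index];
    }

    public static void main(String[] args) {
        VolatileArrayHolder holder = new VolatileArrayHolder();
        holder.set(10, "x");
        System.out.println(holder.get(10));
    }
}
```

With more than one writer, two concurrent set calls could each copy the same old array and one update would be lost, so that case would need a lock or a compare-and-set loop around the copy.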

Notice that unlike setting array elements, which does not provide volatile-write semantics, getting array elements (with newObj = objects[i];) does provide volatile-read semantics, because reading the element first performs a volatile read of the objects reference :)

Because it would not be useful from performance standpoint if JVM provides some kind of memory consistency guarantees when writing to an array. But I'm not sure about that.

As you allude, ensuring the memory fencing required for volatile semantics on every element access would be very costly, and if you add false sharing to the mix, it becomes even worse.

Didn't find anything helpful in documentation.

You can safely assume then that the volatile semantics for array references are exactly the same as the volatile semantics for non-array references, which is not surprising at all, considering how arrays (even primitive ones) are still objects.

Upvotes: 7
