Jay Castell
Jay Castell

Reputation: 68

ArrayIndexOutOfBounds on ArrayList#contains (multithreaded)

I recently came across an unexpected error (in code that isn't mine) that causes an ArrayIndexOutOfBoundsException on ArrayList#contains. The relevant code here is below.

private static final List<String> list = new ArrayList<>();

static void register() {
    update();
    new Timer().schedule(new TimerTask() {
        @Override
        public void run() {
            update();
        }
    }, 0, 21600000);
}

private static void update() {
    list.clear();
    new Thread(() -> {
        List<String> other; //should always be the same length.
        list.addAll(other);
    }).start();
}

public static boolean contains(String string) { //called long after register
    return list.contains(string); //throws ArrayIndexOutOfBounds
}

I'm well aware that ArrayList is not thread safe and that it can be fixed with something like Collections#synchronizedList. My question is to understand how this particular code is throwing an ArrayIndexOutOfBoundsException.

The stacktrace for the exception identifies the following code in ArrayList#indexOf.

for (int i = 0; i < size; i++)
    if (o.equals(elementData[i])) //here
        return i;

It seems to me that this can only happen if size is greater than elementData.length. To my understanding through, ArrayList#clear doesn't actually decrease the length of the backing array. Calls to addAll should only be increasing its capacity, and size is always updated after the array has been expanded. I don't see how this could ever be in a state where size is greater than the capacity of the array.

A particular detail that I noticed is that the delay on the Timer is 0, which means update() is called twice in quick succession. My best guess is that the addAll calls are somehow overlapping, which is leaving the list is some type of mixed invalid state.

If someone could explain what's happening here that would be great!

Upvotes: 1

Views: 421

Answers (1)

Thomas Kl&#228;ger
Thomas Kl&#228;ger

Reputation: 21510

Without proper synchronization there are no guarantees about what changes from one thread are visible in another thread.

No guarantees means it is perfectly valid that if thread A changes elementData and size, thread B sees:

  • none of the changes
  • an updated size, but the old elementData
  • an updated elementData, but the old size
  • both updated values, but no updates on the contents of elementData

As to why this happens (see Wikipedia:Java Memory Model for a short introduction):

In modern computers the main memory is relatively slow, so there are multiple levels of caches between the main memory and the CPU core that executes the current thread.

Thread A may keep updates to size and elementData in CPU registers or one of the several caches and update the main memory sometime later (and even at different times for those fields) - as long as the correctness of what thread A does is not affected. Since no synchronization happens, multithreaded correctness is of no concern - thread A works as if it is the only thread accessing these fields.

Similarly thread B may keep size and elementData in CPU registers or in the caches, or it may update one or both from main memory as required (may be the cache line where size was kept has been flushed, requiring a reload of size from the main memory). Since no synchronization happens, thread B assumes that no other thread changes these fields, therefore cached values are always the same as those in main memory.

Upvotes: 3

Related Questions