Nicky

Reputation: 65

Is replacing one java object reference with another considered thread safe, or are there potential synchronicity issues I'm overlooking?

I'm rolling my own simple caching solution to improve the UX of a lengthy lookup. The basic outline is that I have a class handling all accountId lookups, and another class called AccountListCache with two variables: a timestamp indicating when it was created, and a List object containing a whole bunch of accountId objects. When the lookup is invoked, that class will first check if its cache object has a timestamp from within the past X minutes. If so, it serves the cached results, and if not it refreshes the cache and serves the new list.

My plan is to have a scheduled task that regularly refreshes the cache by creating a new instance of the AccountListCache, populating its list/timestamp, and then re-assigning the variable in the lookup class. My boss brought up concerns about thread safety, but I'm thinking this should be thread safe already.

The old list and the new list are separate objects contained within separate instances of the same class, and the actual update call that matters would just be changing where the reference is pointing in memory. That should be an effectively instantaneous/atomic operation, right?

If it's not thread safe, what does a failure/conflict look like, and how might I resolve it?

Upvotes: 3

Views: 954

Answers (2)

Basil Bourque

Reputation: 338496

The Answer by erickson is correct.

As an alternative to volatile, I prefer using AtomicReference for two reasons:

  • Many programmers do not fully understand volatile, both in general and specifically because its meaning changed when the Java Memory Model was revised.
  • The AtomicReference really screams out to the human readers that we have a concurrency issue to manage.

An example of the atomic reference. Notice the final, which guarantees that we have one and only one AtomicReference instance.

final AtomicReference<AccountListCache> cacheRef = new AtomicReference<>(…);

I would use a record to hold your current cache value: a timestamp paired with a list. A record’s member field references are immutable (it has accessors but no setters).

record AccountListCache ( Instant whenFetched , List<Account> accounts ) {}

Use List.copyOf to get an unmodifiable list.

You could add a compact constructor to the record for data validation. You could check that the passed instant is not in the future, verify the list is not empty (if an empty list is intolerable), and run the passed list through List.copyOf to ensure it is unmodifiable (copyOf is a virtual no-op if the list is already unmodifiable).
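A minimal sketch of that validating record, assuming a placeholder Account type (the record and field names here are illustrative, not the asker’s real classes):

```java
import java.time.Instant;
import java.util.List;

// Account is a stand-in for your real account type.
record Account(String accountId) {}

record AccountListCache(Instant whenFetched, List<Account> accounts) {
    // Compact constructor: validate inputs, then defensively copy the list.
    AccountListCache {
        if (whenFetched.isAfter(Instant.now())) {
            throw new IllegalArgumentException("whenFetched is in the future: " + whenFetched);
        }
        if (accounts.isEmpty()) {
            throw new IllegalArgumentException("accounts must not be empty");
        }
        accounts = List.copyOf(accounts); // virtual no-op if already unmodifiable
    }
}
```

Because the compact constructor reassigns the accounts parameter through List.copyOf, even a caller who passes a mutable ArrayList ends up with an unmodifiable list stored in the record.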

Use a ScheduledExecutorService to repeatedly replace the record reference in the AtomicReference with a reference to a new fresh instance of the AccountListCache record class.
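A sketch of that scheduled refresh. The record definitions and the fetchAccounts method are hypothetical stand-ins for the asker’s lengthy lookup; the five-minute cadence is an example value:

```java
import java.time.Instant;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative types; substitute your real account type and cache record.
record Account(String accountId) {}
record AccountListCache(Instant whenFetched, List<Account> accounts) {}

class CacheRefresher {
    // Stand-in for the lengthy lookup being cached.
    static List<Account> fetchAccounts() {
        return List.of(new Account("a-1"));
    }

    final AtomicReference<AccountListCache> cacheRef =
            new AtomicReference<>(new AccountListCache(Instant.now(), fetchAccounts()));

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void start() {
        // Build a complete fresh cache, then publish it with a single atomic
        // reference swap. Readers see either the old cache or the new one,
        // never a partially populated instance.
        scheduler.scheduleAtFixedRate(
                () -> cacheRef.set(new AccountListCache(Instant.now(), fetchAccounts())),
                5, 5, TimeUnit.MINUTES);
    }

    void stop() {
        scheduler.shutdown();
    }
}
```

Readers simply call cacheRef.get() and use the returned record; they never observe an in-progress refresh.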

Any code using the cache can verify how fresh or stale it may be.

Duration ageOfCache =
    Duration.between( cacheRef.get().whenFetched() , Instant.now() );

Or make that a method on the record.

record AccountListCache ( Instant whenFetched , List<Account> accounts )
{
    Duration age () { return Duration.between ( this.whenFetched , Instant.now() ) ; }
}

Usage:

Duration ageOfCache = cacheRef.get().age();

Upvotes: 3

erickson

Reputation: 269667

Yes, since one thread is writing data to be read by other threads, there are concurrency issues here.

While writing and reading object references are atomic, in the sense that "tearing" is guaranteed not to occur (unlike double or long types), there is no guarantee that writes will ever be visible to other threads unless a memory barrier is used.

In this case, at a minimum, I would make the AccountListCache field in your lookup class volatile. This will ensure that any state of the cache modified before it is assigned to the field is visible to any thread that reads the field.
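A minimal sketch of that lookup class with a volatile field. The type and method names here are assumptions standing in for the asker’s real classes:

```java
import java.time.Instant;
import java.util.List;

// Illustrative types; names are assumptions, not the asker's real classes.
record Account(String accountId) {}
record AccountListCache(Instant whenFetched, List<Account> accounts) {}

class AccountLookup {
    // volatile: the write in publish() happens-before any subsequent read of
    // this field by another thread, so readers always see a fully built cache.
    private volatile AccountListCache cache =
            new AccountListCache(Instant.now(), List.of());

    // Called by the refresh thread: build the whole object first,
    // then publish it with one reference assignment.
    void publish(AccountListCache fresh) {
        this.cache = fresh;
    }

    // Called by reader threads.
    List<Account> currentAccounts() {
        return cache.accounts();
    }
}
```

The key discipline is that the refresh thread finishes populating the new AccountListCache before the single assignment that publishes it.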


Update:

… I'd like to understand both why it fails and how, because I suspect I'm about to face trickier problems in the [near] future.

The authoritative answer to these questions is the Java Language Specification, §17.4. First, you should understand the concepts of "synchronizes-with" (§17.4.4) and "happens-before" (§17.4.5). Then you can refer to the list of synchronization points and happens-before relationships to design concurrent interactions that behave correctly.

In your case, one thread ("the writer") needs to populate a list, then assign it to a field of another object, and have another thread ("the reader") read that field and access the list. Using the happens-before relationships in §17.4.5, we can construct this (partial) chain (where "hb(x, y)" means that x happens-before y):

  • If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
    • x—the cache is initialized (writer)
    • y—the reference to the cache is written to a field (writer)
  • If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
    • x—the reference to the cache is read from a field (reader)
    • y—the content of the cache is accessed (reader)

You might think this is sufficient. But note the key words "of the same thread." This permits operations to be re-ordered if they do not change the view of the world from that thread. In your case, the field referencing the cache might be assigned before fresh items are added to the new cache instance. That change wouldn't be visible inside the writer thread, but other threads could start reading the cache before it's populated.
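The hazard can be made concrete with a sketch (illustrative names; whether the reordering actually manifests depends on the JIT and the hardware):

```java
import java.util.ArrayList;
import java.util.List;

class UnsafePublication {
    static class Cache {
        final List<String> items = new ArrayList<>();
    }

    // NOT volatile: no synchronizes-with edge between writer and readers.
    static Cache cache;

    static void writerThread() {
        Cache fresh = new Cache();
        fresh.items.add("a-1");   // (1) populate the new cache
        cache = fresh;            // (2) publish -- from another thread's point
                                  //     of view, (2) may appear before (1)
    }

    static void readerThread() {
        Cache c = cache;
        if (c != null) {
            // May observe an empty list: nothing forces the write in (1)
            // to become visible before the write in (2).
            System.out.println(c.items.size());
        }
    }
}
```

Within the writer thread itself, program order is always respected, which is exactly why this bug is easy to miss in single-threaded testing.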

When you make the cache field volatile, you introduce a "synchronizes-with" relationship between the write and subsequent reads, because "[a] write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread."

By making the field volatile, we can add more steps to the chain:

  • If an action x synchronizes-with a following action y, then we also have hb(x, y).
    • x—the cache is assigned to the volatile field (writer)
    • y—the cache is read from the volatile field (reader)
  • If hb(x, y) and hb(y, z), then hb(x, z).
    • x—the cache is initialized (writer)
    • y—the cache is assigned (writer)
    • z—the cache is read (reader)

Therefore, we have proved that the cache is initialized by the writer before it is read by the reader. Without the synchronizes-with relationship, that happens-before relationship is missing.

Upvotes: 3
