Reputation:

ABA with Clojure Software Transactional Memory

I was wondering wether Clojure has a built in solution for the ABA-problem. I was creating an example that shows this problem, but somehow Clojure detects the changes. Is this because Clojure's transactions compare the references and not the values?

My example:

(def x (ref 42))
(def io (atom false))
(def tries (atom 0))

(def t1 (Thread. (fn [] (dosync (commute x - 42)))))
(def t2 (Thread. (fn [] (dosync 
                          (Thread/sleep 100)
                          (commute x + 42)))))
(def t3 (Thread. 
  (fn [] 
    (dosync 
      (do 
        (Thread/sleep 1000) 
        (swap! tries inc) 
        (if (= 42 @x) 
          (reset! io true)))))))

(.start t3)
(.start t1)
(.start t2)

(.join t1)
@x
(.join t2)
@x
(.join t3)
@tries
(if (= true @io) (println "The answer is " @x))

The tries count is always 2, so the transaction t3 must have noticed the ref changes of t1 and t2. Does someone know the cause of this behavior?

Upvotes: 0

Answers (2)

Michał Marczyk

Reputation: 84331

Before answering the question at hand, let me just say that by far the best source of information on Clojure's STM – besides the source code itself – that I am aware of is Mark Volkmann's Software Transactional Memory article (the link points at a changelog page, follow the link to the latest version from there). It's incredibly comprehensive. (Don't worry about the 2009 timestamp, the STM doesn't change much.) If you want to think through exactly how things work in scenarios such as this one, I highly recommend reading it.

As for the scenario at hand:

For an in-transaction read of a Ref, the STM promises to return a value that was committed before the current transaction try. (Unless of course the current transaction try has itself set the in-transaction value of the Ref in question.) This value may or may not be the most recent value written to the Ref, however if it is not, the read needs to be satisfied from the Ref's history. If the Ref's history contains no such value, then a fault is recorded for the Ref and the transaction retries. Subsequently the length of the Ref's history chain may be increased because of the fault, up to the Ref's maximum history length (10 by default), though note that this will only happen if there's an opportunity (another write to the Ref) and will only be of help to transactions started "sufficiently late" (so that their timestamps are later than those of some value recorded in the history).

In the case at hand, by the time t3 gets round to reading the Ref, t1 and t2 will have completed their writes to x with no issues and x will no longer be able to satisfy a read request demanding a value from before t3's first try. (This is because a Ref's history chain by default starts out at length 0, meaning no historical values are preserved.) So t3 has to record a fault for x and retry.

(If you rerun the three transactions against the same Ref and helper Atoms – say, by pasting all except the top three lines into your REPL again – you'll see tries jump to 4 on the second run, and then to 5 on the third, indicating that at that point a historical value is available.)

On the ABA problem:

The ABA problem is not relevant to the STM, because in a proper ABA scenario the "B" is written to the memory location (1) by a different thread and (2) after the first read of "A" by the "main" thread (the one that's meant to suffer from the ABA problem), and then similarly the second "A" is written (1) by a different thread and (2) after the "B" write, and both "As" are observed by the main thread, but the "B" is not – but as explained above, in an STM transaction you cannot observe a value written to the Ref by a different thread after your transaction try started, so if you observe the first "A", you won't be able to observe the "B" or the second "A".

That does not mean that no concurrency-related problems are possible with the STM – it's fairly easy to run into write skew (described in the Wikipedia article on snapshot isolation – this is what the ensure function is designed to fix, but it's up to user code to call it where appropriate), commute can be misused &c.

Upvotes: 1

Alan Thompson

Reputation: 29958

You are correct that this is the expected behavior (although I would have expected tries to be 1). Besides the many Clojure books discussing Software Transactional Memory (STM), you may also wish to review

Also, it is usually best to use alter instead of commute, which is easy to get wrong and is usually a case of "premature optimization".

Upvotes: 0

ABA with Clojure Software Transactional Memory

Answers (2)

Related Questions