Jordi
Jordi

Reputation: 23237

java concurrency: CopyOnWriteArrayList strategy

I'm trying to understand CopyOnWriteArrayList into my code:

My code is:

public class AuditService {
    private CopyOnWriteArrayList<Audit> copyWrite;

    public void flush(Audit... audits) {
        Collection<Audit> auditCollection = Arrays.asList(audits);
        this.copyWrite.addAll(auditCollection);

        this.copyWrite.forEach(audit -> {
            try {
              // save audit object on database
              this.copyWrite.remove(audit);
            } catch (DataAccessException e) {
              // log it
            }

        });
    }
}

What this code do is:

  1. First stores audits into a a buffer, CopyOnWriteArrayList.
  2. Tries to save an audit to database
  3. When it's been stored, it's removed from buffer CopyOnWriteArrayList.

Others:

  1. AuditService is a singleton class
  2. flush method can be reached by several threads.

Questions:

  1. I guess that this.copyWrite.forEach(audit -> {... can be reached at the same time by several threads: Does it mean a same audit object can be tried to be saved on database twice?
  2. Each time a modification operation is made on CopyOnWriteArrayList, a new copy is populated on others threads? How is it populated?

Upvotes: 1

Views: 337

Answers (1)

Alexandre Dupriez
Alexandre Dupriez

Reputation: 3036

Every time remove is called, a new copy of the internal array backing the CopyOnWriteArrayList is made. Future accesses to that list with accessor and mutator methods will have visibility on the update.

However, the method CopyOnWriteArrayList#foreach method iterates on the array which was available at the time it was called. This means all executions of the method flush which entered the foreach prior to any update on the list will iterate on a stale version of the array.

As such, during parallel executions of the method flush, the same Audit elements will be persisted more than once, and up to d times if d is the number of maximal concurrent executions of the method flush.

Another problem with the use of CopyOnWriteArrayList in this case is since a new copy is made for every call to remove, the complexity of your code is d.n^2 where n is the length of the list and d defined above.

The CopyOnWriteArrayList is not the correct implementation to use here. There are multiple possible suitable designs possible. One of them is to use a LinkedBlockingQueue as follows [*]:

public void flush(Audit... audits) {
    Collection<Audit> auditCollection = Arrays.asList(audits);
    this.queue.addAll(auditCollection);

    Collection<Audit> poissonedAudits = new ArrayList<Audit>();
    Audit audit = null;
    while ((audit = this.queue.poll()) != null) {
       try {
          // save audit object on database
          queue.remove(audit);
        } catch (DataAccessException e) {
          // log it
          poissonedAudits.add(audit);
        }
    }
    this.queue.addAll(poissonedAudits);
}

The call to LinkedBlockingQueue#poll() is thread-safe and atomic. The same element can never be polled multiple times (as long as it isn't added to the queue more than once, cf [*]). Complexity is linear in n.

Two points to consider:

  • Your collection of audit must not contain null elements, as this is forbidden for a LinkedBlockingQueue, because null is used to indicate the list is empty.
  • You should take care not to use blocking methods such as take of poll(timeout, unit) to poll the queue.

[*] There has been a change in the flush method which new does not perform a copy of Audit elements. I am not sure whether there is a guarantee these elements are distinct if the flush method is called in parallel. If the array of Audit is the same for all invocations of flush, the queue should be populated only once, prior to any call to flush.

Upvotes: 2

Related Questions