Reputation: 17971
I have been working with SwingWorkers for a while and have ended up with a strange behavior, at least for me. I clearly understand that, for performance reasons, several invocations of the publish() method are coalesced into one invocation. It makes perfect sense to me and I suspect SwingWorker keeps some kind of queue to process all those calls.
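For instance, a minimal sketch of a process() override that logs the batch size makes this coalescing visible (several published values may arrive in a single chunks list):

@Override
protected void process(List<String> chunks) {
    // chunks.size() is often > 1: several publish() calls were coalesced
    System.out.println("Received " + chunks.size() + " chunks in one process() call");
}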
According to the tutorial and the API docs, when SwingWorker ends its execution, either because doInBackground() finishes normally or because the worker thread is cancelled from the outside, the done() method is invoked. So far so good.
But I have an example (similar to the one shown in the tutorial) where process() method calls happen after the done() method is executed. Since both methods execute in the Event Dispatch Thread, I would expect done() to be executed after all process() invocations are finished. In other words, I expect this output:
Writing...
Writing...
Stopped!

But sometimes I get this sequence instead:

Writing...
Stopped!
Writing...

Here is an example that reproduces it:
import java.awt.BorderLayout;
import java.awt.Dimension;
import java.awt.Graphics;
import java.awt.event.ActionEvent;
import java.util.List;
import javax.swing.AbstractAction;
import javax.swing.Action;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;
import javax.swing.SwingWorker;
public class Demo {
private SwingWorker<Void, String> worker;
private JTextArea textArea;
private Action startAction, stopAction;
private void createAndShowGui() {
startAction = new AbstractAction("Start writing") {
@Override
public void actionPerformed(ActionEvent e) {
Demo.this.startWriting();
this.setEnabled(false);
stopAction.setEnabled(true);
}
};
stopAction = new AbstractAction("Stop writing") {
@Override
public void actionPerformed(ActionEvent e) {
Demo.this.stopWriting();
this.setEnabled(false);
startAction.setEnabled(true);
}
};
JPanel buttonsPanel = new JPanel();
buttonsPanel.add(new JButton(startAction));
buttonsPanel.add(new JButton(stopAction));
textArea = new JTextArea(30, 50);
JScrollPane scrollPane = new JScrollPane(textArea);
JFrame frame = new JFrame("Test");
frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
frame.add(scrollPane);
frame.add(buttonsPanel, BorderLayout.SOUTH);
frame.pack();
frame.setLocationRelativeTo(null);
frame.setVisible(true);
}
private void startWriting() {
stopWriting();
worker = new SwingWorker<Void, String>() {
@Override
protected Void doInBackground() throws Exception {
while(!isCancelled()) {
publish("Writing...\n");
}
return null;
}
@Override
protected void process(List<String> chunks) {
String string = chunks.get(chunks.size() - 1);
textArea.append(string);
}
@Override
protected void done() {
textArea.append("Stopped!\n");
}
};
worker.execute();
}
private void stopWriting() {
if(worker != null && !worker.isCancelled()) {
worker.cancel(true);
}
}
public static void main(String[] args) {
SwingUtilities.invokeLater(new Runnable() {
@Override
public void run() {
new Demo().createAndShowGui();
}
});
}
}
Upvotes: 20
Views: 5284
Reputation: 15632
Having read DSquare's superb answer, and concluded from it that some subclassing would be needed, I've come up with this idea for anyone who needs to make sure all published chunks have been processed in the EDT before moving on.
NB I tried to write it in Java rather than Jython (my language of choice and officially the best language in the world), but it is a bit complicated because, for example, publish is final, so you'd have to invent another method to call it, and also because you have to (yawn) parameterise everything with generics in Java.
This code should be understandable by any Java person. Just to help: with self.publication_counter.get(), this evaluates to False when the result is 0.
# Jython imports assumed for this sketch
import time
import java.util.concurrent.atomic
import javax.swing

# this is how you say "Worker... is a subclass of SwingWorker" in Python/Jython
class WorkerAbleToWaitForPublicationToFinish( javax.swing.SwingWorker ):
    # __init__ is the constructor method in Python/Jython
    def __init__( self ):
        # we just add an "attribute" (here, publication_counter) to the object being created (self) to create a field of the new object
        self.publication_counter = java.util.concurrent.atomic.AtomicInteger()

    def await_processing_of_all_chunks( self ):
        while self.publication_counter.get():
            time.sleep( 0.001 )

    # fully functional override of the Java method
    def process( self, chunks ):
        for chunk in chunks:
            # DO SOMETHING WITH EACH CHUNK
            pass
        # decrement the counter by the number of chunks received
        # NB do this AFTER dealing with the chunks
        self.publication_counter.addAndGet( - len( chunks ) )

    # fully functional override of the Java method
    def publish( self, *chunks ):
        # increment the counter by the number of chunks received
        # NB do this BEFORE publishing the chunks
        self.publication_counter.addAndGet( len( chunks ) )
        self.super__publish( chunks )
So in your calling code, you put something like:
engine.update_xliff_task.get()
engine.update_xliff_task.await_processing_of_all_chunks()
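For Java readers, here is a minimal, untested sketch of the same counting idea (the names CountingWorker, publishAndCount and awaitProcessingOfAllChunks are invented for illustration; as noted above, publish is final in Java, hence the wrapper method):

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import javax.swing.SwingWorker;

// Sketch of the counting idea in Java. Since publish(...) is final,
// doInBackground() must call the wrapper publishAndCount() instead.
abstract class CountingWorker<T, V> extends SwingWorker<T, V> {
    private final AtomicInteger publicationCounter = new AtomicInteger();

    @SafeVarargs
    protected final void publishAndCount(V... chunks) {
        publicationCounter.addAndGet(chunks.length); // count BEFORE publishing
        publish(chunks);
    }

    // Subclasses must call this at the END of their process() override.
    protected void chunksProcessed(List<V> chunks) {
        publicationCounter.addAndGet(-chunks.size()); // decrement AFTER processing
    }

    // Polls until every published chunk has been processed in the EDT.
    public void awaitProcessingOfAllChunks() throws InterruptedException {
        while (publicationCounter.get() != 0) {
            Thread.sleep(1);
        }
    }
}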
PS the use of a while clause like this (i.e. a polling technique) is hardly elegant. I looked at the available java.util.concurrent classes such as CountDownLatch and Phaser (both with thread-blocking methods), but I don't think either would suit this purpose...
Later:
I was interested enough in this to tweak a proper concurrency class (written in Java, found on the Apache site) called CounterLatch. Their version stops the thread at await() if a given value of an AtomicLong counter is reached. My version here allows you either to do that, or the opposite: to say "wait until the counter reaches a certain value before lifting the latch":
NB the use of AtomicLong for signal and AtomicBoolean for released is because in the original Java they use the volatile keyword; I think using the atomic classes will achieve the same purpose.
class CounterLatch():
    def __init__( self, initial = 0, wait_value = 0, lift_on_reached = True ):
        self.count = java.util.concurrent.atomic.AtomicLong( initial )
        self.signal = java.util.concurrent.atomic.AtomicLong( wait_value )
        self.released = java.util.concurrent.atomic.AtomicBoolean() # initialised at False

        class Sync( java.util.concurrent.locks.AbstractQueuedSynchronizer ):
            def tryAcquireShared( sync_self, arg ):
                if lift_on_reached:
                    return -1 if (( not self.released.get() ) and self.count.get() != self.signal.get() ) else 1
                else:
                    return -1 if (( not self.released.get() ) and self.count.get() == self.signal.get() ) else 1
            def tryReleaseShared( sync_self, arg ):
                return True

        self.sync = Sync()

    def await( self, *args ):
        if args:
            assert len( args ) == 2
            assert type( args[ 0 ] ) is int
            timeout = args[ 0 ]
            assert type( args[ 1 ] ) is java.util.concurrent.TimeUnit
            unit = args[ 1 ]
            return self.sync.tryAcquireSharedNanos( 1, unit.toNanos( timeout ) )
        else:
            self.sync.acquireSharedInterruptibly( 1 )

    def count_relative( self, n ):
        # NB addAndGet returns the UPDATED value, not the previous one
        updated = self.count.addAndGet( n )
        if updated == self.signal.get():
            self.sync.releaseShared( 0 )
        return updated
So my code now looks like this:
In the SwingWorker constructor:
self.publication_counter_latch = CounterLatch()
In SW.publish:
self.publication_counter_latch.count_relative( len( chunks ) )
self.super__publish( chunks )
In the thread waiting for chunk processing to stop:
worker.publication_counter_latch.await()
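And again for Java readers, a rough, untested Java sketch of the latch above; it implements only the "wait until the counter reaches the signal value" direction, without the lift_on_reached flag:

import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Sketch: await() blocks until the counter reaches the signal value.
class CounterLatch {
    private final AtomicLong count;
    private final AtomicLong signal;
    private final AtomicBoolean released = new AtomicBoolean(false);

    private final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected int tryAcquireShared(int arg) {
            // Negative return blocks the caller; positive lets it through.
            return (!released.get() && count.get() != signal.get()) ? -1 : 1;
        }
        @Override
        protected boolean tryReleaseShared(int arg) {
            return true;
        }
    }

    private final Sync sync = new Sync();

    CounterLatch(long initial, long waitValue) {
        count = new AtomicLong(initial);
        signal = new AtomicLong(waitValue);
    }

    public void await() throws InterruptedException {
        sync.acquireSharedInterruptibly(1);
    }

    public long countRelative(long n) {
        long updated = count.addAndGet(n);
        if (updated == signal.get()) {
            sync.releaseShared(0); // lift the latch
        }
        return updated;
    }
}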
Upvotes: 0
Reputation: 2468
SHORT ANSWER:
This happens because publish() doesn't directly schedule process(); it sets a timer which will fire the scheduling of a process() block in the EDT after DELAY milliseconds. So when the worker is cancelled there is still a timer waiting to schedule a process() with the data of the last publish(). The reason for using a timer is to implement the optimization whereby a single process() may be executed with the combined data of several publishes.
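A quick way to observe this ordering yourself is to timestamp the two EDT callbacks in the question's worker; a minimal sketch:

@Override
protected void process(List<String> chunks) {
    // Runs in the EDT; can still fire once after done() below.
    System.out.println(System.currentTimeMillis() + " process(), isCancelled=" + isCancelled());
}

@Override
protected void done() {
    // Runs in the EDT as soon as cancel(true) is called.
    System.out.println(System.currentTimeMillis() + " done()");
}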
LONG ANSWER:
Let's see how publish() and cancel() interact with each other; for that, let us dive into some source code.
First the easy part, cancel(true):
public final boolean cancel(boolean mayInterruptIfRunning) {
return future.cancel(mayInterruptIfRunning);
}
This cancel ends up calling the following code:
boolean innerCancel(boolean mayInterruptIfRunning) {
for (;;) {
int s = getState();
if (ranOrCancelled(s))
return false;
if (compareAndSetState(s, CANCELLED)) // <-----
break;
}
if (mayInterruptIfRunning) {
Thread r = runner;
if (r != null)
r.interrupt(); // <-----
}
releaseShared(0);
done(); // <-----
return true;
}
The SwingWorker state is set to CANCELLED, the thread is interrupted and done() is called. However, this is not SwingWorker's done() but the future's done(), which is specified when the variable is instantiated in the SwingWorker constructor:
future = new FutureTask<T>(callable) {
@Override
protected void done() {
doneEDT(); // <-----
setState(StateValue.DONE);
}
};
And the doneEDT() code is:
private void doneEDT() {
Runnable doDone =
new Runnable() {
public void run() {
done(); // <-----
}
};
if (SwingUtilities.isEventDispatchThread()) {
doDone.run(); // <-----
} else {
doSubmit.add(doDone);
}
}
This calls SwingWorker's done() directly if we are in the EDT, which is our case. At this point the SwingWorker should stop; no more publish() calls should be made. This is easy enough to demonstrate with the following modification:
while(!isCancelled()) {
textArea.append("Calling publish\n");
publish("Writing...\n");
}
However, we still get a "Writing..." message from process(). So let us see how process() gets called. The source code for publish(...) is:
protected final void publish(V... chunks) {
synchronized (this) {
if (doProcess == null) {
doProcess = new AccumulativeRunnable<V>() {
@Override
public void run(List<V> args) {
process(args); // <-----
}
@Override
protected void submit() {
doSubmit.add(this); // <-----
}
};
}
}
doProcess.add(chunks); // <-----
}
We see that the run() of the Runnable doProcess is what ends up calling process(args), but this code just calls doProcess.add(chunks), not doProcess.run(), and there's a doSubmit involved too. Let's look at doProcess.add(chunks):
public final synchronized void add(T... args) {
boolean isSubmitted = true;
if (arguments == null) {
isSubmitted = false;
arguments = new ArrayList<T>();
}
Collections.addAll(arguments, args); // <-----
        if (!isSubmitted) { // this is why multiple publishes can result in a single process() execution
submit(); // <-----
}
}
So what publish() actually does is add the chunks to an internal ArrayList named arguments and call submit(). We just saw that submit() simply calls doSubmit.add(this), which is this very same add method, since both doProcess and doSubmit extend AccumulativeRunnable<V>; however, this time V is Runnable instead of String as in doProcess. So here a chunk is the runnable that calls process(args). The submit() call, though, is a completely different method, defined in the class of doSubmit:
private static class DoSubmitAccumulativeRunnable
extends AccumulativeRunnable<Runnable> implements ActionListener {
private final static int DELAY = (int) (1000 / 30);
@Override
protected void run(List<Runnable> args) {
for (Runnable runnable : args) {
runnable.run();
}
}
@Override
protected void submit() {
Timer timer = new Timer(DELAY, this); // <-----
timer.setRepeats(false);
timer.start();
}
public void actionPerformed(ActionEvent event) {
run(); // <-----
}
}
It creates a Timer that fires the actionPerformed code once, after DELAY milliseconds. Once the event is fired, the code is enqueued in the EDT, which will call an internal run() that ends up calling run(flush()) of doProcess and thus executing process(chunk), where chunk is the flushed data of the arguments ArrayList. I skipped some details; the chain of "run" calls is like this:

actionPerformed() -> doSubmit.run() -> doSubmit.run(flush()) (*) -> doProcess.run() -> doProcess.run(flush()) -> process(chunk)
(*) The boolean isSubmitted and flush() (which resets this boolean) make it so that additional calls to publish() don't add more doProcess runnables to be called in doSubmit.run(flush()); their data, however, is not ignored. Thus a single process() is executed for any number of publishes called during the life of a Timer.
All in all, what publish("Writing...") does is schedule the call to process(chunk) in the EDT after a DELAY. This explains why, even after we have cancelled the thread and no more publishes are done, one more process() execution still appears: at the moment we cancel the worker there is (with high probability) a Timer that will schedule a process() after done() is already scheduled.
Why is this Timer used instead of just scheduling process() in the EDT with an invokeLater(doProcess)? To implement the performance optimization explained in the docs:
Because the process method is invoked asynchronously on the Event Dispatch Thread multiple invocations to the publish method might occur before the process method is executed. For performance purposes all these invocations are coalesced into one invocation with concatenated arguments. For example:
publish("1"); publish("2", "3"); publish("4", "5", "6"); might result in: process("1", "2", "3", "4", "5", "6")
We now know that this works because all the publishes that occur within a DELAY interval add their args to the internal arguments ArrayList we saw, and the single process(chunk) then executes with all that data in one go.
IS THIS A BUG? WORKAROUND?
It's hard to tell if this is a bug or not. It might make sense to process the data that the background thread has published, since the work is actually done and you might be interested in updating the GUI with as much info as you can (if that's what process() is doing, for example). Then again, it might not make sense if done() requires all the data to have been processed, and/or if a call to process() after done() creates data or GUI inconsistencies.
There's an obvious workaround if you don't want any new process() to be executed after done(): simply check whether the worker is cancelled in the process method too!
@Override
protected void process(List<String> chunks) {
if (isCancelled()) return;
String string = chunks.get(chunks.size() - 1);
textArea.append(string);
}
It's trickier to make done() execute after that last process(); for example, done() could itself use a timer that schedules the actual done() work after more than DELAY milliseconds. Although I can't think this would be a common case: if you cancelled, it shouldn't be important to miss one more process() when we know that we are in fact cancelling the execution of all the future ones.
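A minimal sketch of that timer idea, applied to the question's done() (the 100 ms margin is an arbitrary value chosen to exceed the internal DELAY of 1000/30 ms; nothing in the API guarantees it):

@Override
protected void done() {
    // Defer the real done() work with a one-shot Swing Timer so that any
    // already-armed process() call gets to run first.
    javax.swing.Timer deferred = new javax.swing.Timer(100, new java.awt.event.ActionListener() {
        @Override
        public void actionPerformed(java.awt.event.ActionEvent e) {
            textArea.append("Stopped!\n"); // the actual done() work
        }
    });
    deferred.setRepeats(false);
    deferred.start();
}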
Upvotes: 18