PiotrK

Reputation: 1552

Synchronizing a sequence of asynchronous calls

I'm using JavaFX's WebView to parse a website. The site contains a bunch of links. I need to open each of them separately, in a given order, and retrieve one piece of information from each.

To make sure that WebView has loaded the whole page, I'm listening to the changed event of WebEngine's load worker and waiting for newState == Worker.State.SUCCEEDED. The problem is that loading is asynchronous. When I call webEngine.load(firstAddress);, the call returns immediately, and before that page has finished loading, my code calls webEngine.load(secondAddress);, and so on.

I understand why it's done this way (why async is better than sync), but I'm a beginner in Java and I'm not sure what the best solution to this problem is. I understand multithreading to some extent, so I've already tried a semaphore-like latch (the CountDownLatch class). But the code hangs on await and I'm not sure what I'm doing wrong.

Could someone please show me the right way to do this? Is there a universal pattern for coping with scenarios like this?

Pseudocode for what I want to achieve:

WebEngine webEngine = new WebEngine();
webEngine.loadPage("http://www.something.com/list-of-cars");
webEngine.waitForThePageToLoad(); // I need an equivalent of this. In the real code, this is done asynchronously as a callback
// ... some HTML parsing or DOM traversing ...
List<String> allCarsOnTheWebsite = webEngine.getDocument()....getChildNodes()...;
// allCarsOnTheWebsite contains URLs to the pages I want to analyze

for (String url : allCarsOnTheWebsite)
{
    webEngine.loadPage(url);
    webEngine.waitForThePageToLoad(); // same as in line 3

    String someDataImInterestedIn = webEngine.getDocument()....getChildNodes()...Value();
    System.out.println(url + " : " + someDataImInterestedIn);
}

System.out.println("Done, all cars have been analyzed");

Upvotes: 1

Views: 619

Answers (2)

James_D

Reputation: 209408

You should use listeners that are invoked when the page has loaded, instead of blocking until loading is done.
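As for why blocking hangs: a likely explanation, assuming await was called on the JavaFX Application Thread, is that the same thread is also the one that delivers the load-state callback, so the latch can never be counted down. The sketch below reproduces that situation in plain Java. The single-threaded executor is only a stand-in for the FX thread, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class LatchDeadlockDemo {

    /** Waits on the very thread that has to deliver the callback. */
    static boolean awaitOnEventThread() throws Exception {
        // Stand-in for the JavaFX Application Thread (an assumption of this demo).
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        try {
            return eventThread.submit(() -> {
                CountDownLatch loaded = new CountDownLatch(1);
                // The "page loaded" callback is queued on this same thread...
                eventThread.execute(loaded::countDown);
                // ...so it cannot run while we block here. The timeout exists only
                // so the demo terminates; a plain await() would hang forever.
                return loaded.await(200, TimeUnit.MILLISECONDS);
            }).get();
        } finally {
            eventThread.shutdown();
        }
    }

    /** Waits on a different thread, so the callback is free to run. */
    static boolean awaitOffEventThread() throws Exception {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        try {
            CountDownLatch loaded = new CountDownLatch(1);
            eventThread.execute(loaded::countDown);
            return loaded.await(2, TimeUnit.SECONDS);
        } finally {
            eventThread.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("await on event thread:  " + awaitOnEventThread());  // false (timed out)
        System.out.println("await off event thread: " + awaitOffEventThread()); // true
    }
}
```

In JavaFX the second variant would mean waiting on a background thread and dispatching each load call with Platform.runLater, since WebEngine has to be used from the FX thread. Chaining listeners avoids the blocking altogether.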

With listeners, the whole sequence looks something like this:

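import java.util.LinkedList;
import java.util.List;
import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.concurrent.Worker.State;
import javafx.scene.web.WebEngine;

WebEngine webEngine = new WebEngine();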
ChangeListener<State> initialListener = new ChangeListener<State>() {
    @Override
    public void changed(ObservableValue<? extends State> obs, State oldState, State newState) {
        if (newState == State.SUCCEEDED) {
            webEngine.getLoadWorker().stateProperty().removeListener(this);
            List<String> allCarsOnTheWebsite = webEngine.getDocument()... ;
            loadPagesConsecutively(allCarsOnTheWebsite, webEngine);
        }
    }
};
webEngine.getLoadWorker().stateProperty().addListener(initialListener);
webEngine.load("http://www.something.com/list-of-cars");

// ...

private void loadPagesConsecutively(List<String> pages, WebEngine webEngine) {
    LinkedList<String> pageStack = new LinkedList<>(pages);
    ChangeListener<State> nextPageListener = new ChangeListener<State>() {
        @Override
        public void changed(ObservableValue<? extends State> obs, State oldState, State newState) {
            if (newState == State.SUCCEEDED ) {
                // process current page data
                // ...
                if (pageStack.isEmpty()) {
                    webEngine.getLoadWorker().stateProperty().removeListener(this);
                } else {
                    // load next page:
                    webEngine.load(pageStack.pop());
                }
            }               
        }
    };
    webEngine.getLoadWorker().stateProperty().addListener(nextPageListener);

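    // load first page, if there is one (pop() would throw on an empty list):
    if (pageStack.isEmpty()) {
        webEngine.getLoadWorker().stateProperty().removeListener(nextPageListener);
    } else {
        webEngine.load(pageStack.pop());
    }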
}
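As for a more universal pattern: one option, which the code above does not use, is to represent each asynchronous call as a CompletableFuture and chain the calls so that each one starts only after the previous one has completed. The following is a minimal plain-Java sketch of that idea. The SequentialAsync class, the processInOrder method and the loader function are hypothetical names. For WebEngine you would have to write a loader yourself that completes its future from the state listener on SUCCEEDED.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class SequentialAsync {

    /**
     * Starts one asynchronous load per URL, strictly one after another,
     * and hands each result to the handler before the next load begins.
     */
    static CompletableFuture<Void> processInOrder(
            List<String> urls,
            Function<String, CompletableFuture<String>> loader,
            BiConsumer<String, String> handler) {
        CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
        for (String url : urls) {
            chain = chain
                    // start this load only after the previous step has completed
                    .thenCompose(ignored -> loader.apply(url))
                    // process the loaded content before moving on
                    .thenAccept(content -> handler.accept(url, content));
        }
        return chain;
    }

    public static void main(String[] args) {
        // Stand-in loader (an assumption): pretends to fetch a page asynchronously.
        Function<String, CompletableFuture<String>> fakeLoader = url ->
                CompletableFuture.supplyAsync(() -> "content of " + url);

        processInOrder(List.of("car-1", "car-2", "car-3"), fakeLoader,
                (url, content) -> System.out.println(url + " : " + content))
                .join(); // blocking here is fine on the main thread of a demo
        System.out.println("Done, all cars have been analyzed");
    }
}
```

In a JavaFX application you would not call join() on the FX Application Thread. Attach a final thenRun(...) to the returned future instead.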

Upvotes: 2

James_D

Reputation: 209408

If you want to run all the tasks concurrently, but process their results in the order the tasks were submitted, have a look at the following example:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import javafx.application.Application;
import javafx.application.Platform;
import javafx.concurrent.Task;
import javafx.scene.Scene;
import javafx.scene.control.ListView;
import javafx.scene.layout.BorderPane;
import javafx.stage.Stage;

public class ProcessTaskResultsSequentially extends Application {

    @Override
    public void start(Stage primaryStage) {
        ListView<String> results = new ListView<>();

        List<Task<Integer>> taskList = new ArrayList<>();
        for (int i = 1; i<= 10 ; i++) {
            taskList.add(new SimpleTask(i));
        }

        ExecutorService exec = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t ;
        });


        Thread processThread = new Thread(() -> {
            for (Task<Integer> task : taskList) {
                try {
                    int result = task.get();
                    Platform.runLater(() -> {
                        results.getItems().add("Result: "+result);
                    });
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });

        processThread.setDaemon(true);
        processThread.start();

        taskList.forEach(exec::submit);

        primaryStage.setScene(new Scene(new BorderPane(results), 250, 400));
        primaryStage.show();
    }

    public static class SimpleTask extends Task<Integer> {
        private final int index ;

        private final static Random rng = new Random();

        public SimpleTask(int index) {
            this.index = index ;
        }

        @Override
        public Integer call() throws Exception {
            System.out.println("Task "+index+" called");
            Thread.sleep(rng.nextInt(1000)+1000);
            System.out.println("Task "+index+" finished");
            return index ;
        }
    }

    public static void main(String[] args) {
        launch(args);
    }
}
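The core of the example above does not depend on JavaFX: submit everything first, then call get() on the futures in submission order. Here is a stripped-down sketch of just that part, using only java.util.concurrent. The OrderedResults class and runConcurrently method are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class OrderedResults {

    /** Runs all jobs concurrently and returns their results in submission order. */
    static <T> List<T> runConcurrently(List<Callable<T>> jobs) throws Exception {
        ExecutorService exec = Executors.newCachedThreadPool();
        try {
            // Submit everything up front so the jobs run in parallel.
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> job : jobs) {
                futures.add(exec.submit(job));
            }
            // get() blocks until that particular job is done, so iterating the
            // futures in list order yields results in submission order.
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            final int index = i;
            // Later jobs sleep for less time, so they finish first.
            jobs.add(() -> {
                Thread.sleep(100L * (6 - index));
                return index;
            });
        }
        System.out.println(runConcurrently(jobs)); // [1, 2, 3, 4, 5]
    }
}
```

As in the JavaFX version, the waiting loop should run on a background thread rather than on the FX Application Thread.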

Upvotes: 0
