Kevin Postlewaite
Kevin Postlewaite

Reputation: 615

Using JGraphT to Manage Ordering of Dependent Tasks

I have a list of tasks that have dependencies between them and I was considering how I could use JGraphT to manage the ordering of the tasks. I would set up the graph as a directed graph and remove vertices as I processed them (or should I mask them?). I could use TopologicalOrderIterator if I were only going to execute one task at a time but I'm hoping to parallelize the tasks. I could get TopologicalOrderIterator and check Graphs.vertexHasPredecessors until I find as many as I want to execute as once but ideally, there would be something like Graphs.getVerticesWithNoPredecessors. I see that Netflix provides a utility to get leaf vertices, so I could reverse the graph and use that, but it's probably not worth it. Can anyone point me to a better way? Thanks!

Upvotes: 1

Views: 693

Answers (1)

Joris Kinable
Joris Kinable

Reputation: 2411

A topological order may not necessary be what you want. Here's an example why not. Given the following topological ordering of tasks: [1,2,3,4], and the arcs (1,3), (2,3). That is, task 1 needs to be completed before task 3, similar for 2 and 4. Let's also assume that task 1 takes a really long time to complete. So we can start processing tasks 1, and 2 in parallel, but you cannot start 3 before 1 completes. Even though task 2 completes, we cannot start task 4 because task 3 is the next task in our ordering and this task is being blocked by 1.

Here's what you could do. Create an array dep[] which tracks the number of unfulfilled dependencies per task. So dep[i]==0 means that all dependencies for task i have been fulfilled, meaning that we can now perform task i. If dep[i]>0, we cannot perform task i yet. Lets assume that there is a task j which needs to be performed prior to task i. As soon as we complete task j, we can decrement the number of unfulfilled dependencies of task i, i.e: dep[i]=dep[i]-1. Again, if dep[i]==0, we are now ready to process task i. So in short, the algorithm in pseudocode would look like this:

  1. Initialize dep[] array.
  2. Start processing in parallel all tasks i with dep[i]==0
  3. if a task i completes, decrement dep[j] for all tasks j which depend on i. If task j has dep[j]==0, start processing it.

You could certainly use a Directed Graph to model the dependencies. Each time you complete a task, you could simply iterate over the outgoing neighbors (in jgrapht use the successorsOf(vertex) function). The DAG can also simply be used to check feasibility: if the graph contains a cycle, you have a problem in your dependencies. However, if you don't need this heavy machinery, I would simply create a 2-dimensional array where for each task i you store the tasks that are dependent on i.

The resulting algorithm runs in O(n+m) time, where n is the number of tasks and m the number of arcs (dependencies). So this is very efficient.

Upvotes: 2

Related Questions