Reputation: 2446
I am trying to design an Azure worker role routine. The worker role polls a job queue, and each job message specifies the number of threads that job requires. A job is one run of an executable, for example Rax.exe, which can run on a varying number of threads: calling Rax.exe -T 2 makes it create two threads. So we do not have to create threads ourselves; we just call Rax.exe with the appropriate command-line argument. I have Extra-Large worker instances, so I can run 8 threads simultaneously, and I want to utilize the workers as much as I can. We may have many jobs, each specifying a different number of threads.
Example:
Job Queue:
1 Rax.exe -T 3
2 Rax.exe -T 5
3 Rax.exe -T 1
4 Rax.exe -T 8
5 Rax.exe -T 4
In this example, we have 5 jobs. A worker reads the first message and starts the job, which consumes 3 threads. A worker can run 8 threads, so the remaining 5 can be used to run another job from the queue.
Currently, I do not know how to run multiple processes inside a worker role. I am using the WaitForExit method of the Process class, which blocks until the process ends. Each running instance of the executable creates output files, so I also have to collect those generated files.
My Questions:
1- How can I start multiple processes asynchronously and be notified when they exit? I have to do this while still polling the job queue.
2- Is this kind of job scheduling a hard problem? Can anyone come up with a good heuristic?
EDIT: I think estimating the required running time for each job will be helpful. This kind of information exists. With this information, can the problem be solved?
Upvotes: 1
Views: 1064
Reputation: 675
You should use multiple worker role instances.
This is how multi-processing is done on the Azure platform. You can have more than one role instance grabbing items off the same queue, which is how the system has been designed.
Upvotes: 0
Reputation: 66882
1- How can I start multiple processes asynchronously and be notified when they exit? I have to do this while still polling the job queue.
This one's quite simple: instead of using WaitForExit, you can subscribe to the Exited event. Note that Exited is only raised if the process's EnableRaisingEvents property is set to true.
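The control flow behind this (start the processes without blocking, keep doing other work, react when each one exits) can be sketched in a language-neutral way. The Python below is only an illustration of that pattern, not the .NET API; `run_without_blocking` and the `on_exit` callback are hypothetical names, and the two child commands are stand-ins for Rax.exe.

```python
import subprocess
import sys
import time


def run_without_blocking(commands, on_exit, poll_interval=0.05):
    """Start every command at once, then report each exit via on_exit."""
    # Popen returns immediately; nothing here waits for a process to finish.
    running = {name: subprocess.Popen(cmd) for name, cmd in commands.items()}
    while running:
        for name, proc in list(running.items()):
            code = proc.poll()  # None while the process is still running
            if code is not None:
                on_exit(name, code)  # e.g. collect the job's output files
                del running[name]
        # In a worker role, this is where the job queue would be polled.
        time.sleep(poll_interval)


finished = {}
run_without_blocking(
    {
        "job-a": [sys.executable, "-c", "import sys; sys.exit(0)"],
        "job-b": [sys.executable, "-c", "import sys; sys.exit(3)"],
    },
    on_exit=lambda name, code: finished.update({name: code}),
)
print(finished)
```

In .NET the polling loop is unnecessary, because the Exited handler is invoked for you; the sketch only shows that the caller never sits in a blocking wait.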
2- Is this kind of job scheduling a hard problem? Can anyone come up with a good heuristic?
As Erno has suggested in his comment, one good way to solve this problem is to hand it off to the Parallel Task API. While a general multi-thread scheduling algorithm might not produce the optimal schedule, it can provide a really good solution for very little effort, and given the complexity of the work, a general scheduling algorithm can sometimes outperform a hand-crafted solution.
If you are interested in scheduling approaches for batch processes on Azure, then it might be worth looking at some of the map-reduce type projects on Azure:
While these approaches are mainly about distributing work across multiple machines, the same kind of approach can apply to distributing work across multiple cores within the same machine.
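If you do want to hand-craft something for a single machine, one simple option is a greedy first-fit heuristic: whenever cores are free, start every queued job that fits, in queue order. This is not what the Parallel Task API does, and it is not guaranteed to be optimal; it is a minimal sketch that simulates the idea on the five jobs from the question. The function name `simulate_first_fit` and the running-time estimate of 10 per job are assumptions made up for illustration.

```python
import heapq


def simulate_first_fit(jobs, total_cores=8):
    """Simulate greedy first-fit scheduling.

    jobs: list of (job_id, threads, estimated_duration) in queue order.
    Returns (start_times, makespan).
    """
    pending = list(jobs)
    running = []  # min-heap of (finish_time, job_id, threads)
    free = total_cores
    now = 0.0
    starts = {}
    while pending or running:
        # Start every pending job that fits in the free cores, in queue order.
        still_waiting = []
        for job_id, threads, duration in pending:
            if threads <= free:
                free -= threads
                starts[job_id] = now
                heapq.heappush(running, (now + duration, job_id, threads))
            else:
                still_waiting.append((job_id, threads, duration))
        pending = still_waiting
        if not running:
            raise ValueError("a job needs more cores than the machine has")
        # Jump to the next completion and release that job's cores.
        now, _, threads = heapq.heappop(running)
        free += threads
    return starts, now


# Jobs from the question; the duration of 10 per job is a made-up estimate.
jobs = [(1, 3, 10), (2, 5, 10), (3, 1, 10), (4, 8, 10), (5, 4, 10)]
starts, makespan = simulate_first_fit(jobs)
print(starts, makespan)
```

One design caveat: because smaller jobs are allowed to jump ahead, a job that needs all 8 cores (job 4 here) can be delayed for a long time when small jobs keep arriving, so a real scheduler would likely need an aging or reservation rule.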
Upvotes: 1