prioritized queue while preventing stravation

Question

Here's my problem:
I have thousands of sensors to poll. Each sensor has a polling interval (not fixed, but rather changing over time).

So I need to poll them one after one, efficiently.
For that cause, I could use a priority queue (with poll-timestamps as keys).

The problem is that sensors could be neglected:
For example, if there's a sensor which needs be polled every 15 minutes and there are plenty of sensors need to be polled every 5 minutes then the priority queue could favor them over the first one.

So I could use few priority-queues instead of one, but the question of how to pick the next sensor to poll, remains.

What could be an ideal solution to my problem?

Thanks.

erickson · Accepted Answer

If you use the scheduled polling time as the priority, you might not be able to keep up with polling, but you won't have a starvation problem where some sensors are never polled.

Java's DelayQueue is a good fit for this. Store the scheduled time in the Delayed instance (based on System.nanoTime()), and then implement getDelay() to compute the difference between that and the current System.nanoTime().

Because DelayedQueue is a concurrent queue, multiple consumers can take tasks and run them, and reschedule new tasks independently. This will help keep up with the load and maintain your schedule, because while a slow sensor is being polled, another thread can be waiting for the next task to become eligible for execution.

Suppose you have a two sensors: sensor A that is polled every second, and sensor B that is polled every 3 seconds, and queue that is ordered by scheduled polling time.

time 0: A polled, rescheduled for time 1
        B polled, rescheduled for time 3
time 1: A polled, rescheduled for time 2
time 2: A polled, rescheduled for time 3
time 3: A polled, rescheduled for time 4
        B polled, rescheduled for time 6

The order of polling at time 3 isn't determinate, but B will definitely get its turn for polling before any sensors scheduled at time 4.

From an OP comment:

sensors don't get out of the queue, but rather their keys get updated

Then you are doing it wrong. A priority queue doesn't have keys, but if they did, they wouldn't support key mutation.

A queue is a collection to which producers add elements, and from which consumers remove elements. If its contents aren't changing, use a different collection.

If there are times when there's too much work to do everything on schedule, you have two fundamental options: do everything, and accept that some will be late, or skip some work, in order to complete the most important tasks in a timely manner. Which is appropriate depends on your application, but the first approach is fragile in the sense that you might never catch up.

Do everything late

If you have to do all the work, but don't care when, you might use two priorities. The first is a schedule, as described already. The second is an "importance" metric, which is used to prioritize tasks that are ready for execution.

A small number of worker threads, maybe just one, consume the DelayedQueue, waiting for tasks based on their schedule. However, instead of executing the task directly, these workers simply put tasks into another BlockingQueue based on their importance. A larger pool of workers consumes this queue, executing the most important tasks first.

If the average load is greater than the available capacity, this queue will continue to grow. Otherwise, all the work will be done eventually, and you can devise any policy you like to prioritize tasks once they are ready to execute.

Do something on time

It may be that a late sample is worthless, and it would be better to skip polling rather than get further behind. In that case, a "grace period" could be specified for each task. When it comes out of the DelayQueue, check to see whether the current time is within the grace period for execution. If so, go ahead and execute the task. Otherwise, discard the task and get the next. The grace period may be a function of the polling frequency, with longer periods having a longer grace period, while skipping high frequency tasks helps clear the queue quickly without missing samples for long periods.

Hybrid

You can combine the "grace period" idea with the first approach, giving the most important tasks the best chance to be executed on time, but skipping tasks when its too late for them to be useful in order to catch up.

Single-threading

All of these strategies are still useful with a single thread executing tasks, but staying on schedule is a pretty tall order when one thread is serially executing relatively long-running tasks like polling a physical sensor.

Even if the processor is single-core and only executes one thread at a time, it will still support thread scheduling through context-switching so that progress can be made on another task while others are blocked on I/O. This will do a much better job of preserving the desired schedule.

prioritized queue while preventing stravation

Answers (2)

Do everything late

Do something on time

Hybrid

Single-threading

Related Questions