ARKBAN
ARKBAN

Reputation: 3477

Sorting algorithm for a non-comparison based sort problem?

I am currently faced with a difficult sorting problem. I have a collection of events that need to be sorted against each other (a comparison sort) and against their relative position in the list.

In the simplest terms I have list of events that each have a priority (integer), a duration (seconds), and an earliest occurrence time that the event can appear in the list. I need to sort the events based on priority, but no event can appear in the list before its earliest occurrence time. Here's an example to (hopefully) make it clearer:

// Psuedo C# code
class Event { int priority; double duration; double earliestTime ; }

void Example()
{
    Event a = new Event { priority = 1, duration = 4.0, earliestTime = 0.0 };
    Event b = new Event { priority = 2, duration = 5.0, earliestTime = 6.0 };
    Event c = new Event { priority = 3, duration = 3.0, earliestTime = 0.0 };
    Event d = new Event { priority = 4, duration = 2.0, earliestTime = 0.0 };

    // assume list starts at 0.0 seconds
    List<Event> results = Sort( new List<Event> { a, b, c, d } );

    assert( results[ 0 ] == a ); // 4.0 seconds elapsed
    assert( results[ 1 ] == c ); // 7.0 seconds elapsed
    assert( results[ 2 ] == b ); // 12.0 seconds elapsed
    assert( results[ 3 ] == d ); // 14.0 seconds elapsed
}

Item "b" has to come last because it isn't allowed to start until 6.0 seconds into the list, so it is deferred and "c" gets to go before "b" even though its priority is lower. (Hopefully the above explains my problem, if not let me know and I'll edit it.)

My current idea is to use an insertion sort to manage the sorting process. Unlike many of the other common sorting algorithms, insertion sort decides the order of the list one at a time and in order. So for each index I should be able to find the next lowest priority event whose earliest occurrence time will be satisfied.

I'm hoping to find resources about sorting algorithms and data structures to help me design a good solution for this "sort" of problem. My real problem is actually more complex than this: hierarchical sorting, variable buffers between events, multiple non-constant time constraints, so the more information or ideas the better. Speed and space are not really a concern. Accuracy in sorting and maintainability of the code are a concern.

Edit: Clarifications (based on comments)

Edit: Answer

While David Nehme gave the answer I selected, I wanted to point out that his answer is an insertion sorts at heart, and several other people provided insertions sort type answers. This confirms for me that a specialized insertion sort is probably the way to go. Thanks to all of you for your answers.

Upvotes: 18

Views: 2321

Answers (9)

P Daddy
P Daddy

Reputation: 29527

It sounds like you do actually do want a comparison-based sort. Your sort key is {earliestTime, priority}, in that order. Since your example is pseudo-C#, I'll give you a pseudo-C# solution:

class Event : IComparable<Event>, IComparable{
    int    priority;
    double duration;
    double earliestTime;

    public int CompareTo(Event other){
        if(other == null)
            return 1; /* define: non-null > null */

        int cmp = earliestTime.CompareTo(other.earliestTime);
        if(cmp != 0)
            return cmp;

        /* earliestTimes were equal, so move on to next comparison */
        return priority.CompareTo(other.priority);
    }

   int IComparable.CompareTo(object other){ /* for compatibility with non-generic collections */
       if(other == null)
            return 1; /* define: non-null > null */

       Event e_other = other as Event;
       if(e_other == null) /* must have been some other type */
           throw new ArgumentException("Must be an Event", "other");

       return CompareTo(e_other); /* forward to strongly-typed implementation */
   }
}

Now your list will sort just as your asserts expect.

EDIT:

My initial presumption was that events would be picked off the list and handed off to a separate thread, so that the queue manager could fire off the next event on time, but from comments I received, I got the idea that perhaps an approach that was single-threaded, yet still allowed higher-priority events to fire off as close as possible to their start time was more desirable. In that case, the CompareTo function should change as follows:

public int CompareTo(Event other){
    if(other == null)
        return 1; /* define: non-null > null */

    int cmp = priority.CompareTo(other.priority);

    if(cmp == 0)
        /*
         * calculate and compare the time each event will be late
         * if the other one were to start first.  This time may be
         * negative if starting one will not make the other one late
         */
        return (earliestTime + duration - other.earliestTime).CompareTo(
            other.earliestTime + other.duration - earliestTime);

    /*
     * they're different priorities. if the lower-priority event
     * (presume that greater priority index means lower priority,
     * e.g. priority 4 is "lower" priority than priority 1), would
     * would make the higher-priority event late, then order the
     * higher-priority one first.  Otherwise, just order them by
     * earliestTime.
     */
    if(cmp < 0){/* this one is higher priority */
        if(earliestTime <= other.earliestTime)
            /* this one must start first */
            return -1;

        if(other.earliestTime + other.duration <= earliestTime)
            /* the lower-priority event would not make this one late */
            return 1;

        return -1;
    }

    /* this one is lower priority */
    if(other.earliestTime <= earliestTime)
        /* the other one must start first */
        return 1;

    if(earliestTime + duration <= other.earliestTime)
        /* this event will not make the higher-priority one late */
        return -1;

    return 1;
}

Test this against any assumptions, but I think it's what we're looking for.

Upvotes: 0

David Nehme
David Nehme

Reputation: 21572

This is actually more than a sorting problem. It's a single-machine scheduling problem with release dates. Depending on what you are trying to do, the problem might be NP-Hard. For example, if you are trying to mimimize the weighted-sum of the completion times (the weight being inversely proportional to the priority), the the problem is categorized as

1|ri;pmtn|Σ wiCi 

and is NP-hard. There are numerous papers on this topic, but it might be more than what you need.

In your case, you never want a solution with gaps, so what you might just need to do is a simple discrete-event simulation ( O(n log(n)) ) time. You need to store released_jobs as a priority queue.

unreleased_jobs = jobs  // sorted list of jobs, by release date
released_jobs = {}      // priority queue of jobs, by priority
scheduled_jobs = {}     // simple list
while (!unreleased_jobs.empty() || !released_jobs.empty()) {

    while (unreleased_jobs.top().earliestTime  <= t) {
        released_jobs.push(unreleased_jobs.pop())
    }
    if (!released_jobs.empty()) {
       next_job = released_jobs.pop();
       scheduled_jobs.push_back(next_job)
       t = t + next_job.duration
    } else {
       // we have a gap
       t = unreleased_jobs.top().earliestTime
    }
}

One problem is that you might have a low-priority job with a release time just before a short, high-priority job, but it will produce a schedule with the property that there are no gaps (if a schedule with no gaps is possible).

Upvotes: 12

Federico A. Ramponi
Federico A. Ramponi

Reputation: 47075

Here is some Python code along the lines of Douglas's answer. First we sort by priority, then we fit a timeline in a selection-sort fashion:

#!/usr/bin/env python
MIN_PRIORITY = 100

class Event(object):
    def __init__(self, name, priority, duration, earliestTime):
        self.name = name
        self.priority = priority
        self.duration = duration
        self.earliestTime = earliestTime
    def __str__(self):
        return "%-10s:  P %3d  D %3.1f  T %3.1f" % (self.name, self.priority, self.duration, self.earliestTime)

def sortEvents(_events):
    def comparePriority(event1, event2):
        if event1.priority < event2.priority: return -1
        if event1.priority > event2.priority: return 1
        return 0

    # Get a copy of the events and sort by priority
    events = [e for e in _events]
    events.sort(cmp=comparePriority)

    # Select one event at a time, checking for compatibility with elapsed time
    elapsedTime = 0.0
    sortedEvents = []
    while events:
        minGap = events[0].earliestTime - elapsedTime
        for e in events:
            currentGap = e.earliestTime - elapsedTime
            if currentGap < minGap:
                minGap = currentGap
            if currentGap <= 0.0:
                sortedEvents.append(e)
                elapsedTime += e.duration
                events.remove(e)
                break

        # If none of the events fits, add a suitable gap
        if minGap > 0:
            sortedEvents.append( Event("gap", MIN_PRIORITY, minGap, elapsedTime) )
            elapsedTime += minGap
    return sortedEvents

if __name__ == "__main__":
    #e1 = Event("event1", 1, 1.0, 4.0)
    #e2 = Event("event2", 2, 1.0, 6.0)
    #e3 = Event("event3", 3, 1.0, 8.0)
    #e4 = Event("event4", 4, 1.0, 10.0)

    e1 = Event("event1", 1, 4.0, 0.0)
    e2 = Event("event2", 2, 5.0, 6.0)
    e3 = Event("event3", 3, 3.0, 0.0)
    e4 = Event("event4", 4, 2.0, 0.0)

    events = [e1, e2, e3, e4]

    print "Before:"
    for event in events: print event
    sortedEvents = sortEvents(events)
    print "\nAfter:"
    for event in sortedEvents: print event

Upvotes: 0

rmeador
rmeador

Reputation: 25696

I'm not entirely certain I understand the intricacies of your problem, but my instinct tells me you need to define a relationship between priority and start time. The example would be:

Event a = new Event { priority = 1, duration = 4.0, earliestTime = 1.0 };
Event b = new Event { priority = 2, duration = 5.0, earliestTime = 0.0 };

So, do we go ahead and start b at time = 0, or do we wait for a tick and then start a because it's higher priority? Suppose there were more events with more priorities and longer time tradeoffs. I think you need a rule along the lines of "if the next event is X higher priority and the gap (between now and the earliest time) is less than Y seconds, wait and then start the higher priority event. otherwise start the low priority event (thus pushing back the high priority one)".

Upvotes: 0

Douglas Leeder
Douglas Leeder

Reputation: 53310

I think:

  1. Sort tasks by priority
  2. Fit tasks into a time-line, taking the first available slot after their earliestTime, that has a hole big enough for the task.

Convert the time-line into a list a tasks, and waits (for the gaps).

Questions:

  1. Are gaps allowed?
  2. Can tasks be split?
  3. Given the tasks as in the question: is it better to delay b to complete c, or do d so that b can start on time?

Edit:

Os the answers to my questions are:

  1. No (ish - if there is nothing to run I guess we could have a gap)
  2. No
  3. Still not clear, but I guess the example suggests run c and delay b.

In this case the algorithm might be:

  1. Sort by priority
  2. Keep a counter for the current 'time' starting with t=0
  3. Search though the sorted list, for the highest priority item that can be started at t.
  4. Add the item to the running order, and add its duration to t.
  5. Repeat 3&4 till the list is exhausted. If there are no tasks runnable at t, and there are tasks remaining pending, stick a 1-second sleep task in the running order.

This algorithm is also O(n^2).

Upvotes: 3

Federico A. Ramponi
Federico A. Ramponi

Reputation: 47075

Incidentally, in the most general case there may be no solution (unless gaps are allowed, as Douglas pointed out). For example:

Event a = new Event { priority = 1, duration = 1.0, earliestTime = 4.0 };
Event b = new Event { priority = 2, duration = 1.0, earliestTime = 4.0 };
Event c = new Event { priority = 3, duration = 1.0, earliestTime = 4.0 };
Event d = new Event { priority = 4, duration = 1.0, earliestTime = 4.0 };

Upvotes: 0

AShelly
AShelly

Reputation: 35520

If you have a limited set of priority levels, you could keep a set of time-sorted lists, 1 for each level. Whenever you need the next event, check the head of each list in priority order until you find one whose start time has passed. (Keep track of the minimum start time while you check - in case no event is ready yet, you know which one to wait for)

Upvotes: 0

jakber
jakber

Reputation: 3569

I think you should sort the list twice: first by priority and then by earliest time, using any stable sort algorithm, for instance insertion sort. That way the time will be increasing and for each time things will be sorted by priority.

Unless you see something I don't you can completely ignore the duration of each event for the purpose of the sort.

http://en.wikipedia.org/wiki/Category:Stable_sorts

Upvotes: 0

Konrad Rudolph
Konrad Rudolph

Reputation: 545528

In other words, you want to optimize the overall running time while formulating two constraints (strong: earliest point of execution, weak: priority)? This is called a constraint satisfaction problem. There are special solvers for this kind of problem.

Incidentally, jakber's solution doesn't work. Even without the duration, the following example obviously fails:

event a (priority = 1, start = 5)
event b (priority = 2, start = 0)

The sorted sequence would be a, b while the wanted outcome is surely b, a.

Upvotes: 2

Related Questions