Reputation: 131

Fuzzy sorting of intervals using quicksort

Problem 7-6 of Introduction to Algorithms asks the following:

Consider a sorting problem in which we do not know the numbers exactly. Instead, for each number, we know an interval on the real line to which it belongs. That is, we are given n closed intervals of the form [a_i, b_i], where a_i <= b_i. We wish to fuzzy-sort these intervals. (Cormen, Leiserson, Rivest, & Stein, 2009, p. 189)

Demaine and Goldwasser (2004) clarify that "no interval contains any other interval," or that "if a_i <= a_j, then b_i, b_j."

I have implemented the pseudocode from Lanman (2006). Although I think I am very close, the functions do not return a correct result on my test input. My code follows:

def sort_fuzzy(intervals, p, s):
    """
    Given a list whose elements are 2-tuples that represent inclusive intervals,
    sort it in place by partitioning it.  Assume no interval completely contains
    any other interval.
    :param  list intervals: An unsorted list of 2-tuples that represent a \
                            closed interval in which a fuzzy value lies.
    :param  int p:          Starting index of the region to sort.
    :param  int s:          Ending index of the region to sort.
    """

    if p < s:
        x = find_intersection(intervals, p, s)
        r = partition_fuzzy_right(intervals, p, s, x)
        q = partition_fuzzy_left_middle(intervals, p, r, x)
        sort_fuzzy(intervals, p, q - 1)
        sort_fuzzy(intervals, r + 1, s)

def partition_fuzzy_right(intervals, p, s, x):
    """
    Given a list whose elements are 2-tuples that represent inclusive intervals,
    partition it into three regions: p to r - 1, r, and r + 1 to s.
    :param  list intervals: An unsorted list of 2-tuples that represent a \
                            closed interval in which a fuzzy value lies.
    :param  int p:          Starting index of the region to sort.
    :param  int s:          Ending index of the region to sort.
    :param  tuple x:        Intersection of intervals.
    """

    i = p - 1
    for j in range(p, s):
        if intervals[j][0] <= x[0]:
            i += 1
            intervals[i], intervals[j] = intervals[j], intervals[i]

    intervals[i + 1], intervals[s] = intervals[s], intervals[i + 1]

    return i + 1

def partition_fuzzy_left_middle(intervals, p, r, x):
    """
    Given a list whose elements are 2-tuples that represent inclusive intervals,
    partition it into four regions: p to q - 1. q, q + 1 to r, and r + 1 to s.
    :param  list intervals: An unsorted list of 2-tuples that represent a \
                            closed interval in which a fuzzy value lies.
    :param  int p:          Starting index of the region to sort.
    :param  int s:          Ending index of the region to sort.
    :param  tuple x:        Intersection of intervals.
    """

    i = p - 1
    for j in range(p, r):
        if intervals[j][1] < x[1]:
            i += 1
            intervals[i], intervals[j] = intervals[j], intervals[i]

    intervals[i + 1], intervals[r] = intervals[r], intervals[i + 1]

    return i + 1

def find_intersection(intervals, p, s):
    """
    Given a list whose elements are 2-tuples that represent inclusive intervals,
    return the intersection of a pivot interval and the 2-tuples if one exists.
    Otherwise, just return the pivot interval.
    :param  list intervals: An unsorted list of 2-tuples that represent a \
                            closed interval in which a fuzzy value lies.
    :param  int p:          Starting index of the region to sort.
    :param  int s:          Ending index of the region to sort.
    """

    x = intervals[s]

    for i in range(p, s):
        if intervals[i][0] <= x[1] and intervals[i][1] >= x[0]:
            if intervals[i][0] > x[0]:
                x = (intervals[i][0], x[1])
            if intervals[i][1] < x[1]:
                x = (x[0], intervals[i][1])

    return x

if __name__ == '__main__':
    list = [(13, 20), (19, 21), (9, 11), (5, 7), (12, 16), (8, 10), (7, 9), (4, 6), (20, 24), (2, 2), (6, 8), (11, 15)]
    print(list)
    sort_fuzzy(list, 0, len(list) - 1)
    print(list)

Any helps and hints would be very much appreciated. I've been working on this for days.

UPDATE: A more straightforward implementation of the pseudocode in Lanman (2006), that is, splitting the input array of tuples into an A and a B array and adapting from there, did not help. I got the same result.

References:

Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.) [ProQuest Ebook Central version]. Retrieved from https://ebookcentral.proquest.com

Demaine, E., & Goldwasser, S. (2004, February 24). Problem Set 2 [Class handout]. Retrieved from https://courses.csail.mit.edu/6.046/spring04/handouts/ps2-sol.pdf

Lanman, D. R. (2006, March 13). CS 157: Assignment 3 [Class handout]. Retrieved from http://alumni.media.mit.edu/~dlanman/courses/cs157/HW3.pdf

Upvotes: 3

Answers (3)

Kingsley Oyelabi

Reputation: 85

I know it's an old question, but here's my solution. It's written in C too, but it should convey the idea well.

// struct representing an interval
typedef struct interval interval;
struct interval{
    double first;
    double second;
};

// fuzzy-sort the interval [start, end)
void fuzzy_sort(size_t start, size_t end, interval arr[])
{
    size_t x = start;       // head of the < partition + 1
    size_t y = start + 1;   // head of the = partition + 1
    size_t z = end;         // bottom of the > partition

    if(y >= z)
        return;
    // common overlap interval of equal
    // elements starting
    // with arr[start] as pivot
    double p1 = arr[start].first, p2 = arr[start].second;

    while(y < z){
        if(arr[y].second < p1){
            interval hold = arr[y];
            arr[y] = arr[x];
            arr[x] = hold;
            ++x, ++y;
        }
        else if(arr[y].first > p2){
            interval hold = arr[y];
            arr[y ] = arr[z - 1];
            arr[z - 1] = hold;
            --z;
        }
        else{
            // max
            p1 = p1 < arr[y].first ? arr[y].first : p1;
            // min
            p2 = p2 > arr[y].second ? arr[y].second : p2;
            ++y;
        }
    }
    fuzzy_sort(start, x, arr);
    fuzzy_sort(z, end, arr);
}

// return 1 if arr is fuzzy-sorted, else 0
int is_fuzzy_sorted(size_t n, interval arr[])
{
    double last = arr[0].first;
    for(int i = 1; i < n; ++i){
        if(last > arr[i].second)
            return 0;
        last = last < arr[i].first ? arr[i].first : last;
    }
    return 1;
        
}

Mid-execution, the array might look somthing like this

[-less-->|-equals-->|<----unpartitioned---->|<---greater--]

The idea is to partition the array into three regions: those with a mutual overlap interval along with the pivot, those completely less than this interval, and those completely greater than it. The ones with a mutual overlap interval(middle partition) may appear in any order, so the algorithm recursively sorts only the left and right partitions. Also, if there's a common interval, the algorithm makes exactly one pass through the array, as stated in the original question in Cormen et al. (2022).

When run with your input, the code outputs: {(2, 2), (4, 6), (7, 9), (5, 7), (6, 8), (8, 10), (11, 15), (9, 11), (12, 16), (13, 20), (19, 21), (20, 24)} which also is a correct answer.

I only just developed it this morning, and can roughly explain why it works, but no mathematical rigour. Please, feel free to criticize.

Upvotes: 0

michaelgill1969

Reputation: 131

As @tobias_k pointed out, my problem was not understanding the question or what a correct solution looks like. Regarding a correct solution, Cormen et al. (2009) stated, "We wish to fuzzy-sort these intervals, i.e., to produce a permutation of the intervals such that for j = 1, 2, ..., n, there exist c_j ∈ [a_i_j, b_i_j] satisfying c_1 <= c_2 <= ... <= c_n."

So, for the input [(13, 20), (19, 21), (9, 11), (5, 7), (12, 16), (8, 10), (7, 9), (4, 6), (20, 24), (2, 2), (6, 8), (11, 15)], my code outputs [(2, 2), (4, 6), (6, 8), (7, 9), (5, 7), (8, 10), (9, 11), (11, 15), (13, 20), (12, 16), (19, 21), (20, 24)], which is a correct solution.

You see, just as Cormen et al. (2009) wrote, if any number in an interval is greater than or equal to any number in an interval preceding it, it may correctly follow that preceding interval. In other words, consider the following:

2 | 2 ∈ [2, 2] <= 4 | 4 ∈ [4, 6] <= 6 | 6 ∈ [6, 8] <= 7 | 7 ∈ [7, 9] <= 7 | 7 ∈ [5, 7] 8 | 8 ∈ <= [8, 10] <= 9 | 9 ∈ [9, 11] <= 11 | 11 ∈ [11, 15] <= 13 | 13 ∈ [13, 20] <= 13 | 13 ∈ [12, 16] <= 19 | 19 ∈ [19, 21] <= 20 | 20 ∈ [20, 24]

It is not necessary that the left edges be sorted in increasing order, but only that the intervals overlap in some fuzzily increasing order. See page four of Lanman (2006) and really understand what a correct fuzzy-sort is before solving the problem.

References:

Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.) [ProQuest Ebook Central version]. Retrieved from https://ebookcentral.proquest.com

Lanman, D. R. (2006, March 13). CS 157: Assignment 3 [Class handout]. Retrieved from http://alumni.media.mit.edu/~dlanman/courses/cs157/HW3.pdf

Upvotes: 1

ElKamina

Reputation: 7817

Since the intervals don't contain each other, you could either sort by left edge, middle or right edge and you will get the same exact answer.

If you want to make the sorting fuzzy, you could sample the interval (assuming uniform or whatever distribution you want and sort on the sampled values.

Upvotes: 0

Fuzzy sorting of intervals using quicksort

Answers (3)

Related Questions