DY92
DY92

Reputation: 437

Unstable behaviour of counting inversions algorithm(python)

First of all, inversions are pairs of numbers, in a disordered list, where the larger of the two numbers is to the left of smaller number. In the following list: [ 1, 3, 5, 2, 4, 6 ] there are 3 inversions: (3,2), (5,2), and (5,4)

Here is my code of counting inversions:

def num_inv(array):
    count = count_left = count_right = 0
    if len(array)<=1:
        return 0

    mid = len(array)//2
    left_array = array[:mid]
    right_array = array[mid:]

    count_left = num_inv(array[:mid])
    count_right = num_inv(array[mid:])


    i = j =  k = 0

    while i < len(left_array) and j < len(right_array):
        if left_array[i] <= right_array[j]:
            array[k] = left_array[i]
            i+=1
        else:
            array[k] = right_array[i]
            j +=1
            count += len(left_array[i:])
        k+=1   


    while i <len(left_array):
        array[k] = left_array[i]
        i+=1
        k+=1


    while j <len(right_array):
        array[k] = right_array[j]
        j+=1
        k+=1


    return count + count_left + count_right

In my test cases(most of them) it gives a correct result, but sometimes it's wrong. Can you please have a look and tell me what is wrong with the code? I spent descent amount of time on debugging, so that I need a fresh look at my code.

Test cases: t4 outputs incorrect result

t1 = [1,3,5,2,4,6]
print("Testing using", t1)
print("Expecting:", 3)
print("Returned:", num_inv(t1))

t2 = [1,5,3,2,4]
print("\nTesting using", t2)
print("Expecting:", 4)
print("Returned:", num_inv(t2))

t3 = [5,4,3,2,1]
print("\nTesting using", t3)
print("Expecting:", 10)
print("Returned:", num_inv(t3))

t4 = [1,6,3,2,4,5]
print("\nTesting using", t4)
print("Expecting:", 5)
print("Returned:", num_inv(t4))

t5 = [1,2,3,4,5,6]
print("\nTesting using", t5)
print("Expecting:", 0)
print("Returned:", num_inv(t5))

Upvotes: 0

Views: 76

Answers (2)

alexis
alexis

Reputation: 50220

To start with, you compare left_array[i] <= right_array[j] in your first while loop, but then you copy from right_array[i]. This has got to be a mistake, but it's not what throws off the counts.

After this correction I added one line to your code, the diagnostic print() statement in the first while loop:

while i < len(left_array) and j < len(right_array):
    if left_array[i] <= right_array[j]:
        array[k] = left_array[i]
        i+=1
    else:
        array[k] = right_array[j]
        j +=1
        print("Counting as inversions of", array[k], ":", left_array[i:])
        count += len(left_array[i:])
    k+=1   

I then ran your test t4, and got the following output:

Testing using [1, 6, 3, 2, 4, 5]
Expecting: 5
Counting as inversions of 3 : [6]
Counting as inversions of 2 : [6, 3]
Counting as inversions of 4 : [6, 3]
Counting as inversions of 5 : [6, 3]
Returned: 7

Get it? When you find one inverted pair, it is incorrect to assume (as your code does) that the rest of left_array consists of inversions (that is, of numbers larger than array[k]). Here, 4 and 5 are not inverted with respect to 3. You'll have to count differently.

Upvotes: 2

AKX
AKX

Reputation: 169184

Something like this seems to work.

(It generates the actual inversions and their indices, not just the count.)

def find_inversions(arr):
    for i, ival in enumerate(arr):
        for j in range(i + 1, len(arr)):
            jval = arr[j]
            if jval < ival:
                yield (i, j), (ival, jval)


for case, expected in (
    ([1, 3, 5, 2, 4, 6], 3),
    ([1, 5, 3, 2, 4], 4),
    ([5, 4, 3, 2, 1], 10),
    ([1, 6, 3, 2, 4, 5], 5),
    ([1, 2, 3, 4, 5, 6], 0),
):
    n = len(list(find_inversions(case)))
    print(case, n, expected)

outputs

[1, 3, 5, 2, 4, 6] 3 3
[1, 5, 3, 2, 4] 4 4
[5, 4, 3, 2, 1] 10 10
[1, 6, 3, 2, 4, 5] 5 5
[1, 2, 3, 4, 5, 6] 0 0

EDIT: Here's another approach that isn't O(N^2). It is about 66% faster for e.g. range(100), and 2x faster for range(500).

def find_inversions_2(arr):
    seen_indices = defaultdict(set)
    for index, value in enumerate(arr):
        seen_indices[value].add(index)
        for seen_val in seen_indices:
            if seen_val > value:
                for seen_index in seen_indices[seen_val]:
                    yield (seen_index, index), (seen_val, value)

Upvotes: 0

Related Questions