Reputation: 3768

Checking for duplicates in a list of lists and sorting them

I have a table containing:

table = [[5, 7],[4, 3],[3, 3],[2, 3],[1, 3]]

The first value in each inner list (5, 4, 3, 2, 1) is a person's ID, and the second value (7, 3, 3, 3, 3) is a score. I'm trying to detect duplicate values in the second column, which in this case are the 3s. Because four of the lists have 3 as the second value, I now want to sort them by the first value.

In the table, notice that [1,3] has the smallest first value, so it should take the position of [4,3]. Likewise, [2,3] should swap places with [3,3].

Expected output: [[5,7],[1,3],[2,3],[3,3],[4,3]]

I attempted:

def checkDuplicate(arr):
    i = 0
    while (i<len(arr)-1):
        if arr[i][1] == arr[i+1][1] and arr[i][0] > arr[i+1][0]:
             arr[i],arr[i+1] = arr[i+1],arr[i]
        i+=1
    return arr

checkDuplicate(table)

The code doesn't produce the output I wanted, and I would appreciate some help with this.

Upvotes: 3

Answers (3)

FatihAkici

Reputation: 5109

If you want to leave the items with non-duplicate second elements untouched, and to handle cases where several different second values are duplicated, I think you'll need more than the built-in sort.

What my function achieves:

Say your list is: table = [[5, 7], [6, 1], [8, 9], [3, 1], [4, 3], [3, 3], [2, 3], [1, 3]]

It will not touch the items [5, 7] and [8, 9], but will sort the remaining items among the positions occupied by each group that shares a second element. The result will be:

[[5, 7], [3, 1], [8, 9], [6, 1], [1, 3], [2, 3], [3, 3], [4, 3]]

Here is the code:

from collections import Counter

def secondItemSort(table):
    # First get your second values
    secondVals = [e[1] for e in table]
    # The second values that are duplicate
    dups = [k for k,v in Counter(secondVals).items() if v>1]
    # The indices of those duplicate second values
    indices = dict()
    for d in dups:
        for i, e in enumerate(table):
            if e[1]==d:
                indices.setdefault(d, []).append(i)
    # Now do the sort by swapping the items intelligently
    for dupVal, indexList in indices.items():
        sortedItems = sorted([table[i] for i in indexList])
        c = 0
        for i in range(len(table)):
            if table[i][1] == dupVal:
                table[i] = sortedItems[c]
                c += 1
    # And return the intelligently sorted list
    return table

Test

Let's test on a little bit more complicated table:

table = [[5, 7], [6, 1], [8, 9], [3, 1], [4, 3], [3, 9], [3, 3], [2, 2], [2, 3], [1, 3]]

Items that should stay in their places: [5, 7] and [2, 2].

Items that should be swapped:

[6, 1] and [3, 1].
[8, 9] and [3, 9]
[4, 3], [3, 3], [2, 3], [1, 3]

Drumroll...

In [127]: secondItemSort(table)
Out[127]: 
[[5, 7],
 [3, 1],
 [3, 9],
 [6, 1],
 [1, 3],
 [8, 9],
 [2, 3],
 [2, 2],
 [3, 3],
 [4, 3]]
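
For comparison, here is a rough sketch of the same idea that returns a new list instead of modifying the input. The helper name sort_within_duplicate_groups is hypothetical and not part of the function above, and it assumes every row is a two-item list.

```python
from collections import defaultdict

def sort_within_duplicate_groups(table):
    """Return a copy of table in which rows sharing a second value are
    sorted among the positions that group already occupies."""
    # Map each second value to the indexes where it appears.
    positions = defaultdict(list)
    for index, row in enumerate(table):
        positions[row[1]].append(index)

    result = list(table)  # shallow copy; the input list is left unchanged
    for indexes in positions.values():
        if len(indexes) > 1:  # only groups with duplicates need reordering
            ordered_rows = sorted(table[i] for i in indexes)
            for index, row in zip(indexes, ordered_rows):
                result[index] = row
    return result

table = [[5, 7], [6, 1], [8, 9], [3, 1], [4, 3], [3, 9], [3, 3], [2, 2], [2, 3], [1, 3]]
result = sort_within_duplicate_groups(table)
```

Grouping the indexes in one pass avoids scanning the table once per duplicated value, which may matter for larger inputs.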

Upvotes: 0

Olivier Melançon

Reputation: 22294

You can use sorted with a key.

table = [[5, 7], [4, 3], [3, 3], [2, 3], [1, 3]]

# Sorts by second index in decreasing order and then by first index in increasing order
sorted_table = sorted(table, key=lambda x: (-x[1], x[0]))

# sorted_table: [[5, 7], [1, 3], [2, 3], [3, 3], [4, 3]]
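
To see why the tuple key works, it may help to look at the keys themselves. Python compares tuples element by element, so the negated score decides the order first and the ID only matters when scores tie. The names sort_key and keys below are illustrative only.

```python
table = [[5, 7], [4, 3], [3, 3], [2, 3], [1, 3]]

def sort_key(row):
    # Negating the score makes higher scores sort first;
    # the ID then breaks ties in ascending order.
    person_id, score = row
    return (-score, person_id)

# The key computed for each row, in the original order.
keys = [sort_key(row) for row in table]
# keys: [(-7, 5), (-3, 4), (-3, 3), (-3, 2), (-3, 1)]

sorted_table = sorted(table, key=sort_key)
```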

Upvotes: 5

Mad Physicist

Reputation: 114240

You should sort the entire list by the second column, using the first to break ties. This has the advantage of correctly grouping the threes even when the seven is interspersed among them, e.g. something like

table = [[4, 3],[3, 3],[5, 7],[2, 3],[1, 3]]

In Python, you can do it with a one-liner:

result = sorted(table, key=lambda x: (-x[1], x[0]))

If you want an in-place sort, do

table.sort(key=lambda x: (-x[1], x[0]))
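
As a quick illustration of the difference between the two calls, using the interspersed table from above: sorted builds a new list and leaves the original alone, while list.sort reorders the list in place and returns None. The variable names here are just for the example.

```python
table = [[4, 3], [3, 3], [5, 7], [2, 3], [1, 3]]
original = list(table)  # keep a copy to compare against

def key(row):
    # Score descending, then ID ascending.
    return (-row[1], row[0])

result = sorted(table, key=key)         # new list; table is untouched
untouched = table == original           # True at this point

returned = table.sort(key=key)          # sorts in place and returns None
```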

Another neat thing you can do in this situation is to rely on the stability of Python's sorting algorithm. The docs actually suggest doing multiple sorts in complex cases like this, in the reverse order of the keys. Using the functions from operator supposedly speeds up the code as well:

from operator import itemgetter

# Sort by the secondary key (ID) first, then by the primary key (score)
result = sorted(table, key=itemgetter(0))
result.sort(key=itemgetter(1), reverse=True)

The first sort will arrange the IDs in the correct order. The second will sort by score, in descending order, leaving the IDs undisturbed for identical scores since the sort is stable.
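
A small check of the two-pass idea on the interspersed table, showing the intermediate state and comparing the final result with the single tuple-key sort. The variable names by_id, two_pass and one_pass are only for illustration.

```python
from operator import itemgetter

table = [[4, 3], [3, 3], [5, 7], [2, 3], [1, 3]]

# Pass 1: order by the secondary key (ID, ascending).
by_id = sorted(table, key=itemgetter(0))
# by_id: [[1, 3], [2, 3], [3, 3], [4, 3], [5, 7]]

# Pass 2: order by the primary key (score, descending).
# reverse=True keeps the sort stable, so rows with equal
# scores stay in the ID order established by pass 1.
two_pass = sorted(by_id, key=itemgetter(1), reverse=True)

# The single-pass tuple key for comparison.
one_pass = sorted(table, key=lambda row: (-row[1], row[0]))
```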

Upvotes: 1
