Arkistarvh Kltzuonstev

Reputation: 6920

Removing list items from list of tuples based on second item of tuple

I have a list of tuples:

ap = [('unknown', (81, 717, 236, 562)), ('unknown', (558, 1033, 825, 765)), ('unknown', (96, 1142, 225, 1013)), ('Jenny', (558, 1033, 825, 765)), ('unknown', (477, 1233, 632, 1078)), ('unknown', (741, 1199, 868, 1070)), ('Garry', (53, 282, 182, 153)), ('Sam', (477, 1233, 632, 1078)), ('Chen', (593, 283, 779, 97)), ('Steve', (741, 1199, 868, 1070)), ('unknown', (53, 282, 182, 153)), ('Harry', (81, 717, 236, 562)), ('unknown', (593, 283, 779, 97))]

I want to filter it so that, whenever two tuples share the same second item, the tuple whose first item is not "unknown" is kept and the tuple whose first item is "unknown" is deleted. The output should look like this:

ap = [('Harry',(81, 717, 236, 562)), ('Jenny', (558, 1033, 825, 765)), ('unknown', (96, 1142, 225, 1013)), ('Sam', (477, 1233, 632, 1078)), ('Steve', (741, 1199, 868, 1070)), ('Garry', (53, 282, 182, 153)), ('Chen', (593, 283, 779, 97))]

I tried this code:

for i in ap:
    for j in ap:
        if i[1] == j[1]:
            if i[0] == "unknown":
                del i
            else:
                del j

But it gives this error:

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
NameError: name 'i' is not defined

What's wrong with it?

Upvotes: 1

Views: 1471

Answers (4)

PL200

Reputation: 741

Short and simple answer here: list comprehension

ap = [    ('unknown', (81, 717, 236, 562)), 
          ('unknown', (558, 1033, 825, 765)), 
          ('unknown', (96, 1142, 225, 1013)), 
          ('Jenny', (558, 1033, 825, 765)), 
          ('unknown', (477, 1233, 632, 1078)), 
          ('unknown', (741, 1199, 868, 1070)), 
          ('Garry', (53, 282, 182, 153)), 
          ('Sam', (477, 1233, 632, 1078)), 
          ('Chen', (593, 283, 779, 97)), 
          ('Steve', (741, 1199, 868, 1070)), 
          ('unknown', (53, 282, 182, 153)), 
          ('Harry', (81, 717, 236, 562)), 
          ('unknown', (593, 283, 779, 97))]

known = [my_tuple[1] for my_tuple in ap if my_tuple[0] != "unknown"]
output = [my_tuple for my_tuple in ap if (my_tuple[1] in known and my_tuple[0] != "unknown") or my_tuple[1] not in known]

print(output)

And then the output is:

[('unknown', (96, 1142, 225, 1013)), ('Jenny', (558, 1033, 825, 765)), ('Garry', (53, 282, 182, 153)), ('Sam', (477, 1233, 632, 1078)), ('Chen', (593, 283, 779, 97)), ('Steve', (741, 1199, 868, 1070)), ('Harry', (81, 717, 236, 562))]

What's happening here is that we first use a list comprehension to gather the second element of every tuple whose name isn't "unknown".

A second list comprehension then keeps every tuple whose name isn't "unknown", plus any "unknown" tuple whose second element matches no known name (the genuine unknowns).

That may sound confusing, but hopefully it's clear what I mean. Let me know if you have any questions.
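As a possible variation on the two steps described here (a sketch, not part of the original answer): keeping the known second elements in a set makes each membership test constant time on average, and the filter condition reduces to "keep the tuple if it is named, or if no named tuple shares its second element". The names `known_values` and `filtered` are made up for this example.

```python
ap = [('unknown', (81, 717, 236, 562)), ('unknown', (558, 1033, 825, 765)),
      ('unknown', (96, 1142, 225, 1013)), ('Jenny', (558, 1033, 825, 765)),
      ('unknown', (477, 1233, 632, 1078)), ('unknown', (741, 1199, 868, 1070)),
      ('Garry', (53, 282, 182, 153)), ('Sam', (477, 1233, 632, 1078)),
      ('Chen', (593, 283, 779, 97)), ('Steve', (741, 1199, 868, 1070)),
      ('unknown', (53, 282, 182, 153)), ('Harry', (81, 717, 236, 562)),
      ('unknown', (593, 283, 779, 97))]

# Second elements that belong to at least one named tuple (hypothetical name).
known_values = {value for name, value in ap if name != 'unknown'}

# Keep named tuples, and "unknown" tuples that no named tuple matches.
filtered = [(name, value) for name, value in ap
            if name != 'unknown' or value not in known_values]

print(filtered)
```

The order of the original list is preserved, so the result should match the output shown above.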

Upvotes: 3

toti08

Reputation: 2454

Another approach is to use collections.Counter to build a list of the second items that occur more than once in your original list. You then build a new list that keeps an element if it is not a duplicate or, if it is a duplicate, only when its name is not "unknown":

import collections

# Create a list of elements that are duplicate in the original list
duplicates = [item for item, count in collections.Counter([x[1] for x in ap]).items() if count > 1]


new = []
for elem in ap:
    if elem[1] in duplicates:
        if elem[0] != 'unknown':
            # Copy the duplicate element only if it's not unknown
            new.append(elem)
    else:
        # Not a duplicate: always keep it
        new.append(elem)
print('New list: ', new)

Output is:

New list:  [('unknown', (96, 1142, 225, 1013)), ('Jenny', (558, 1033, 825, 765)), ('Garry', (53, 282, 182, 153)), ('Sam', (477, 1233, 632, 1078)), ('Chen', (593, 283, 779, 97)), ('Steve', (741, 1199, 868, 1070)), ('Harry', (81, 717, 236, 562))]

Upvotes: 0

Tanmay jain

Reputation: 814

del statement

Deletion of a name removes the binding of that name from the local or global namespace, depending on whether the name occurs in a global statement in the same code block. If the name is unbound, a NameError exception will be raised.
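To illustrate this with the loop from the question, here is a minimal sketch (the two-item `ap` list is made-up sample data): `del i` only unbinds the name `i`, the list itself is left untouched, and the next time the inner loop evaluates `i[1]` a NameError is raised.

```python
# Made-up sample data, shortened from the question for illustration.
ap = [('unknown', (1, 2)), ('Jenny', (1, 2))]

error_message = None
try:
    for i in ap:
        for j in ap:
            # On the second pass of the inner loop, 'i' no longer exists.
            if i[1] == j[1]:
                del i  # unbinds the name only; nothing is removed from ap
except NameError as exc:
    error_message = str(exc)

print(len(ap))         # the list still contains both tuples
print(error_message)   # the NameError text
```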

It's better to use a dictionary for this task.

expected = [('Harry',(81, 717, 236, 562)), ('Jenny', (558, 1033, 825, 765)),
('unknown', (96, 1142, 225, 1013)), ('Sam', (477, 1233, 632, 1078)),
('Steve', (741, 1199, 868, 1070)), ('Garry', (53, 282, 182, 153)), ('Chen', (593, 283, 779, 97))]


person_dict = {}

for person_name, person_val in ap:
    if person_val not in person_dict:
        # create the key from the tuple's second item
        person_dict[person_val] = person_name
    # key already exists, so only update its value if it is still 'unknown'
    elif person_dict[person_val] == 'unknown':
        person_dict[person_val] = person_name

ap = [(v, k) for k, v in person_dict.items()]

print(ap == expected) # True (relies on dicts preserving insertion order, Python 3.7+)

Upvotes: 2

SamGhatak

Reputation: 1493

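The NameError comes from del i: it does not remove anything from the list, it only unbinds the loop variable, so the next i[1] in the inner loop fails. Removing items from a list while two loops are iterating over it would also be error-prone. I think it is better to collect the items that should be removed in a new list and delete them after the loop.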

toDelete = []
for i in ap:
    for j in ap:
        if i[1] == j[1] and not ap.index(i) == ap.index(j):
            if i[0] == "unknown":
                toDelete.append(i)
            else:
                toDelete.append(j)

for i in toDelete:
    try:
        ap.remove(i)
    except ValueError:
        # already removed on an earlier pass
        pass

The try/except is there because each element to be removed appears twice in toDelete.

This can be avoided by writing the second loop as:

for j in ap[ap.index(i)+1:]:
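Put together, the version without try/except might look like the sketch below. It uses enumerate to get the position of i instead of ap.index(i), which is a substitution of mine, and it assumes that each second item appears at most twice and that at most one tuple of such a pair is named; `to_delete` is the same list as toDelete above, renamed.

```python
ap = [('unknown', (81, 717, 236, 562)), ('unknown', (558, 1033, 825, 765)),
      ('unknown', (96, 1142, 225, 1013)), ('Jenny', (558, 1033, 825, 765)),
      ('unknown', (477, 1233, 632, 1078)), ('unknown', (741, 1199, 868, 1070)),
      ('Garry', (53, 282, 182, 153)), ('Sam', (477, 1233, 632, 1078)),
      ('Chen', (593, 283, 779, 97)), ('Steve', (741, 1199, 868, 1070)),
      ('unknown', (53, 282, 182, 153)), ('Harry', (81, 717, 236, 562)),
      ('unknown', (593, 283, 779, 97))]

to_delete = []
for index, i in enumerate(ap):
    # Compare only with later items, so each pair is visited exactly once.
    for j in ap[index + 1:]:
        if i[1] == j[1]:
            # Mark the "unknown" member of the pair for removal.
            to_delete.append(i if i[0] == "unknown" else j)

# The list is modified only after both loops have finished.
for item in to_delete:
    ap.remove(item)

print(ap)
```

Because every pair is seen once, each item lands in to_delete a single time and ap.remove never fails on this data.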

Upvotes: 0
