user42493

Reputation: 1103

How to remove specific elements from a set while iterating over its elements?

I have a list of tuples of the form (s, r, o), that is (subject, relation type, object), representing relational facts. I want to write a function that removes one of (s, r, o) and (o, r, s) if both are in the list.

Here is what I tried:

def delete_symmetric_relations(A):
    A = set(tuple(e) for e in A)
    for (s,r,o) in A:
        for (s1, r1, o1) in A:
            if (s,r,o)==(o1,r1,s1) and (s,r,o) != (s1,r1,o1):
                A.remove((s1,r1,o1))
    return list(A)

print(delete_symmetric_relations(data)) 

I then get the error: RuntimeError: Set changed size during iteration

Example of how the method should work: given the list [(1,"in_same_numbersystem_as",3),(2,"is_smaller_than",4),(3,"in_same_numbersystem_as",1),(2,"is_smaller_than",6)], the method should return either [(2,"is_smaller_than",4),(3,"in_same_numbersystem_as",1),(2,"is_smaller_than",6)] or [(1,"in_same_numbersystem_as",3),(2,"is_smaller_than",4),(2,"is_smaller_than",6)].

Following the suggestion, I rewrote the code as:

def delete_symmetric_relations(A):
    somelist = [(s,r,o) for (s,r,o) in A if (o,r,s) not in A]
    return somelist

But this code removes both (s,r,o) and (o,r,s), whereas I want to retain one of them. I also got:

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`

This happens because my list is very large.

So how can I do it?

Upvotes: 0

Views: 66

Answers (2)

mohammed wazeem

Reputation: 1328

You can sort each tuple in the list and pass the result to set(), which removes the duplicates:

>>> data = [(0,1,7), (5,1,3), (7,1,0), (0,7,1)]  # sample input

>>> list(set(map(lambda x: tuple(sorted(x)), data)))
[(1, 3, 5), (0, 1, 7)]

Note: The above solution works only if all elements of a tuple are of the same type, so that they can be compared with each other. If your tuples contain a mix of types, convert every element to a string before passing the tuple to sorted():

>>> data = [(0, 1, 7, 'b'), (5, 1, 3, 'a'), (7, 1, 0, 'b'), (0, 1, 7, 'b')]
>>> list(set(map(lambda x: tuple(sorted(map(str, x))), data)))
[('1', '3', '5', 'a'), ('0', '1', '7', 'b')]
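
Sorting the whole tuple also moves the relation out of the middle position, so (0, 1, 7) and (0, 7, 1) collapse into one entry in the first sample. If the relation has to stay in place and the kept tuples should remain unchanged, one variant of the same idea is to normalize only the subject/object pair. A minimal sketch, assuming every element is hashable; `dedupe_symmetric`, `seen` and `kept` are names made up for this illustration, and exact duplicates are dropped as a side effect:

```python
def dedupe_symmetric(facts):
    """Keep the first of each (s, r, o) / (o, r, s) pair, preserving order."""
    seen = set()   # keys of the facts kept so far
    kept = []
    for s, r, o in facts:
        # frozenset ignores the order of s and o, so (s, r, o) and (o, r, s)
        # share one key, while the relation r stays a separate component
        key = (frozenset((s, o)), r)
        if key not in seen:
            seen.add(key)
            kept.append((s, r, o))
    return kept


facts = [
    (1, "in_same_numbersystem_as", 3),
    (2, "is_smaller_than", 4),
    (3, "in_same_numbersystem_as", 1),
    (2, "is_smaller_than", 6),
]
result = dedupe_symmetric(facts)
print(len(result))
print(result[:10])
```

Each membership test on `seen` is a hash lookup, so this makes a single pass over the list. Printing the length and a slice rather than the whole result keeps notebook output small when the list is very large.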

Upvotes: 1

gnodab

Reputation: 878

Update: I misunderstood the question originally. The basic concept still stands. Don't try to change a list you are looping over. Instead, make a copy to mutate. Then loop over the original list. You can make whatever comparison you need.

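def remove_symetric(A):
    B = list(A)  # real copy; `B = A` would only alias the same list
    for (a, b, c) in A:
        # drop one mirror image, but only while the tuple itself is still kept
        if (a, b, c) != (c, b, a) and (a, b, c) in B and (c, b, a) in B:
            B.remove((c, b, a))
    return B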

A = [(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3),(3, 1, 0)]
A=remove_symetric(A)
print("Non-duplicate items:")
print(A)

Output:

Non-duplicate items:
[(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3)]
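
The same "don't mutate what you loop over" idea can also be applied to the set version from the question, which is where the RuntimeError came from. A sketch under that assumption, not necessarily the only fix: loop over a snapshot made with list(...) and remove from the set itself. The name `remaining` is made up for this illustration.

```python
def delete_symmetric_relations(facts):
    remaining = set(tuple(e) for e in facts)
    # iterate over a snapshot, so removing from `remaining` is safe
    for s, r, o in list(remaining):
        # skip self-symmetric facts (s == o) and facts already removed
        if s != o and (s, r, o) in remaining and (o, r, s) in remaining:
            remaining.remove((o, r, s))
    return list(remaining)


data = [
    (1, "in_same_numbersystem_as", 3),
    (2, "is_smaller_than", 4),
    (3, "in_same_numbersystem_as", 1),
    (2, "is_smaller_than", 6),
]
result = delete_symmetric_relations(data)
print(len(result))
```

Because a set has no guaranteed order, which of the two mirrored tuples survives may differ between runs; exactly one of them is kept.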

Original answer:

Instead of removing duplicates, try adding each item to an empty list only if it has not been added yet. Something like this:

def return_unique(A):
    B = []
    for x in A:
        if x not in B:
            B.append(x)
    return B

Test like so:

A = [(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3)]
B = return_unique(A)
print('Non-duplicate items:')
print(B)

Output:

Non-duplicate items:
[(0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3)]

Upvotes: 1
