Reputation: 1103
I have a list of tuples of the form (subject1,relationtype,sobject2), representing relational facts. I want to write a method that removes one of (subject1,relationtype,sobject2) and (subject2,relationtype,sobject1) if both are in the list.
Here is what I tried:
def delete_symmetric_relations(A):
    A = set(tuple(e) for e in A)
    for (s,r,o) in A:
        for (s1, r1, o1) in A:
            if (s,r,o)==(o1,r1,s1) and (s,r,o) != (s1,r1,o1):
                A.remove((s1,r1,o1))
    return list(A)
print(delete_symmetric_relations(data))
I then get the error: RuntimeError: Set changed size during iteration
Example of how the method should work:
Say we have the list [(1,"in_same_numbersystem_as",3),(2,"is_smaller_than",4),(3,"in_same_numbersystem_as",1),(2,"is_smaller_than",6)]. The method should return either [(2,"is_smaller_than",4),(3,"in_same_numbersystem_as",1),(2,"is_smaller_than",6)] or [(1,"in_same_numbersystem_as",3),(2,"is_smaller_than",4),(2,"is_smaller_than",6)].
So, following the suggestion, I rewrote the code as:
def delete_symmetric_relations(A):
    somelist = [(s,r,o) for (s,r,o) in A if (o,r,s) not in A]
    return somelist
But this code removes both (s,r,o) and (o,r,s), whereas I want to retain one of them. I also got:
IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`
This happens because my list is very, very large.
So how can I do it?
Upvotes: 0
Views: 66
Reputation: 1328
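You can sort each tuple inside the list and pass the result to set, which removes the duplicates. Note that sorting ignores element positions, so any two tuples with the same elements in any order, such as (0,1,7) and (0,7,1), are treated as duplicates.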
>>> data = [(0,1,7), (5,1,3), (7,1,0), (0,7,1)] # sample input
>>> list(set(map(lambda x: tuple(sorted(x)), data)))
[(1, 3, 5), (0, 1, 7)]
Note: The above solution works only if all the elements of a tuple are of the same type. If your tuples contain a mix of different types, you need to convert all the elements inside the tuple to str and pass those to sorted.
>>> data = [(0, 1, 7, 'b'), (5, 1, 3, 'a'), (7, 1, 0, 'b'), (0, 1, 7, 'b')]
>>> list(set(map(lambda x: tuple(sorted(map(str, x))), data)))
[('1', '3', '5', 'a'), ('0', '1', '7', 'b')]
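Sorting the whole tuple also moves the relation out of the middle position, so for (subject, relation, object) facts it may be safer to normalize only the two ends. The following is a minimal sketch of that variant, assuming all elements are hashable; the name `dedupe_symmetric` is hypothetical and not part of the answer above. It needs no string conversion for mixed types, but note that it also drops exact duplicates.

```python
def dedupe_symmetric(facts):
    """Keep the first fact seen for each unordered {subject, object} pair and relation."""
    seen = set()
    result = []
    for s, r, o in facts:
        # frozenset ignores order, so (s, r, o) and (o, r, s) share one key
        key = (frozenset((s, o)), r)
        if key not in seen:
            seen.add(key)
            result.append((s, r, o))
    return result


data = [(1, "in_same_numbersystem_as", 3), (2, "is_smaller_than", 4),
        (3, "in_same_numbersystem_as", 1), (2, "is_smaller_than", 6)]
print(dedupe_symmetric(data))
# [(1, 'in_same_numbersystem_as', 3), (2, 'is_smaller_than', 4), (2, 'is_smaller_than', 6)]
```

Unlike full-tuple sorting, this keeps (0, 1, 7) and (0, 7, 1) apart, because the relation stays in its own slot.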
Upvotes: 1
Reputation: 878
Update: I misunderstood the question originally. The basic concept still stands: don't try to change a list while you are looping over it. Instead, make a copy to mutate, then loop over the original list. You can make whatever comparison you need.
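def remove_symetric(A):
    B = list(A)  # real copy to mutate; B = A would only alias the same list
    for (a, b, c) in A:
        # remove the mirror only while the tuple itself is still kept,
        # and never treat a tuple such as (3, 1, 3) as its own mirror
        if a != c and (a, b, c) in B and (c, b, a) in B:
            B.remove((c, b, a))
    return B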
A = [(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3),(3, 1, 0)]
A=remove_symetric(A)
print("Non-duplicate items:")
print(A)
Output:
Non-duplicate items:
[(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3)]
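For a very large list, the `in` and `remove` calls on a list make the loop above quadratic. The same idea can be sketched with a set of already-kept facts, so each lookup is constant time on average. This is only an illustration under the assumption that the tuples are hashable; `remove_symmetric_fast` is a hypothetical name.

```python
def remove_symmetric_fast(facts):
    """Drop a fact when its mirror (o, r, s) has already been kept."""
    kept = set()      # facts kept so far, for fast membership tests
    result = []       # kept facts in their original order
    for s, r, o in facts:
        # s != o guards against a tuple counting as its own mirror
        if s != o and (o, r, s) in kept:
            continue
        kept.add((s, r, o))
        result.append((s, r, o))
    return result


facts = [(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3),
         (0, 7, 3), (0, 7, 3), (3, 1, 0)]
result = remove_symmetric_fast(facts)
print(len(result))   # 7
print(result[:3])    # [(0, 1, 3), (0, 1, 3), (0, 2, 3)]
```

With a very large result, printing the whole list in a notebook can hit the IOPub data rate limit quoted in the question; printing `len(result)` or a slice such as `result[:3]` avoids that.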
Original answer:
Instead of removing duplicates, try adding each item to a blank list if it hasn't been added yet. Something like this:
def return_unique(A):
    B = []
    for x in A:
        if x not in B:
            B.append(x)
    return B
Test like so:
A = [(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3)]
B = return_unique(A)
print('Non-duplicate items:')
print(B)
Non-duplicate items:
[(0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3)]
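The `x not in B` test scans the whole list each time, which gets slow on large inputs. As a possible alternative, assuming the items are hashable and Python 3.7 or later (where dicts keep insertion order), the same order-preserving de-duplication can be sketched with `dict.fromkeys`; the name `return_unique_fast` is hypothetical.

```python
def return_unique_fast(items):
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(items))


items = [(0, 1, 3), (0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3), (0, 7, 3)]
print(return_unique_fast(items))
# [(0, 1, 3), (0, 2, 3), (0, 1, 4), (5, 1, 3), (0, 7, 3)]
```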
Upvotes: 1