Reputation: 161
In python, given a list mylist
of lists el
of integers, I would like to remove duplicates that are equivalent under specific permutations of the indices.
The question is more general but I have in mind "decorated" McKay graphs where each node is given an integer defining el
whose sum is equal to a certain number. Generating mylist
is easy enough, but there is a lot of redundancy depending on the symmetry of the graph. For instance, for the "D4" diagram, el
represents the following decorated graph:
a
|
el = [a, b, c, d, e] <--> b-c―d
|
e
The two lists [1,0,4,0,0]
and [0,0,4,0,1]
are duplicates because there are the same up to a reflection of the graph around the horizontal axis. More generally, the lists el
duplicates if there are equivalent under any permutation of a,b,d,e
. Is there a simple way of removing duplicates given an arbitrary symmetry?
The way I did it for this particular graph was to sort the indices 0,1,3,4
and compare:
from itertools import product
def check_duplicate(f, li):
for l in li:
if l[2] == f[2] and sorted(l[:2] + l[3:]) == sorted(f[:2] + f[3:]):
return True
return False
mylist = [list(i) for i in product(range(5), repeat=5) if sum(i) == 5]
newlist = []
for f in my_list:
if check_duplicate(f, newlist) == False:
newlist.append(f)
Here tmp
is generated brute force, but the condition is more involved in my real case.
This works well enough for this particular example, but is a bit clunky, and the implementation is harder to generalize to more involved cases. Is there a way to remove the duplicates in a more optimized way, in particular one that can easily implement removing duplicates given a particular symmetry of the indices of el
?
Upvotes: 1
Views: 40
Reputation: 101
It will probably be better to add "seen" cases to a set, and then continue to check if new elements have already been "seen", rather than comparing every element to every other element in the list iteratively (an operation with O(n^2) time complexity). Hopefully I understand what you're trying to accomplish, but here's how I'd do it:
from itertools import product
def to_key(f):
# Creates a hashable key, keeping order-insensitive values sorted, and noting the middle value"""
return (f[2], tuple(sorted(f[:2] + f[3:])))
mylist = [list(i) for i in product(range(5), repeat=5) if sum(i) == 5]
# Init an empty set. We'll add to it if a key isn't there
seen = set()
# If we add to the seen set, we'll add the element to newList
newlist = []
for f in mylist:
key = to_key(f) # Create the hashable key
if key not in seen: # Check if key in seen
seen.add(key)
newlist.append(f)
Upvotes: 0