Reputation: 45
I have a list of 10,000 custom objects. The important part is that each object has a graph as an attribute (with somewhere between 20 and 500 nodes).
Now I would like to remove all duplicates from the list, where I consider two objects equal if and only if their graphs are isomorphic.
My code looks something like this:
import networkx as nx
#
def remove_duplicates(list_):
    filtered_list = list()
    while len(list_) > 0:
        A = list_.pop()
        filtered_list.append(A)
        for B in list_:
            if A.num_of_nodes == B.num_of_nodes:
                if nx.is_isomorphic(A.graph, B.graph, node_match=node_check):
                    list_.remove(B)
    return filtered_list
However, the program stops making progress at a certain point. I checked the activity monitor: memory does not seem to be the issue, but the CPU might be.
Does someone have a hint on how to solve this a little bit more efficiently/elegantly? For smaller samples, my code worked well.
Upvotes: 3
Views: 152
Reputation: 299
Does this help? From what I understand of your question, I think this method works:
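from collections import Counter

def remove_duplicates(list_):
    # Counter groups elements using their __hash__ and __eq__ methods.
    for item, count in Counter(list_).items():
        # Remove all but one occurrence of each repeated element.
        for _ in range(count - 1):
            list_.remove(item)
    return list_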
Basically, we use the Counter class from the collections module: we count how many times each value occurs in the list and then remove the surplus occurrences.
In your case the code might look like this:
I hope this answers your question.
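One caveat about the Counter approach, as a small sketch: Counter only merges elements whose `__hash__` and `__eq__` agree, so it will not detect duplicates among custom objects unless both are defined. The `Tagged` and `Plain` classes below are hypothetical stand-ins for the custom object, with a string `key` playing the role of the graph.

```python
from collections import Counter


class Tagged:
    """Hypothetical stand-in that defines equality and hashing on `key`."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Tagged) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


class Plain:
    """Hypothetical stand-in that keeps the default identity-based behaviour."""

    def __init__(self, key):
        self.key = key


# Equal Tagged objects collapse into one Counter entry.
counts_tagged = Counter([Tagged("g1"), Tagged("g1"), Tagged("g2")])

# Plain objects are all distinct to Counter, even with identical keys.
counts_plain = Counter([Plain("g1"), Plain("g1"), Plain("g2")])

print(len(counts_tagged), len(counts_plain))
```

For the graph case this means the object would need an `__eq__` based on the isomorphism check and a `__hash__` that is the same for isomorphic graphs.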
Upvotes: 1
Reputation: 810
This will work once you find a property of self.graph that is identical for every isomorphic graph and put it in place of ??? (e.g. something like self.graph.some_isomorphic_characteristic):
import networkx as nx
from typing import List

class custom_object:
    def __init__(self, num_of_nodes, graph):
        self.num_of_nodes = num_of_nodes
        self.graph = graph

    def __eq__(self, other):
        return self.num_of_nodes == other.num_of_nodes and nx.is_isomorphic(self.graph, other.graph, node_match=node_check)

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash((self.num_of_nodes, ???))

def remove_duplicates(list_: List[custom_object]) -> List[custom_object]:
    return list(set(list_))
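As one possible candidate for the ??? placeholder, here is a minimal sketch of a sorted degree sequence, which is the same for any two isomorphic graphs. The helper name `degree_sequence_key` is made up for illustration, and it takes a plain iterable of `(u, v)` edge pairs so that it runs without networkx. Non-isomorphic graphs can still share a degree sequence, so this only reduces how often `__eq__` has to run; it does not replace it.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def degree_sequence_key(edges: Iterable[Tuple[Hashable, Hashable]]) -> Tuple[int, ...]:
    """Return the sorted degrees of all nodes that appear in `edges`.

    Relabelling the nodes does not change the result, so isomorphic
    graphs always produce the same key (the converse does not hold).
    """
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return tuple(sorted(degrees.values()))


# A triangle and a relabelled triangle share a key; a path does not.
triangle = [("a", "b"), ("b", "c"), ("c", "a")]
relabelled = [(1, 2), (2, 3), (3, 1)]
path = [(1, 2), (2, 3)]

print(degree_sequence_key(triangle) == degree_sequence_key(relabelled))
print(degree_sequence_key(triangle) == degree_sequence_key(path))
```

Assuming the graph is a networkx graph, `__hash__` could then return `hash((self.num_of_nodes, degree_sequence_key(self.graph.edges())))`. Recent networkx releases also provide `nx.weisfeiler_lehman_graph_hash`, which is documented to give identical hashes for isomorphic graphs and may separate graphs better than the degree sequence.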
Otherwise, you can compare each object against the ones already kept:
import networkx as nx
from typing import List

class custom_object:
    def __init__(self, num_of_nodes, graph):
        self.num_of_nodes = num_of_nodes
        self.graph = graph

    def __eq__(self, other):
        return self.num_of_nodes == other.num_of_nodes and nx.is_isomorphic(self.graph, other.graph, node_match=node_check)

    def __ne__(self, other):
        return not self.__eq__(other)

def remove_duplicates(list_: List[custom_object]) -> List[custom_object]:
    filtered_list = []
    for obj in list_:
        add = True
        for filtered_obj in filtered_list:
            if obj == filtered_obj:
                add = False
                break
        if add:
            filtered_list.append(obj)
    return filtered_list
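The nested loop above still compares every object with every kept object. A possible refinement, sketched here under my own assumptions, is to group objects by a cheap invariant first and run the expensive comparison only inside each group. The function `dedupe_in_buckets` and its parameters `invariant` and `equivalent` are hypothetical names; the demo uses strings and an anagram check so that it runs without networkx.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def dedupe_in_buckets(
    items: Iterable[T],
    invariant: Callable[[T], Hashable],
    equivalent: Callable[[T, T], bool],
) -> List[T]:
    """Keep the first item of each equivalence class, in input order.

    `invariant` must return the same value for equivalent items, so two
    items in different buckets never need the expensive comparison.
    """
    buckets: Dict[Hashable, List[T]] = defaultdict(list)
    kept: List[T] = []
    for item in items:
        candidates = buckets[invariant(item)]
        # Only items sharing the invariant are compared with `equivalent`.
        if not any(equivalent(item, other) for other in candidates):
            candidates.append(item)
            kept.append(item)
    return kept


# Demo: treat anagrams as duplicates, bucketing by word length.
words = ["abc", "bca", "ab", "xyz", "ba"]
unique = dedupe_in_buckets(words, len, lambda a, b: sorted(a) == sorted(b))
print(unique)
```

For the objects in the question, `invariant` could be something like `lambda o: (o.num_of_nodes, o.graph.number_of_edges())` and `equivalent` could be `lambda a, b: a == b`. This does not make a single isomorphism check any cheaper, and one hard pair can still take a long time, but it should cut down how many pairs get checked at all.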
Upvotes: 1