Reputation: 121
Given a list of frozensets of integers, how can I find the first element which appears above some certain frequency?
There is a condition in an algorithm I am implementing involving hypergraph dualization.
Given two hypergraphs G (implemented as a tuple (G0,G1)) where G0 is the set of vertices and G1 is a list of frozensets(hyperedges) of vertices and similarly for H (it is important to note that G0 = H0 always).
The pseudocode of the algorithm says to find, within the G1 and H1, any x (which is guaranteed to be found because of some theory) in either G1 OR H1 (presumably the fastest one you can find) such that the frequency of x is >= 1/log(|G1|+|H1|).
I did it naively as:
def logOrMore(v,edges,sum):
count = 0
for edge in edges:
if v in edge:
count+=1
if count >= (1/log(sum)):
return True
return False
and in the main algorithm:
...code...
sum = len(G[1])+len(H[1])
x = 0
for v in G[0]:
if logOrMore(v,G[1],sum):
x = v
break
if x ==0:
for v in G[0]:
if logOrMore(v,H[1],sum):
x = v
break
...more code...
This is becoming a big problem when the hypergraphs are huge. How can I do this in the fastest way possible?
An example of a G1 is
G1 = [frozenset({74, 76}), frozenset({73, 74, 29, 30}), frozenset({73, 74, 3, 4}), frozenset({74, 76, 29, 30}), frozenset({16, 73, 74}), frozenset({73, 74})]
but this is a very small case. It can get up to the point where there are more than 1000 frozensets inside the list.
Note: The frozensets of integers are not sorted most of the time
This is in Python 3.6.4 if that helps
Upvotes: 1
Views: 74
Reputation: 490
So as suggested in the comments by @martineau, you can perform the threshold calculation only once at the beginning and store as a variable.
Another possible way of cleaning this up is using package called first
. It's designed to take an iterable and find the first value which matches a condition. In your case, it sounds like you're looking for a value >= your threshold.
pip install first
An example from the documentation, finding the first even number in a list:
from first import first
first([1, 1, 3, 4, 5], key=lambda x: x % 2 == 0)
You could use this to iterate your frozensets to find your condition. However, the rest of your codebase notwithstanding, you may be able to zip()
your G and H lists together and use only one for-loop to check them.
Upvotes: 1