Hushus46
Hushus46

Reputation: 121

Find first element in list above certain frequency

Given a list of frozensets of integers, how can I find the first element which appears above some certain frequency?

There is a condition in an algorithm I am implementing involving hypergraph dualization.

Given two hypergraphs G (implemented as a tuple (G0,G1)) where G0 is the set of vertices and G1 is a list of frozensets(hyperedges) of vertices and similarly for H (it is important to note that G0 = H0 always).

The pseudocode of the algorithm says to find, within the G1 and H1, any x (which is guaranteed to be found because of some theory) in either G1 OR H1 (presumably the fastest one you can find) such that the frequency of x is >= 1/log(|G1|+|H1|).

I did it naively as:

def logOrMore(v,edges,sum):
    count = 0
    for edge in edges:
        if v in edge:
            count+=1
    if count >= (1/log(sum)):
        return True
    return False

and in the main algorithm:

...code...
sum = len(G[1])+len(H[1])
x = 0
for v in G[0]:
    if logOrMore(v,G[1],sum):
        x = v
        break
if x ==0:
     for v in G[0]:
        if logOrMore(v,H[1],sum):
            x = v
            break
...more code...

This is becoming a big problem when the hypergraphs are huge. How can I do this in the fastest way possible?

An example of a G1 is

G1 = [frozenset({74, 76}), frozenset({73, 74, 29, 30}), frozenset({73, 74, 3, 4}), frozenset({74, 76, 29, 30}), frozenset({16, 73, 74}), frozenset({73, 74})]

but this is a very small case. It can get up to the point where there are more than 1000 frozensets inside the list.

Note: The frozensets of integers are not sorted most of the time

This is in Python 3.6.4 if that helps

Upvotes: 1

Views: 74

Answers (1)

smallpants
smallpants

Reputation: 490

So as suggested in the comments by @martineau, you can perform the threshold calculation only once at the beginning and store as a variable.

Another possible way of cleaning this up is using package called first. It's designed to take an iterable and find the first value which matches a condition. In your case, it sounds like you're looking for a value >= your threshold.

pip install first

An example from the documentation, finding the first even number in a list:

from first import first
first([1, 1, 3, 4, 5], key=lambda x: x % 2 == 0)

You could use this to iterate your frozensets to find your condition. However, the rest of your codebase notwithstanding, you may be able to zip() your G and H lists together and use only one for-loop to check them.

Upvotes: 1

Related Questions