Meni
Meni

Reputation: 1129

Fastest way to count list values below a threshold

Is there list method equivalent to numpy.count_nonzero(lst < t)? when I use it on list (lst < t) just returns True instead of list of booleans. I want to count list values below some threshold what is better - converting to numpy-array, using sort, some kind of list/generator comprehension or something else?

Upvotes: 2

Views: 3712

Answers (3)

Erwin Mayer
Erwin Mayer

Reputation: 18670

You can use the cardinality package for this:

Usage:

>>> import cardinality
>>> cardinality.count(i for i in range(500) if i > 499)
1

The actual count() implementation is as follows:

def count(iterable):
    if hasattr(iterable, '__len__'):
        return len(iterable)

    d = collections.deque(enumerate(iterable, 1), maxlen=1)
    return d[0][0] if d else 0

Upvotes: 0

shx2
shx2

Reputation: 64298

Sorting is not recommended as it is O(N*logN), where all other soloutions are simply O(N).

You can use a generator expression and a generator-len function, like this:

n = iterlen( x for x in lst if x < t )

This is better than list-comprehension becuase you don't need to construct the temporary list (of which you take the len), which takes up both time and memory.

Depending on the details of the problem (list size, element type), converting to a numpy array might prove faster. You should time both approaches, and see which one works best in your case.

Of course, the best solution, if possible, would be to represent the list as a numpy array to begin with. If you do that, numpy.count_nonzero(lst < t) is almost certain to be fastest.

Or, if you can build a sorted list to begin with, you can easily implement a count_less function using bisect. The complexity here is O(logN), which would be the fastest for large lists.

Upvotes: 5

Tom
Tom

Reputation: 1326

c will be the count of all items in a list (lst) that are below value t:

c = len([i for i in lst if i < t])

Upvotes: 2

Related Questions