mikeLundquist

Reputation: 1009

How can I generate unique itertools chains?

For example, what's the itertools.chain() equivalent of:

set.union({1,2,3},{3,4,2,5},{1,6,2,7})

(The itertools version would obviously return a generator rather than a set.)

Upvotes: 2

Views: 4210

Answers (3)

remykarem

Reputation: 2469

There are 3 ways you can do this:

  1. You can use unique_everseen from the more-itertools package, as recommended in the recipes section of the Python itertools documentation.

  2. Also, if you scroll down to the recipes section of the itertools documentation, you'll find Python's own recipe for unique_everseen:

    from itertools import filterfalse

    def unique_everseen(iterable, key=None):
        "List unique elements, preserving order. Remember all elements ever seen."
        # unique_everseen('AAAABBBCCDAABBB') --> A B C D
        # unique_everseen('ABBCcAD', str.lower) --> A B C D
        seen = set()
        seen_add = seen.add
        if key is None:
            for element in filterfalse(seen.__contains__, iterable):
                seen_add(element)
                yield element
        else:
            for element in iterable:
                k = key(element)
                if k not in seen:
                    seen_add(k)
                    yield element
    
  3. Interestingly, you can also find this function in importlib_metadata._itertools.unique_everseen.

    >>> from importlib_metadata._itertools import unique_everseen
    >>> list(unique_everseen('AAAABBBCCDAABBB'))
    ['A', 'B', 'C', 'D']
    

    However, I don't think it's meant for public use: the leading underscore in _itertools marks it as a private module, by Python convention.
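None of the three options above shows the call combined with chain(), so here is a small sketch of that. To stay self-contained it uses a condensed stand-in for the recipe (a single loop instead of the filterfalse fast path), and the sample data is made up for illustration. Lists are used instead of sets so the order is predictable.

```python
from itertools import chain

def unique_everseen(iterable, key=None):
    # Condensed stand-in for the itertools recipe: one loop handles both cases.
    seen = set()
    for element in iterable:
        k = element if key is None else key(element)
        if k not in seen:
            seen.add(k)
            yield element

# Deduplicate across several iterables by chaining them first.
merged = list(unique_everseen(chain([1, 2, 3], [3, 4, 2, 5], [1, 6, 2, 7])))
print(merged)  # [1, 2, 3, 4, 5, 6, 7]

# With a key function, uniqueness is decided on key(element), e.g. case-insensitively.
words = list(unique_everseen(chain(["Ab", "cd"], ["AB", "Cd", "ef"]), key=str.lower))
print(words)  # ['Ab', 'cd', 'ef']
```

The first occurrence of each item wins, so the result follows the order in which the chained iterables produce their items.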

Upvotes: 1

acushner

Reputation: 9946

You can do something like this:

from itertools import chain

def chain_unique(*args):
    seen = set()
    yield from (v for v in chain(*args) if v not in seen and not seen.add(v))
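
To spell out why that one-liner works, here is a minimal sketch (the sample lists are mine, not from the question). set.add() always returns None, so `not seen.add(v)` is always true, and because `and` short-circuits, it only runs when v has not been seen yet. Lists are used instead of sets so the output order is predictable.

```python
from itertools import chain

def chain_unique(*args):
    seen = set()
    # `v not in seen` rejects duplicates; `not seen.add(v)` records v and is always True
    yield from (v for v in chain(*args) if v not in seen and not seen.add(v))

result = list(chain_unique([1, 2, 3], [3, 4, 2, 5], [1, 6, 2, 7]))
print(result)  # [1, 2, 3, 4, 5, 6, 7]
```

It is compact, but it relies on a side effect inside a condition, so the explicit loop in the other answers is arguably easier to read.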

Upvotes: 0

schesis

Reputation: 59148

There isn't anything in itertools which will do this for you directly.

In order to avoid yielding duplicate items, you'll need to keep track of what you've already yielded, and the obvious way to do so is with a set. Here's a simple wrapper around itertools.chain() which does that:

from itertools import chain

def uniq_chain(*args):
    seen = set()
    # chain() takes no keyword arguments, so only positional iterables are forwarded
    for x in chain(*args):
        if x in seen:
            continue
        seen.add(x)
        yield x

... and here it is in action:

>>> list(uniq_chain(range(0, 20, 5), range(0, 20, 3), range(0, 20, 2)))
[0, 5, 10, 15, 3, 6, 9, 12, 18, 2, 4, 8, 14, 16]

Alternatively, if you prefer to compose a solution from smaller building blocks (which is a more flexible and "itertoolsy" way to do it), you could write a general purpose uniq() function and combine it with chain():

def uniq(iterable):
    seen = set()
    for x in iterable:
        if x in seen:
            continue
        seen.add(x)
        yield x

In action:

>>> list(uniq(chain(range(0, 20, 5), range(0, 20, 3), range(0, 20, 2))))
[0, 5, 10, 15, 3, 6, 9, 12, 18, 2, 4, 8, 14, 16]
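
One thing worth noting, as a sketch rather than part of the answer above: because uniq() is a generator, it only pulls items as they are requested, so it also works when one of the chained iterables is infinite. The inputs below are made up for illustration.

```python
from itertools import chain, count, islice

def uniq(iterable):
    seen = set()
    for x in iterable:
        if x in seen:
            continue
        seen.add(x)
        yield x

# count(0) never ends, so list(uniq(...)) would hang; islice stops after 5 unique items
first_five = list(islice(uniq(chain([3, 1, 3], count(0))), 5))
print(first_five)  # [3, 1, 0, 2, 4]
```

The trade-off is that seen grows with every distinct item, and the items must be hashable.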

Upvotes: 8
