Homer Xing

Reputation: 2861

How to join two generators (or other iterables) in Python?

I want to change the following code

for directory, dirs, files in os.walk(directory_1):
    do_something()

for directory, dirs, files in os.walk(directory_2):
    do_something()

to this code:

for directory, dirs, files in os.walk(directory_1) + os.walk(directory_2):
    do_something()

I get the error:

unsupported operand type(s) for +: 'generator' and 'generator'

How can I join two generators in Python?

Upvotes: 286

Views: 159479

Answers (15)

Alexey

Reputation: 4071

Here it is using a generator expression with nested fors:

range_a = range(3)
range_b = range(5)
result = ( item
           for one_range in (range_a, range_b)
           for item in one_range )
assert list(result) == [0, 1, 2, 0, 1, 2, 3, 4]

The for ... in ... clauses are evaluated left to right, and the identifier after each for introduces a new variable. one_range is consumed by the second for ... in ... clause, while item, bound by that second clause, is used in the single output expression at the very beginning.
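
To make the evaluation order concrete, the generator expression can be unrolled into an ordinary generator function with the same nesting. This is only a sketch; the name flatten_ranges is made up for illustration and is not part of the answer above.

```python
def flatten_ranges(*ranges):
    # Outer "for" clause of the generator expression: pick each range in turn.
    for one_range in ranges:
        # Inner "for" clause: walk the items of the range that was just picked.
        for item in one_range:
            # The leading output expression of the generator expression.
            yield item


range_a = range(3)
range_b = range(5)
unrolled = list(flatten_ranges(range_a, range_b))
print(unrolled)  # [0, 1, 2, 0, 1, 2, 3, 4]
```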

Related question: How do I make a flat list out of a list of lists?.

Upvotes: 15

manjunath kallannavar

Reputation: 606

If you would like to get lists of file paths from the known directories before and after, you can do this:

import os

after_flist, before_flist = [], []  # one entry per walked directory: its JSON paths
for r, d, f in os.walk(current_dir):
    for name in d:
        if name == 'after':
            after_dir = os.path.abspath(os.path.join(current_dir, name))
            # Separate loop variables so the outer r, d, f are not overwritten.
            for sub_r, _, sub_f in os.walk(after_dir):
                after_flist.append([os.path.join(sub_r, fn) for fn in sub_f if fn.endswith('json')])
        elif name == 'before':
            before_dir = os.path.abspath(os.path.join(current_dir, name))
            for sub_r, _, sub_f in os.walk(before_dir):
                before_flist.append([os.path.join(sub_r, fn) for fn in sub_f if fn.endswith('json')])

I know there are better answers; this just felt like simple code to me.

Upvotes: 0

Tatarize

Reputation: 10806

You can put any generator into a list, and while you can't concatenate generators with +, you can concatenate lists. The downside is that you actually create three lists in memory; the upside is that this is very readable, requires no imports, and is a single-line idiom.

Solution for the OP.

for directory, dirs, files in list(os.walk(directory_1)) + list(os.walk(directory_2)):
    do_something()

a = range(20)
b = range(10, 99, 3)
for v in list(a) + list(b):
    print(v)

Upvotes: -1

andrew pate

Reputation: 4299

With itertools.chain.from_iterable you can do things like:

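import itertools

def genny(start):
    for x in range(start, start + 3):
        yield x

y = [1, 2]
# Flatten the generators produced for each element of y into one list.
ab = list(itertools.chain.from_iterable(genny(x) for x in y))
print(ab)  # [1, 2, 3, 2, 3, 4]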

Upvotes: 13

Luca Di Liello

Reputation: 1643

I would say that, as suggested in comments by user "wjandrea", the best solution is

def concat_generators(*gens):
    for gen in gens:
        yield from gen

It does not change the returned type and is really Pythonic.
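
As a rough usage sketch, the following shows that the result stays lazy: each input generator only starts running when iteration actually reaches it. The helper numbers and the list started are hypothetical names introduced purely for this illustration.

```python
def concat_generators(*gens):
    for gen in gens:
        yield from gen


started = []  # records which input generators have begun running


def numbers(label, count):
    # Hypothetical helper: notes when its body first executes.
    started.append(label)
    for i in range(count):
        yield (label, i)


combined = concat_generators(numbers("a", 2), numbers("b", 2))
first = next(combined)
started_after_first = list(started)  # only "a" has started so far
rest = list(combined)
print(first, started_after_first, rest, started)
```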

Upvotes: 2

user5994461

Reputation: 7088

2020 update: Works in both Python 3 and Python 2.

import itertools

iterA = range(10,15)
iterB = range(15,20)
iterC = range(20,25)

First option:

for i in itertools.chain(iterA, iterB, iterC):
    print(i)

# 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Alternative option, introduced in Python 2.6:

for i in itertools.chain.from_iterable( [iterA, iterB, iterC] ):
    print(i)

# 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

itertools.chain() is the basic form.

itertools.chain.from_iterable() is handy if you have an iterable of iterables, for example a list of files per subdirectory such as [ ["src/server.py", "src/readme.txt"], ["test/test.py"] ].
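
A small sketch of that case, using the example list from the sentence above; the variable names files_per_dir, all_files and same_files are just illustrative.

```python
from itertools import chain

files_per_dir = [["src/server.py", "src/readme.txt"], ["test/test.py"]]

# from_iterable takes the outer iterable as a single argument and flattens one level.
all_files = list(chain.from_iterable(files_per_dir))

# chain(*files_per_dir) yields the same items, but the outer iterable
# is unpacked in full before iteration starts.
same_files = list(chain(*files_per_dir))

print(all_files)
```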

Upvotes: 8

Philipp

Reputation: 49822

itertools.chain() should do it. It takes multiple iterables and yields from each one by one, roughly equivalent to:

def chain(*iterables):
    for it in iterables:
        for element in it:
            yield element

Usage example:

from itertools import chain

g = (c for c in 'ABC')  # Dummy generator, just for example
c = chain(g, 'DEF')  # Chain the generator and a string
for item in c:
    print(item)

Output:

A
B
C
D
E
F
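
Applied to the question's os.walk case, this might look like the sketch below. The temporary directories and the file names a.txt and b.txt are placeholders created only so that the example is self-contained; in real code directory_1 and directory_2 would be your own paths.

```python
import os
import tempfile
from itertools import chain

with tempfile.TemporaryDirectory() as directory_1, \
        tempfile.TemporaryDirectory() as directory_2:
    # Put one placeholder file in each directory so there is something to find.
    for folder, name in ((directory_1, "a.txt"), (directory_2, "b.txt")):
        with open(os.path.join(folder, name), "w"):
            pass

    found = []
    # A single loop over both walks, in order: directory_1 first, then directory_2.
    for directory, dirs, files in chain(os.walk(directory_1), os.walk(directory_2)):
        found.extend(files)

print(found)
```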

Upvotes: 378

Camion

Reputation: 1374

If you just need to do it once and do not wish to import another module, there is a simple solution. Just do:

for dir in directory_1, directory_2:
    for directory, dirs, files in os.walk(dir):
        do_something()

If you really want to "join" both generators, then do:

for directory, dirs, files in (
        x for osw in [os.walk(directory_1), os.walk(directory_2)] 
               for x in osw
        ):
    do_something()

Upvotes: -2

Milosz

Reputation: 3074

(Disclaimer: Python 3.5+ only!)

Something with syntax similar to what you want is to use the splat operator to expand the two generators:

for directory, dirs, files in (*os.walk(directory_1), *os.walk(directory_2)):
    do_something()

Explanation:

This effectively performs a single-level flattening of the two generators into an N-tuple of 3-tuples (from os.walk) that looks like:

((directory1, dirs1, files1), (directory2, dirs2, files2), ...)

Your for-loop then iterates over this N-tuple.

Of course, by simply replacing the outer parentheses with brackets, you can get a list of 3-tuples instead of an N-tuple of 3-tuples:

for directory, dirs, files in [*os.walk(directory_1), *os.walk(directory_2)]:
    do_something()

This yields something like:

[(directory1, dirs1, files1), (directory2, dirs2, files2), ...]

Pro:

The upside to this approach is that you don't have to import anything and it's not a lot of code.

Con:

The downside is that you dump two generators into a collection and then iterate over that collection, effectively doing two passes and potentially using a lot of memory.
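
The difference can be observed with a small experiment. This is a sketch under the assumption that a generator which records every item it produces stands in for os.walk; the names noisy and produced are invented for the demonstration, and itertools.chain is used only as a lazy point of comparison.

```python
from itertools import chain

produced = []  # every item a generator has actually computed so far


def noisy(label, count):
    # Hypothetical generator that logs each item at the moment it is produced.
    for i in range(count):
        produced.append((label, i))
        yield i


# Unpacking runs both generators to completion before the loop could even start.
eager = (*noisy("a", 2), *noisy("b", 2))
eager_count = len(produced)

produced.clear()

# A lazy chain produces nothing until items are requested.
lazy = chain(noisy("a", 2), noisy("b", 2))
lazy_count_before = len(produced)
first = next(lazy)
lazy_count_after = len(produced)

print(eager_count, lazy_count_before, lazy_count_after)
```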

Upvotes: 2

sol25

Reputation: 149

One can also use the unpacking operator *:

concat = (*gen1(), *gen2())

NOTE: This works most efficiently for non-lazy iterables, because unpacking consumes each iterable in full. It can also be used with different kinds of comprehensions. The preferred way to concatenate generators is the one in the answer from @Uduse.

Upvotes: 4

Uduse

Reputation: 1581

In Python 3.3 or greater (the version that introduced yield from) you can do:

def concat(a, b):
    yield from a
    yield from b

Upvotes: 95

user1767754

Reputation: 25094

Simple example:

from itertools import chain
x = iter([1, 2, 3])     # Create an iterator (a list_iterator, not strictly a generator)
y = iter([3, 4, 5])     # Another one
result = chain(x, y)    # Lazily chains x and y

Upvotes: 41

Mahdi Ghelichi

Reputation: 1160

Let's say that we have two generators (gen1 and gen2) and we want to perform some extra calculation that requires the outcome of both. We can return the outcome of such a function/calculation through map, which in turn returns an iterator that we can loop over.

In this scenario, the function/calculation needs to be implemented via the lambda function. The tricky part is what we aim to do inside the map and its lambda function.

General form of proposed solution:

def function(gen1, gen2):
    # map pulls one item from each generator per step and stops at the shorter one.
    for item in map(lambda x, y: do_something(x, y), gen1, gen2):
        yield item

Upvotes: 0

DivideByZero

Reputation: 151

If you want to keep the generators separate but still iterate over them at the same time you can use zip():

NOTE: Iteration stops at the shorter of the two generators

For example:

for (root1, dir1, files1), (root2, dir2, files2) in zip(os.walk(path1), os.walk(path2)):
    for file in files1:
        pass  # do something with first list of files

    for file in files2:
        pass  # do something with second list of files
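
If stopping at the shorter generator is not what you want, itertools.zip_longest from the standard library keeps going and pads the exhausted side with a fill value (None by default). The sketch below uses two toy generators in place of os.walk; make_short and make_long are names invented for the example.

```python
from itertools import zip_longest


def make_short():
    # Toy stand-in for the generator that runs out first.
    return (n for n in range(2))


def make_long():
    # Toy stand-in for the longer generator.
    return (c for c in "abcd")


# zip stops as soon as the shorter input is exhausted.
pairs_zip = list(zip(make_short(), make_long()))

# zip_longest continues and fills the missing side with None.
pairs_longest = list(zip_longest(make_short(), make_long()))

print(pairs_zip)
print(pairs_longest)
```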

Upvotes: 2

Cesio

Reputation: 1187

An example:

from itertools import chain

def generator1():
    for item in 'abcdef':
        yield item

def generator2():
    for item in '123456':
        yield item

generator3 = chain(generator1(), generator2())
for item in generator3:
    print(item)

Upvotes: 117
