Mike Edwards

Reputation: 3771

Python Generator Indexing Optimization

Let's say I have a generator that I want to pull the 10th element from but ignore the first 9. The generator is a function I've written that looks something like this:

def myGenerator(arg1, arg2):
    for i in arg1:
        myState = doSomeWork(i, arg2)
        yield expensiveOperation(myState)

Now I can use it and grab the 10th index out of it like this:

import itertools

myGen = myGenerator(list1, list2)
tenthElement = next(itertools.islice(myGen, 10, 11))  # index 10 (zero-based)

I'm wondering whether this runs expensiveOperation once for every element consumed, or just once. (EDIT: it runs for every element consumed, skipped ones included, but this next part is what I'm interested in.) Is there any way to optimize away the calls to expensiveOperation whose results are discarded?

I can think of several other solutions that don't involve using a generator function and would give exactly what I want, but the syntax isn't as clean as just turning an iterative function into a generator by replacing return with yield.

EDIT: I'm not necessarily trying to solve this specific problem so much as looking for a way to inexpensively "scroll" a generator. In the real case I'm currently working with, I don't actually know which index I want when I call myGenerator for the first time. I may grab the 15th index, then the 27th, then the 82nd. I could probably figure a way to slice to the 15th on the first call, but then I need to scroll 12 more the next time around.

Upvotes: 3

Views: 824

Answers (4)

Ethan Furman

Reputation: 69150

Generators are meant to be consumed one item at a time. While it takes more work to create, what you should be using in your case is an indexable iterable, so that each element is computed only when you ask for it:

class myIterable():
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2
    def __getitem__(self, index):
        myState = doSomeWork(self.arg1[index], self.arg2)
        return expensiveOperation(myState)

myIter = myIterable(list1, list2)
tenthElement = myIter[10]

You'll need to add more code to __getitem__ if you want to support slices and negative indexing.
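As a rough sketch of what that extra code might look like: the version below assumes arg1 is a list (or another sequence with a length) and hands slices to slice.indices() for normalization. The bodies of doSomeWork and expensiveOperation are hypothetical stand-ins, as are the _compute helper and the sample arguments.

```python
# Hypothetical stand-ins for the question's functions.
def doSomeWork(item, arg2):
    return item + arg2

def expensiveOperation(state):
    return state * 2

class myIterable:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1  # assumed to be a sequence, e.g. a list
        self.arg2 = arg2

    def __len__(self):
        return len(self.arg1)

    def _compute(self, index):
        # Hypothetical helper: the per-item work from the original __getitem__.
        return expensiveOperation(doSomeWork(self.arg1[index], self.arg2))

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() resolves missing and negative start/stop/step.
            return [self._compute(i) for i in range(*index.indices(len(self)))]
        # Plain integers: the underlying sequence handles negative indices
        # and raises IndexError when out of range.
        return self._compute(index)

myIter = myIterable([1, 2, 3, 4, 5], 10)
lastElement = myIter[-1]
middle = myIter[1:4]
```

Only the requested indices go through expensiveOperation; nothing is cached, so asking for the same index twice repeats the work.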

Upvotes: 0

Winston Ewert

Reputation: 45059

There is no way for Python to know that the expensive operation can be skipped. For example, it might have side effects that need to happen. So there is no way of fast-forwarding a generator.

One option:

import functools

def myGenerator(arg1, arg2):
    for i in arg1:
        myState = doSomeWork(i, arg2)
        yield functools.partial(expensiveOperation, myState)

The generator now yields a callable object instead of the actual value. To get the actual value, you call the yielded object. Only then is the expensive operation performed.
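A minimal sketch of how the consumer side might look with this approach. The bodies of doSomeWork and expensiveOperation are hypothetical stand-ins, and the calls list exists only to show which items actually went through the expensive step.

```python
import functools
import itertools

calls = []  # records every state expensiveOperation was actually run on

def doSomeWork(item, arg2):       # hypothetical stand-in
    return item + arg2

def expensiveOperation(state):    # hypothetical stand-in
    calls.append(state)
    return state * 2

def myGenerator(arg1, arg2):
    for i in arg1:
        myState = doSomeWork(i, arg2)
        # Defer the expensive step: yield a zero-argument callable.
        yield functools.partial(expensiveOperation, myState)

myGen = myGenerator(range(100), 0)
# Skipping indices 0-9 creates and discards ten partials without calling them.
deferred = next(itertools.islice(myGen, 10, 11))
tenthElement = deferred()  # the expensive step runs here, once
```

Note that doSomeWork still runs for every skipped item; only expensiveOperation is deferred.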

Upvotes: 4

Steven Rumbalski

Reputation: 45552

Let's see what happens:

import itertools

def expensive_operation(x):
    print('expensive operation', x)
    return x

def myGenerator():
    for i in range(1000):
        yield expensive_operation(i)

myGen = myGenerator()
# islice must consume indices 0-9 before it can yield index 10
tenthElement = next(itertools.islice(myGen, 10, 11))
print('tenthElement', tenthElement)

prints

expensive operation 0
expensive operation 1
expensive operation 2
expensive operation 3
expensive operation 4
expensive operation 5
expensive operation 6
expensive operation 7
expensive operation 8
expensive operation 9
expensive operation 10
tenthElement 10

Best would be to decouple expensiveOperation from myGenerator since your code suggests that expensiveOperation does not affect myState.

def myGenerator(arg1, arg2):
    for i in arg1:
        myState = doSomeWork(i, arg2)
        yield myState

Then apply the expensiveOperation only when you want it.

Upvotes: 1

Raymond Hettinger

Reputation: 226524

The generator is isolated from its consumer -- it doesn't know what is being thrown away. So, yes, it does the expensive operation at every step.

I would just move the expensive operation outside the generator:

import itertools

def myGenerator(arg1, arg2):
    for i in arg1:
        myState = doSomeWork(i, arg2)
        yield myState

myGen = myGenerator(list1, list2)
tenthElement = expensiveOperation(next(itertools.islice(myGen,10,11)))
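For the "scrolling" case from the question's edit (index 15, then 27, then 82), one possible sketch is a small wrapper that remembers how far the generator has been consumed and skips forward by the difference. The Scroller class and its get method are hypothetical names, not anything from itertools, and the function bodies are stand-ins.

```python
import itertools

class Scroller:
    """Hypothetical helper: fetch items at increasing absolute indices."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pos = 0  # absolute index of the next item the iterator yields

    def get(self, index):
        if index < self._pos:
            raise IndexError("cannot scroll backwards")
        skip = index - self._pos
        try:
            # Discards `skip` items, then yields the one at `index`.
            item = next(itertools.islice(self._it, skip, skip + 1))
        except StopIteration:
            raise IndexError("generator exhausted") from None
        self._pos = index + 1
        return item

def doSomeWork(item, arg2):       # hypothetical stand-in
    return item + arg2

def expensiveOperation(state):    # hypothetical stand-in
    return state * 2

def myGenerator(arg1, arg2):
    for i in arg1:
        yield doSomeWork(i, arg2)  # cheap state only

scroller = Scroller(myGenerator(range(100), 0))
results = [expensiveOperation(scroller.get(i)) for i in (15, 27, 82)]
```

The generator still does its cheap per-item work for everything skipped, but expensiveOperation runs only three times here.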

Upvotes: 5
