dmg

Reputation: 103

Recover defining expression for a python generator

Given a generator

g = ( <expr> for x in <iter> ),

is there any way to recover the expression and iterator used to define g?

E.g., a function that would behave like this:

expr, iter = f( ( x*x for x in range(10) ) )
expr(2) # 4
expr(5) # 25
iter[1] # 1
iter[9] # 9
iter[10] # raises IndexError

The reason I want this functionality is that I've made my own LazyList class. I want it to behave essentially like a generator, but with access via __getitem__ that doesn't require iterating through the first k-1 elements to reach the k'th element. Thanks.

Edit: Here's a snapshot of the lazy list class:

class LazyList(object):
  def __init__(self, iter=None, expr=None):
    if expr is None:
      expr = lambda i: i
    if iter is None:
      iter = []
    self._expr = expr
    self._iter = iter

  def __getitem__(self, key):
    if hasattr(self._iter, '__getitem__'):
      return self._expr(self._iter[key])
    else:
      return self._iter_getitem(key)

  def __iter__(self):
    for i in self._iter:
      yield self._expr(i)

I've omitted the method _iter_getitem. All it does is iterate through _iter until it reaches the key'th element (or use itertools.islice if key is a slice). There are also the usual llmap, llreduce, etc. functions, which I've omitted, but you can probably guess how those go.
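The omitted method might look something like the sketch below. This is a guess at the behaviour described above, not the original code: the standalone function name iter_getitem is hypothetical, and returning a list for slices is an assumption.

```python
from itertools import islice

def iter_getitem(iterable, key):
    """Fetch item(s) from an iterable that lacks __getitem__ (hypothetical helper)."""
    if isinstance(key, slice):
        # islice only accepts non-negative start/stop and a positive step
        return list(islice(iterable, key.start, key.stop, key.step))
    if key < 0:
        raise IndexError("negative indices would need the full length")
    # Skip the first `key` items, then return the next one if it exists
    for item in islice(iterable, key, key + 1):
        return item
    raise IndexError("index out of range")
```

Note that this consumes the underlying iterator, so it only gives repeatable results when _iter can be iterated more than once.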

One of my motivations for wanting to be able to decompose generators is so that I can elegantly initialize this class like

l = LazyList(x*x for x in range(10))

instead of

l = LazyList(range(10), lambda x: x*x)

But the real benefit is that this would be, with polish, a nice generalization of the generator concept that could be used in place of any generator (with the same memory-saving benefits).

I'm using this with Django a lot because it works well with their querysets. I have a lot of code that is dependent on the list structures being lazy, because it returns multidimensional arrays that, if evaluated, would fetch way more data than I'd need.

Upvotes: 2

Views: 160

Answers (2)

PaulMcG

Reputation: 63739

I think your concept of a LazyList is good, but your thinking about direct access to a generator's n'th value is flawed. Your example, which uses range(10) as the sequence to iterate over, is a special case in which all values are knowable ahead of time. But many generators are computed incrementally, with the n'th value computed from the n-1'th value. A Fibonacci generator is one such:

def fibonacci(n=1000):
    a,b=1,1
    yield a
    while n>0:
        n -= 1
        yield b
        a,b = b,a+b

This gives the familiar series 1, 1, 2, 3, 5, 8, ... in which the n'th item is the sum of the n-1'th and n-2'th. So there is no way to jump directly to item 10; you have to get there through items 0-9.

That being said, your LazyList is nice for a couple of reasons:

  • it allows you to revisit earlier values

  • it simulates direct access even though under the covers the generator has to go through all the incremental values until it gets to 'n'

  • it only computes the values actually required, since the generator is evaluated lazily, instead of pre-emptively computing 1000 values only to find that the first 10 are used
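One way to get the properties listed above is to cache values as they are pulled from the generator. The sketch below is only an illustration: the class name CachingLazyList is hypothetical, it is not the asker's LazyList, and it uses an unbounded variant of the Fibonacci generator.

```python
from itertools import islice

class CachingLazyList:
    """Hypothetical wrapper: indexable view over an iterator, computed on demand."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indices are not supported in this sketch")
        # Pull only as many new values as are needed to reach `index`
        missing = index + 1 - len(self._cache)
        if missing > 0:
            self._cache.extend(islice(self._source, missing))
        # Raises IndexError if the source ran out before reaching `index`
        return self._cache[index]


def fibonacci():
    # Unbounded variant of the generator above
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


fib = CachingLazyList(fibonacci())
print(fib[10])  # items 0-9 are computed on the way
print(fib[3])   # served from the cache, nothing is recomputed
```

The trade-off is memory: every value up to the highest index requested so far stays in the cache.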

Upvotes: 0

Martin Geisler

Reputation: 73778

The closest I can think of is to disassemble the code object that is inside the generator expression. Something like

>>> import dis
>>> g = ( x*x for x in range(10) )
>>> dis.dis(g.gi_code)
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                15 (to 21)
              6 STORE_FAST               1 (x)
              9 LOAD_FAST                1 (x)
             12 LOAD_FAST                1 (x)
             15 BINARY_MULTIPLY     
             16 YIELD_VALUE         
             17 POP_TOP             
             18 JUMP_ABSOLUTE            3
        >>   21 LOAD_CONST               0 (None)
             24 RETURN_VALUE        

That gives a little hint about what is happening, but it's not very clear, IMHO.

There is another Stack Overflow question that deals with converting Python byte code into readable Python — maybe you can use that to get something more human readable back.
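If you want to inspect the code object programmatically rather than read the printed listing, something along these lines should work on Python 3. It is only a sketch of the inspection step; it does not reconstruct the expression, and the opcode names differ between Python versions, so the listing above will not match newer interpreters exactly.

```python
import dis

g = (x * x for x in range(10))
code = g.gi_code

print(code.co_name)      # name of the code object behind the generator expression
print(code.co_varnames)  # local names; the outer iterator arrives as an implicit argument

# Opcode names vary between Python versions, so treat this output as version-specific
for instr in dis.get_instructions(code):
    print(instr.opname, instr.argrepr)
```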

Upvotes: 1
