pelson

Reputation: 21839

Extract a list from itertools.cycle

I have a class which contains an itertools.cycle instance that I would like to be able to copy. One approach (the only one I can come up with) is to extract the initial iterable (which was a list) and store the position that the cycle is at.

Unfortunately I am unable to get hold of the list which I used to create the cycle instance, nor does there seem to be an obvious way to do it:

import itertools
c = itertools.cycle([1, 2, 3])
print dir(c)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', 
 '__hash__', '__init__', '__iter__', '__new__', '__reduce__', 
 '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', 
 '__subclasshook__', 'next']

I can come up with some half reasonable reasons why this would be disallowed for some types of input iterables, but for a tuple or perhaps even a list (mutability might be a problem there), I can't see why it wouldn't be possible.

Does anyone know if it's possible to extract the non-infinite iterable out of an itertools.cycle instance? If not, does anybody know why this idea is a bad one?

Upvotes: 7

Views: 3913

Answers (4)

Alex

Reputation: 12913

Depending on how you're using cycle, you could even get away with a custom class wrapper as simple as this:

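from itertools import cycle

class SmartCycle:
    def __init__(self, x):
        self.cycle = cycle(x)
        self.to_list = x  # keep a reference to the original iterable

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.cycle)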

e.g.

> a = SmartCycle([1, 2, 3])
> for _ in range(4):
>     print(next(a))
1
2
3
1

> a.to_list
[1, 2, 3]
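If the position matters as well (the question mentions storing where the cycle is at), the same wrapper idea can be stretched a little. The following is only a sketch under a few assumptions: the input is finite and non-empty, and the names TrackedCycle, items, position and copy are made up for illustration.

```python
from itertools import cycle, islice


class TrackedCycle:
    """Hypothetical wrapper: remembers the items and the current position."""

    def __init__(self, items, position=0):
        self.items = tuple(items)              # private snapshot of the input
        self.position = position % len(self.items)
        self._cycle = cycle(self.items)
        # Fast-forward the inner cycle to the requested position.
        for _ in islice(self._cycle, self.position):
            pass

    def __iter__(self):
        return self

    def __next__(self):
        value = next(self._cycle)
        self.position = (self.position + 1) % len(self.items)
        return value

    def copy(self):
        # A fresh, independent cycle starting where this one currently is.
        return TrackedCycle(self.items, self.position)


a = TrackedCycle([1, 2, 3])
next(a), next(a)            # consumes 1 and 2
b = a.copy()
print(next(a), next(b))     # 3 3
```

The copy is rebuilt from the stored tuple rather than from the inner cycle object, so it does not depend on anything itertools.cycle exposes.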

Upvotes: 0

Bakuriu

Reputation: 101959

It's impossible. If you look at the itertools.cycle code you'll see that it does not store a reference to the sequence. It only creates an iterator over it and stores the values yielded by that iterator in a newly created list:

static PyObject *
cycle_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *it;
    PyObject *iterable;
    PyObject *saved;
    cycleobject *lz;

    if (type == &cycle_type && !_PyArg_NoKeywords("cycle()", kwds))
        return NULL;

    if (!PyArg_UnpackTuple(args, "cycle", 1, 1, &iterable))
        return NULL;
    /* NOTE: they do not store the *sequence*, only the iterator */
    /* Get iterator. */
    it = PyObject_GetIter(iterable);
    if (it == NULL)
        return NULL;

    saved = PyList_New(0);
    if (saved == NULL) {
        Py_DECREF(it);
        return NULL;
    }

    /* create cycleobject structure */
    lz = (cycleobject *)type->tp_alloc(type, 0);
    if (lz == NULL) {
        Py_DECREF(it);
        Py_DECREF(saved);
        return NULL;
    }
    lz->it = it;
    lz->saved = saved;
    lz->firstpass = 0;

    return (PyObject *)lz;
}

This means that when doing:

itertools.cycle([1,2,3])

The list you create has only one reference, which is held by the iterator that cycle uses. When that iterator is exhausted it gets deleted and a new iterator is created over the saved values:

    /* taken from the "cycle.next" implementation */
    it = PyObject_GetIter(lz->saved);
    if (it == NULL)
        return NULL;
    tmp = lz->it;
    lz->it = it;
    lz->firstpass = 1;
    Py_DECREF(tmp);   /* destroys the old iterator */

This means that after one full pass through the cycle the original list is destroyed.

Anyway, if you need access to this list, just keep a reference to it somewhere before calling itertools.cycle.
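A small illustration of both points, as a sketch rather than anything definitive: cycle accepts any iterable, including a generator that has no underlying sequence to give back, and holding your own reference is the simple way to keep the items around. The names numbers, items, c and d are just for the example.

```python
import itertools


def numbers():
    # A generator has no underlying sequence to hand back.
    yield 1
    yield 2
    yield 3


c = itertools.cycle(numbers())
first_seven = [next(c) for _ in range(7)]
print(first_seven)      # [1, 2, 3, 1, 2, 3, 1]

# Keeping your own reference is the simple way to retain the items.
items = [1, 2, 3]
d = itertools.cycle(items)
print(next(d), items)   # 1 [1, 2, 3]
```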

Upvotes: 6

pelson

Reputation: 21839

OK, so I have accepted @Bakuriu's answer, as it is technically correct. It is not possible to copy/pickle an itertools.cycle object.

I have implemented a subclass of itertools.cycle which is picklable (with a couple of extra bells and whistles to boot).

import itertools


class FiniteCycle(itertools.cycle):
    """
    Cycles the given finite iterable indefinitely. 
    Subclasses ``itertools.cycle`` and adds pickle support.
    """
    def __init__(self, finite_iterable):
        self._index = 0
        self._iterable = tuple(finite_iterable)
        self._iterable_len = len(self._iterable)
        itertools.cycle.__init__(self, self._iterable)

    @property
    def index(self):
        return self._index

    @index.setter
    def index(self, index):
        """
        Sets the current index into the iterable. 
        Keeps the underlying cycle in sync.

        Negative indexing supported (will be converted to a positive index).
        """
        index = int(index)
        if index < 0:
            index = self._iterable_len + index
            if index < 0:
                raise ValueError('Negative index is larger than the iterable length.')

        if index > self._iterable_len - 1:
            raise IndexError('Index is too high for the iterable. Tried %s, iterable '
                             'length %s.' % (index, self._iterable_len))

        # calculate the positive number of times the iterable will need to be moved
        # forward to get to the desired index
        delta = (index + self._iterable_len - self.index) % (self._iterable_len)

        # move the finite cycle on ``delta`` times.
        for _ in xrange(delta):
            self.next()

    def next(self):
        self._index += 1
        if self._index >= self._iterable_len:
            self._index = 0
        return itertools.cycle.next(self)

    def peek(self):
        """
        Return the next value in the cycle without moving the iterable forward.
        """
        return self._iterable[self.index]

    def __reduce__(self):
        return (FiniteCycle, (self._iterable, ), {'index': self.index})

    def __setstate__(self, state):
        self.index = state.pop('index')

Some example usage:

c = FiniteCycle([1, 2, 3])

c.index = -1
print c.next() # prints 3

print [c.next() for _ in xrange(4)] # prints [1, 2, 3, 1]

print c.peek() # prints 2
print c.next() # prints 2

import pickle
serialised_cycle = pickle.dumps(c)

del c

c = pickle.loads(serialised_cycle)

print c.next() # prints 3
print c.next() # prints 1
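The class above is written for Python 2 (print statements, xrange, a next method). As a rough sketch of the same idea for Python 3, one could skip subclassing and keep just a tuple and an index. IndexedCycle and its attribute names are invented for this example and are not part of itertools. Because the instance holds only plain attributes, the default pickle machinery should be able to handle it without a custom __reduce__.

```python
import pickle


class IndexedCycle:
    """Hypothetical cycle built on a tuple and an index, not itertools.cycle."""

    def __init__(self, finite_iterable, index=0):
        self._items = tuple(finite_iterable)
        if not self._items:
            raise ValueError('cannot cycle over an empty iterable')
        self.index = index % len(self._items)

    def __iter__(self):
        return self

    def __next__(self):
        value = self._items[self.index]
        self.index = (self.index + 1) % len(self._items)
        return value

    def peek(self):
        # Next value, without advancing.
        return self._items[self.index]


c = IndexedCycle([1, 2, 3])
next(c)                                     # consumes 1
restored = pickle.loads(pickle.dumps(c))    # round trip keeps the position
print(restored.peek(), next(restored), next(restored))   # 2 2 3
```

The trade-off is that this version materialises the whole iterable up front, which the FiniteCycle class above does as well.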

Feedback welcome.

Upvotes: 0

wberry

Reputation: 19347

If you know certain properties of the objects being yielded by cycle, then you can deduce the inner list. For example, if you know that all the objects in the cycle are distinct AND that nothing else is reading from the cycle iterator besides you, then you can simply wait for the first one you see to appear again (testing with is, not ==) to find where the inner list ends.

But without such knowledge, there are no guarantees, and any method you choose to guess what the cycle is will fail in certain cases.
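The identity-based approach might look roughly like this. It is a sketch that assumes the objects are distinct and that you are the only reader of the cycle. The helper name recover_items is made up, and note that it consumes one item more than a full pass, so the cycle is left one step past where it was.

```python
import itertools


def recover_items(cyc):
    """Read from cyc until the first object comes round again."""
    first = next(cyc)
    items = [first]
    for obj in cyc:
        if obj is first:      # identity, not equality
            break
        items.append(obj)
    return items


source = [object(), object(), object()]
recovered = recover_items(itertools.cycle(source))
print(len(recovered))         # 3

# Fails when an object repeats within one pass: the result is cut short.
a, b = object(), object()
short = recover_items(itertools.cycle([a, b, a]))
print(len(short))             # 2, not 3
```

The second call shows the kind of failure mentioned above: with a repeated object, the guess stops too early.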

Upvotes: 0
