Reputation: 198607
For instance, if I create an iterator using chain, can I call it on multiple threads? Note that thread-safety that relies on the GIL is acceptable, but not preferable.
(Note that this is a bit different from this question, which deals with generators, not iterators written in C).
Upvotes: 16
Views: 3544
Reputation: 79
CPython-3.8, https://github.com/python/cpython/blob/v3.8.1/Modules/itertoolsmodule.c#L4129
static PyTypeObject count_type = {
    PyVarObject_HEAD_INIT(NULL, 0)
    "itertools.count", /* tp_name */
    sizeof(countobject), /* tp_basicsize */
    0, /* tp_itemsize */
    /* methods */
    (destructor)count_dealloc, /* tp_dealloc */
    0, /* tp_vectorcall_offset */
    0, /* tp_getattr */
    0, /* tp_setattr */
    0, /* tp_as_async */
    (reprfunc)count_repr, /* tp_repr */
    0, /* tp_as_number */
    0, /* tp_as_sequence */
    0, /* tp_as_mapping */
    0, /* tp_hash */
    0, /* tp_call */
    0, /* tp_str */
    PyObject_GenericGetAttr, /* tp_getattro */
    0, /* tp_setattro */
    0, /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
        Py_TPFLAGS_BASETYPE, /* tp_flags */
    itertools_count__doc__, /* tp_doc */
    (traverseproc)count_traverse, /* tp_traverse */
    0, /* tp_clear */
    0, /* tp_richcompare */
    0, /* tp_weaklistoffset */
    PyObject_SelfIter, /* tp_iter */
    (iternextfunc)count_next, /* tp_iternext */
    count_methods, /* tp_methods */
    0, /* tp_members */
    0, /* tp_getset */
    0, /* tp_base */
    0, /* tp_dict */
    0, /* tp_descr_get */
    0, /* tp_descr_set */
    0, /* tp_dictoffset */
    0, /* tp_init */
    0, /* tp_alloc */
    itertools_count, /* tp_new */
    PyObject_GC_Del, /* tp_free */
};
// ... ... ...
static PyObject *
count_nextlong(countobject *lz)
{
    PyObject *long_cnt;
    PyObject *stepped_up;

    long_cnt = lz->long_cnt;
    if (long_cnt == NULL) {
        /* Switch to slow_mode */
        long_cnt = PyLong_FromSsize_t(PY_SSIZE_T_MAX);
        if (long_cnt == NULL)
            return NULL;
    }
    assert(lz->cnt == PY_SSIZE_T_MAX && long_cnt != NULL);

    stepped_up = PyNumber_Add(long_cnt, lz->long_step);
    if (stepped_up == NULL)
        return NULL;
    lz->long_cnt = stepped_up;
    return long_cnt;
}

static PyObject *
count_next(countobject *lz)
{
    if (lz->cnt == PY_SSIZE_T_MAX)
        return count_nextlong(lz);
    return PyLong_FromSsize_t(lz->cnt++);
}
The count iterator shown above is safe to share between threads because there is no point between stepped_up = PyNumber_Add(long_cnt, lz->long_step); and lz->long_cnt = stepped_up; (or inside that PyNumber_Add() call) where threads could be switched. That is the so-called "slow mode".
In "fast mode", the expression PyLong_FromSsize_t(lz->cnt++) is obviously atomic.
The rest of the thread safety is provided by the GIL: thread switching happens only at certain points while Python bytecode runs and in I/O functions, and the GIL's memory fences eliminate the side effects of memory reordering.
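To check this empirically, here is a minimal sketch that shares one itertools.count between several threads and verifies that no value is handed out twice. It assumes a standard CPython build with the GIL enabled; the names worker, buckets, N_THREADS and PER_THREAD are made up for this illustration, and a passing run is evidence, not proof.

```python
import itertools
import threading

N_THREADS = 4        # illustrative values, not taken from the answer
PER_THREAD = 10_000

counter = itertools.count()                 # one iterator shared by every thread
buckets = [[] for _ in range(N_THREADS)]    # each thread writes only to its own list

def worker(bucket):
    # Pull PER_THREAD values from the shared counter.
    for _ in range(PER_THREAD):
        bucket.append(next(counter))

threads = [threading.Thread(target=worker, args=(b,)) for b in buckets]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Merge what every thread saw; on a GIL build each value should appear exactly once.
seen = sorted(value for bucket in buckets for value in bucket)
print(seen == list(range(N_THREADS * PER_THREAD)))
```

On an interpreter without a GIL, or on another Python implementation, the same check might fail, which is why it only supports the claim for this specific platform.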
Upvotes: 2
Reputation: 42277
Firstly, nothing in the official documentation on itertools says that they're thread-safe. So it seems that, by specification, Python does not guarantee anything about that. Behavior might differ across implementations such as Jython or PyPy, which means that code relying on it probably won't be portable.
Secondly, most itertools (with the exception of simple ones, like count) take other iterators as their input. You'd need these iterators to also behave correctly in a thread-safe way.
Thirdly, some iterators might not make sense when used simultaneously by different threads. For example, izip running in multiple threads might hit a race condition when taking elements from multiple sources, especially as defined by the equivalent Python code (what happens when one thread manages to take a value from only one input iterator, and then a second thread takes values from two of them?).
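The interleaving described in the parenthesis can be sketched without real threads by performing the steps by hand. This is only a simulation of what a pure-Python zip-like iterator could do if two threads (called A and B here, names invented for the example) were switched at an unlucky moment; it does not claim that CPython's C implementation behaves this way.

```python
# Two shared input iterators, as a pure-Python zip-like iterator would hold them.
numbers = iter([0, 1])
letters = iter(["a", "b"])

# Hand-simulated schedule: A is preempted after reading only its first element.
a_first = next(numbers)    # thread A takes from the first input, then is switched out
b_first = next(numbers)    # thread B takes from the first input...
b_second = next(letters)   # ...and from the second input, completing its tuple
a_second = next(letters)   # thread A resumes and takes from the second input

pair_a = (a_first, a_second)
pair_b = (b_first, b_second)
print(pair_a, pair_b)      # the elements are paired crosswise
```

A correct zip would pair 0 with "a" and 1 with "b"; with this schedule thread A ends up with (0, "b") and thread B with (1, "a").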
Also note that the documentation does not mention that itertools are implemented in C. We know (as an implementation detail) that CPython's itertools are actually written in C, but on other implementations they can happily be implemented as generators, and then you are back to the question you cited.
So, no, you cannot assume that they are thread-safe unless you know the implementation details of your target Python platform.
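If you need to share an iterator between threads without depending on those implementation details, one common approach is to serialize access with a lock. Below is a minimal sketch; LockedIterator is a hypothetical helper written for this answer, not something itertools provides.

```python
import itertools
import threading

class LockedIterator:
    """Hypothetical wrapper: lets only one thread at a time advance the iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # The lock is released even when StopIteration propagates out of next().
        with self._lock:
            return next(self._it)

shared = LockedIterator(itertools.chain(range(1000), range(1000, 2000)))
buckets = [[] for _ in range(4)]    # one private result list per thread

def worker(bucket):
    for item in shared:
        bucket.append(item)

threads = [threading.Thread(target=worker, args=(b,)) for b in buckets]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every element was consumed exactly once, whichever thread got it.
merged = sorted(item for bucket in buckets for item in bucket)
print(merged == list(range(2000)))
```

The lock only protects each individual next() call. Which thread receives which element is still up to the scheduler, so this fits work-queue style consumption rather than anything that depends on ordering.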
Upvotes: 16