Reputation: 1127
The Python incref is defined like this:

```c
#define Py_INCREF(op) (                   \
    _Py_INC_REFTOTAL _Py_REF_DEBUG_COMMA  \
    ((PyObject *)(op))->ob_refcnt++)
```
With multiple cores, the increment may happen only in the L1 cache and not be flushed to main memory.
If two threads increment the refcnt at the same time, on different cores, without a flush to main memory, it seems to me that one increment could be lost:
- ob_refcnt = 1
- Core 1 increments, but does not flush => ob_refcnt = 2 in the L1 cache of core 1
- Core 2 increments, but does not flush => ob_refcnt = 2 in the L1 cache of core 2
- One increment is lost
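That interleaving can be simulated deterministically in plain Python (no real threads or caches involved; the two local variables stand in for the per-core L1 copies), which makes the lost update concrete:

```python
# Simulate the feared interleaving: two "cores" each read the shared
# refcount, increment a private copy, then write it back.
shared = {"ob_refcnt": 1}

core1_copy = shared["ob_refcnt"]   # core 1 reads 1 into its "L1 cache"
core2_copy = shared["ob_refcnt"]   # core 2 also reads 1

core1_copy += 1                    # each core increments privately
core2_copy += 1

shared["ob_refcnt"] = core1_copy   # core 1 writes back 2
shared["ob_refcnt"] = core2_copy   # core 2 overwrites with 2

# Two increments happened, but the count only went from 1 to 2:
# one increment was lost.
print(shared["ob_refcnt"])         # 2, not the expected 3
```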
Is this a risk when using multiple cores or multiple processes?
The PyObject struct is declared like this:

```c
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;
```
But Py_ssize_t is just an ssize_t or intptr_t.
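This can be checked from Python itself: in the struct module's native format codes, 'n' is Py_ssize_t and 'P' is void *, and they come out the same size:

```python
import struct

# 'n' = Py_ssize_t, 'P' = void * (native sizes), so Py_ssize_t is
# pointer-sized, just like intptr_t.
print(struct.calcsize("n"), struct.calcsize("P"))
```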
The _Py_atomic* functions and attributes do not seem to be used.
How can Python manage this scenario? How does it keep the caches coherent between threads?
Upvotes: 2
Views: 344
Reputation: 345
Why not use Python's Locks or Semaphores? https://docs.python.org/2/library/threading.html
Upvotes: 0
Reputation: 30891
The CPython implementation of Python has the global interpreter lock (GIL). It is undefined behaviour to call the vast majority of Python C API functions (including Py_INCREF) without holding this lock, and doing so will almost certainly result in inconsistent data or your program crashing.
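Because the GIL serializes all refcount manipulation, the lost-increment scenario from the question cannot happen between Python threads. A small check: several threads repeatedly take and drop references to a shared object, and sys.getrefcount ends up exactly back at its baseline, so no increment or decrement was lost.

```python
import sys
import threading

obj = object()
baseline = sys.getrefcount(obj)

def churn():
    # Each binding does Py_INCREF, each del does Py_DECREF,
    # all serialized by the GIL.
    for _ in range(100_000):
        ref = obj
        del ref

threads = [threading.Thread(target=churn) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sys.getrefcount(obj) == baseline)  # True: no updates lost
```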
The GIL can be released and acquired as described in the documentation.
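In C extensions this is done with the Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS macros; from pure Python the effect is visible whenever a thread blocks in a C call that releases the GIL, such as time.sleep. A rough sketch (the timing threshold is an assumption about scheduler overhead, not a guarantee):

```python
import threading
import time

# time.sleep releases the GIL, so two sleeping threads overlap in time.
def wait():
    time.sleep(0.2)

start = time.perf_counter()
t1 = threading.Thread(target=wait)
t2 = threading.Thread(target=wait)
t1.start(); t2.start()
t1.join(); t2.join()
elapsed = time.perf_counter() - start

# elapsed is close to 0.2s, not 0.4s, because both sleeps ran concurrently
print(f"{elapsed:.2f}s")
```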
Because of the need to hold this lock in order to operate on Python objects, multithreading in Python is pretty limited, and the only operations that parallelize well are things like waiting for IO or pure C calculations on large arrays. The multiprocessing module (which starts isolated Python processes) is another option for parallel Python.
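A minimal multiprocessing sketch: each worker is a separate process with its own interpreter and its own GIL, so CPU-bound work runs truly in parallel.

```python
import multiprocessing

def square(x):
    # Runs in a separate process with its own interpreter and GIL.
    return x * x

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        print(pool.map(square, range(5)))  # [0, 1, 4, 9, 16]
```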
There have been attempts to use atomic types for reference counting (to remove/minimize the need for the GIL) but these caused significant slowdowns in single-threaded code so were abandoned.
Upvotes: 6