12. Thread Safety¶
If your Extension is likely to be exposed to a multi-threaded environment then you need to think about thread safety. I had this problem in a separate project which was a C++ SkipList which could contain an ordered list of arbitrary Python objects. The problem in a multi-threaded environment was that the following sequence of events could happen:
Thread A tries to insert a Python object into the SkipList. The C++ code searches for a place to insert it preserving the existing order. To do so it must call back into Python code for the user defined comparison function (using
functools.total_ordering
for example).At this point the Python interpreter is free to make a context switch allowing thread B to, say, remove an element from the SkipList. This removal may well invalidate C++ pointers held by thread A.
When the interpreter switches back to thread A it accesses an invalid pointer and a segfault happens.
The solution, of course, is to use a lock to prevent a context switch until A has completed its insertion, but how? I found the existing Python documentation misleading and I couldn’t get it to work reliably, if at all. It was only when I stumbled upon the source code for the bz module that I realised there was a whole other, low level way of doing this, largely undocumented.
Note
Your Python may have been compiled without thread support in which case we don’t have to concern ourselves with thread locking. We can discover this from the presence of the macro WITH_THREAD
so all our thread support code is conditional on the definition of this macro.
12.1. Coding up the Lock¶
First we need to include pythread.h as well as the usual includes:
#include <Python.h>
#include "structmember.h"
#ifdef WITH_THREAD
#include "pythread.h"
#endif
12.1.1. Adding a PyThread_type_lock
to our object¶
Then we add a PyThread_type_lock
(an opaque pointer) to the Python structure we are intending to protect. I’ll use the example of the SkipList source code. Here is a fragment with the important lines highlighted:
typedef struct {
PyObject_HEAD
/* Other stuff here... */
#ifdef WITH_THREAD
PyThread_type_lock lock;
#endif
} SkipList;
12.1.2. Creating a class to Acquire and Release the Lock¶
Now we add some code to acquire and release the lock. We can do this in a RAII fashion in C++ where the constructor blocks until the lock is acquired and the destructor releases the lock. The important lines are highlighted:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | #ifdef WITH_THREAD
/* A RAII wrapper around the PyThread_type_lock. */
class AcquireLock {
public:
AcquireLock(SkipList *pSL) : _pSL(pSL) {
assert(_pSL);
assert(_pSL->lock);
if (! PyThread_acquire_lock(_pSL->lock, NOWAIT_LOCK)) {
Py_BEGIN_ALLOW_THREADS
PyThread_acquire_lock(_pSL->lock, WAIT_LOCK);
Py_END_ALLOW_THREADS
}
}
~AcquireLock() {
assert(_pSL);
assert(_pSL->lock);
PyThread_release_lock(_pSL->lock);
}
private:
SkipList *_pSL;
};
#else
/* Make the class a NOP which should get optimised out. */
class AcquireLock {
public:
AcquireLock(SkipList *) {}
};
#endif
|
The code that acquires the lock is slightly clearer if the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros are fully expanded 1:
if (! PyThread_acquire_lock(_pSL->lock, NOWAIT_LOCK)) {
{
PyThreadState *_save;
_save = PyEval_SaveThread();
PyThread_acquire_lock(_pSL->lock, WAIT_LOCK);
PyEval_RestoreThread(_save);
}
}
12.1.3. Initialising and Deallocating the Lock¶
Now we need to set the lock pointer to NULL
in the _new
function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | static PyObject *
SkipList_new(PyTypeObject *type, PyObject * /* args */, PyObject * /* kwargs */) {
SkipList *self = NULL;
self = (SkipList *)type->tp_alloc(type, 0);
if (self != NULL) {
/*
* Initialise other struct SkipList fields...
*/
#ifdef WITH_THREAD
self->lock = NULL;
#endif
}
return (PyObject *)self;
}
|
In the __init__
method we allocate the lock by calling PyThread_allocate_lock()
2 A lot of this code is specific to the SkipList but the lock allocation code is highlighted:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | static int
SkipList_init(SkipList *self, PyObject *args, PyObject *kwargs) {
int ret_val = -1;
PyObject *value_type = NULL;
PyObject *cmp_func = NULL;
static char *kwlist[] = {
(char *)"value_type",
(char *)"cmp_func",
NULL
};
assert(self);
#ifdef WITH_THREAD
self->lock = PyThread_allocate_lock();
if (self->lock == NULL) {
PyErr_SetString(PyExc_MemoryError, "Unable to allocate thread lock.");
goto except;
}
#endif
/*
* Much more stuff here...
*/
assert(! PyErr_Occurred());
assert(self);
assert(self->pSl_void);
ret_val = 0;
goto finally;
except:
assert(PyErr_Occurred());
Py_XDECREF(self);
ret_val = -1;
finally:
return ret_val;
}
|
When deallocating the object we should free the lock pointer with PyThread_free_lock
3:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | static void
SkipList_dealloc(SkipList *self) {
/*
* Deallocate other fields here...
*/
#ifdef WITH_THREAD
if (self->lock) {
PyThread_free_lock(self->lock);
self->lock = NULL;
}
#endif
Py_TYPE(self)->tp_free((PyObject*)self);
}
}
|
12.1.4. Using the Lock¶
Before any critical code we create an AcquireLock
object which blocks until we have the lock. Once the lock is obtained we can make any calls, including calls into the Python interpreter without preemption. The lock is automatically freed when we exit the code block:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | static PyObject *
SkipList_insert(SkipList *self, PyObject *arg) {
assert(self && self->pSl_void);
/* Lots of stuff here...
*/
{
AcquireLock _lock(self);
/* We can make calls here, including calls back into the Python
* interpreter knowing that the interpreter will not preempt us.
*/
try {
self->pSl_object->insert(arg);
} catch (std::invalid_argument &err) {
// Thrown if PyObject_RichCompareBool returns -1
// A TypeError should be set
if (! PyErr_Occurred()) {
PyErr_SetString(PyExc_TypeError, err.what());
}
return NULL;
}
/* Lock automatically released here. */
}
/* More stuff here...
*/
Py_RETURN_NONE;
}
|
And that is pretty much it.
Footnotes
- 1
I won’t pretend to understand all that is going on here, it does work however.
- 2
What I don’t understand is why putting this code in the
SkipList_new
function does not work, the lock does not get initialised and segfaults typically in_pthread_mutex_check_init
. The order has to be: set the lock pointer NULL in_new
, allocate it in_init
, free it in_dealloc
.- 3
A potiential weakness of this code is that we might be deallocating the lock whilst the lock is acquired which could lead to deadlock. This is very much implementation defined in
pythreads
and may vary from platform to platform. There is no obvious API inpythreads
that allows us to determine if a lock is held so we can release it before deallocation. I notice that in the Python threading module (Modules/_threadmodule.c) there is an additionalchar
field that acts as a flag to say when the lock is held so that thelock_dealloc()
function in that module can release the lock before freeing the lock.