Reputation: 152725
I'm trying to write a representation function for a class, and I want it to be compatible with both Python 2.x and Python 3.x. However, I noticed that normal strings passed to PyUnicode_FromFormat as the %U argument will segfault. The only viable workaround I found was to convert the string to a unicode object myself with PyUnicode_FromObject and then pass the result to PyUnicode_FromFormat:
/* key and value are arguments for the function. */
PyObject *repr;
if (PyUnicode_CheckExact(key)) {
    repr = PyUnicode_FromFormat("%U=%R", key, value);
}
else {
    PyObject *tmp = PyUnicode_FromObject(key);
    if (tmp == NULL) {
        return NULL;
    }
    repr = PyUnicode_FromFormat("%U=%R", tmp, value);
    Py_DECREF(tmp);
}
The point is that I want the representation to be without the "" (or '') that would be added if I use %R or %S.
I only recently found the issue, and I'm using PyUnicode_FromFormat("%U", something); all over the place, so my question is: can this be simplified while keeping it compatible with Python 2.x and 3.x?
Upvotes: 1
Views: 873
Reputation: 30916
I don't think a much simpler way of doing what you want exists. The best I can see is to eliminate the if statement by using only your else case, and thus always calling PyUnicode_FromObject:
PyObject *tmp = PyUnicode_FromObject(key);
if (tmp == NULL) {
    return NULL;
}
repr = PyUnicode_FromFormat("%U=%R", tmp, value);
Py_DECREF(tmp);
If you look at the implementation of PyUnicode_FromObject, you'll see that the first thing it does is PyUnicode_CheckExact, and in that case it returns the original object with its reference count incremented. Therefore the extra work is pretty small when key is already unicode, and it should be slightly more efficient when key isn't unicode, since you avoid a branch.
Upvotes: 1