Reputation: 7562
I'm attempting to use SWIG to wrap a pre-existing library interface that expects the caller to manage the lifetime of some const char *
values.
struct Settings {
const char * log_file;
int log_level;
};
// The Settings struct and all members only need to be valid for the duration of this call.
int Initialize(const struct Settings* settings);
int DoStuff();
int Deinitialize();
I started off using the most basic input to SWIG to wrap the library:
%module lib
%{
#include "lib.h"
%}
%include "lib.h"
This leads to SWIG warning about a potential memory leak:
lib.h(2) : Warning 451: Setting a const char * variable may leak memory.
Which is entirely understandable as looking at lib_wrap.c
, SWIG has generated code that will malloc
a buffer into the log_file
value but never frees it:
SWIGINTERN PyObject *_wrap_Settings_log_file_set(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
PyObject *resultobj = 0;
struct Settings *arg1 = (struct Settings *) 0 ;
char *arg2 = (char *) 0 ;
void *argp1 = 0 ;
int res1 = 0 ;
int res2 ;
char *buf2 = 0 ;
int alloc2 = 0 ;
PyObject *swig_obj[2] ;
if (!SWIG_Python_UnpackTuple(args, "Settings_log_file_set", 2, 2, swig_obj)) SWIG_fail;
res1 = SWIG_ConvertPtr(swig_obj[0], &argp1,SWIGTYPE_p_Settings, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Settings_log_file_set" "', argument " "1"" of type '" "struct Settings *""'");
}
arg1 = (struct Settings *)(argp1);
res2 = SWIG_AsCharPtrAndSize(swig_obj[1], &buf2, NULL, &alloc2);
if (!SWIG_IsOK(res2)) {
SWIG_exception_fail(SWIG_ArgError(res2), "in method '" "Settings_log_file_set" "', argument " "2"" of type '" "char const *""'");
}
arg2 = (char *)(buf2);
if (arg2) {
size_t size = strlen((const char *)((const char *)(arg2))) + 1;
arg1->log_file = (char const *)(char *)memcpy(malloc((size)*sizeof(char)), arg2, sizeof(char)*(size));
} else {
arg1->log_file = 0;
}
resultobj = SWIG_Py_Void();
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return resultobj;
fail:
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return NULL;
}
If I change the type of log_file
to char *
then the warning goes away and it appears that multiple attempts to set the value of log_file
will no longer leak memory:
SWIGINTERN PyObject *_wrap_Settings_log_file_set(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
PyObject *resultobj = 0;
struct Settings *arg1 = (struct Settings *) 0 ;
char *arg2 = (char *) 0 ;
void *argp1 = 0 ;
int res1 = 0 ;
int res2 ;
char *buf2 = 0 ;
int alloc2 = 0 ;
PyObject *swig_obj[2] ;
if (!SWIG_Python_UnpackTuple(args, "Settings_log_file_set", 2, 2, swig_obj)) SWIG_fail;
res1 = SWIG_ConvertPtr(swig_obj[0], &argp1,SWIGTYPE_p_Settings, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Settings_log_file_set" "', argument " "1"" of type '" "struct Settings *""'");
}
arg1 = (struct Settings *)(argp1);
res2 = SWIG_AsCharPtrAndSize(swig_obj[1], &buf2, NULL, &alloc2);
if (!SWIG_IsOK(res2)) {
SWIG_exception_fail(SWIG_ArgError(res2), "in method '" "Settings_log_file_set" "', argument " "2"" of type '" "char *""'");
}
arg2 = (char *)(buf2);
if (arg1->log_file) free((char*)arg1->log_file);
if (arg2) {
size_t size = strlen((const char *)(arg2)) + 1;
arg1->log_file = (char *)(char *)memcpy(malloc((size)*sizeof(char)), (const char *)(arg2), sizeof(char)*(size));
} else {
arg1->log_file = 0;
}
resultobj = SWIG_Py_Void();
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return resultobj;
fail:
if (alloc2 == SWIG_NEWOBJ) free((char*)buf2);
return NULL;
}
However it still appears that the memory allocated for log_file
will be leaked when the Settings
object is garbage collected in Python.
What is the recommended way of managing lifetimes of char *
struct values in SWIG in a way which avoids these memory leaks?
Upvotes: 3
Views: 1617
Reputation: 88711
Strings are a bit awkward to do right here. There are several ways to side-step the issue you're seeing. Simplest is to use a fixed size array in the struct, but it's 2019. Personally I'd wholeheartedly recommend using idiomatic C++ instead (it's 2019!), which would mean std::string
and then the whole issue evaporates.
Failing that you're stuck in a case where to make the interface Pythonic you'll have to do some extra work. We can keep the total amount of work low and the nice thing about SWIG is that we can pick and choose where we target the extra effort we make, there's no "all or nothing". The main problem here is that we want to tie the lifespan of the buffer the log_file
path is stored in to the lifespan of the Python Settings
object itself. We can achieve that in multiple different ways depending on your preference for writing Python code, C or Python C API calls.
What we can't really solve is the case were you're given a borrowed pointer to a Settings
struct by some other code (i.e. it's not owned/managed by Python) and you want to change log_file
string in that borrowed object. The API you've got doesn't really give us a way to do that, but it seems like this isn't a case that really matters in your current module.
So without further ado below are a few options for tying the lifespan of a buffer that holds your string to a Python object that points to the buffer.
Option #1: Make Settings
wholly or partially immutable, use a single malloc
call to hold both the struct itself and the string it refers to. For this use case that's probably my preferred option.
We can do that fairly simply by giving the Settings
type a constructor in Python which handles this and it doesn't force you to use C++:
%module lib
%{
#include "lib.h"
%}
// Don't let anybody change this other than the ctor
%immutable Settings::log_file;
%include "lib.h"
%extend Settings {
Settings(const char *log_file) {
assert(log_file); // TODO: handle this properly
// Single allocation for both things means the single free() is sufficient and correct
struct Settings *result = malloc(strlen(log_file) + 1 + sizeof *result);
char *buf = (void*)&result[1];
strcpy(buf, log_file);
result->log_file = buf;
return result;
}
}
If you wanted to make the path mutable you could write a little extra Python code that wraps this up and acts a proxy which creates a new immutable object every time you "mutate" it on the Python side. You could also go the other way and make the other members of settings immutable. (Thinking about it some more it'd be neat if SWIG could optionally auto synthesize a kwargs constructor for aggregate/POD types and wouldn't be too hard to add that as a patch).
This is my personal preference here, I like immutable things and overall it's a fairly small tweak to the generated interface to get something sane.
Option #2a: Make another Python object that manages the lifespan of the string buffer and then "stash" a reference to that inside the Python side of every Settings
struct that's owned by Python.
%module lib
%{
#include "lib.h"
%}
%typemap(in) const char *log_file %{
// Only works for Python owned objects:
assert(SWIG_Python_GetSwigThis($self)->own & SWIG_POINTER_OWN); // TODO: exception...
// Python 2.7 specific, 3 gets more complicated, use bytes buffers instead.
$1 = PyString_AsString($input);
assert($1); // TODO: errors etc.
// Force a reference to the original input string to stick around to keep the pointer valid
PyObject_SetAttrString($self, "_retained_string", $input);
%}
%typemap(memberin) const char *log_file %{
// Because we trust the in typemap has retained the pointer for us this is sufficient now:
$1 = $input;
%}
%include "lib.h"
These typemaps work together to keep a reference to the PyObject
string stashed inside the Settings
PyObject
as an attribute. It only works safely here because a) we assume Python owns the object, and we're not using -builtin
in SWIG, so we can safely stash things in attributes to keep them around and b) because it's const char *
, not char *
we can be pretty sure that (unless there's some K&R silliness going on) that nobody will be changing the buffer.
Option #2b: The general idea is the same, but instead of using typemaps, which means writing Python C API calls use something like this:
%extend Settings {
%pythoncode {
@property
# ....
}
}
To do the same thing. Similar code could also be produced using %pythonprepend
instead if preferred. However this is my least preferred solution here, so I've not fully fleshed it out.
Upvotes: 2
Reputation: 3778
You can tell SWIG to use char*
semantics for log_file
. Unfortunately, it doesn't seem possible to use Settings::log_file
(the required memberin
does not show up in the pattern matching), so there could be clashes if that data member name is used in other structs as well with the same type but different semantics. This would look like:
%module lib
%{
#include "lib.h"
%}
%typemap(out) char const *log_file = char *;
%typemap(memberin) char const *log_file = char *;
%extend Settings {
Settings() {
Settings* self = new Settings{};
self->log_file = nullptr;
self->log_level = 0;
return self;
}
~Settings() {
delete[] self->log_file; self->log_file = nullptr;
delete self;
}
}
%include "lib.h"
(Note that SWIG in my case produces delete[]
, not free()
.)
EDIT: added a custom destructor to delete the log_file memory on garbage collection. (And for good measure also a constructor to make sure that an uninitialized log_file
is nullptr
, not some random memory.) What this does, is add an internal function delete_Settings
to the wrapper file, which gets called in _wrap_delete_Settings
, which is called on object destruction. Yes, syntax is a bit odd, b/c you're effectively describing Python's __del__
(taking a self
), only labeled as a C++ destructor.
Upvotes: 2