jojo

Reputation: 3609

Failed to convert UTF-16 string buffer created in C++ into Python using SWIG

I have a C++ function I want to use in Python:

struct DataWrap
{
    wchar_t* data;
    int size;
};

extern int fun1(const char* szInBuffer, unsigned int inSize, DataWrap* buff);

fun1 takes an array of chars given in szInBuffer, with its size in inSize, and returns a newly allocated UTF-16 string in buff->data.

I would like to use this function fun1 from Python.

%typemap(argout) DataWrap* buff {
    int byteorder = -1;
    $input = PyUnicode_DecodeUTF16((const char*)($1->data), ($1->size)*sizeof(wchar_t), NULL, &byteorder);
}

This does work, but there is a memory leak: the buff->data buffer allocated by fun1 is never released.

I tried to fix that like this:

%typemap(argout) DataWrap* buff {
    int byteorder = -1;
    $input = PyUnicode_DecodeUTF16((const char*)($1->data), ($1->size)*sizeof(wchar_t), NULL, &byteorder);
    delete $1;
}

and now the code crashes.

What am I doing wrong? Is this the correct way to work with SWIG in order to get a UTF-16 string into Python?

Upvotes: 1

Views: 445

Answers (1)

Mark Tolonen

Reputation: 177765

Try the freearg typemap:

%typemap(freearg) DataWrap *buff {
    delete [] $1->data;
}

Also, you had delete $1; — that deletes the DataWrap struct itself, not the buffer fun1 allocated. Since the buffer is allocated with new[], it should be delete [] $1->data instead.

Edit

I discovered another issue while playing with this problem that may be related. If you are on Windows and your implementation of fun1 is in a different DLL than the SWIG wrapper code, make sure both DLLs link against the same C runtime DLL (via the /MD compiler switch in Microsoft VC++, for example). Otherwise the memory is allocated (with new, I assume) in one DLL and freed in another, using two different heaps, which can crash.
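One common way to sidestep the cross-CRT problem entirely (a sketch, not part of the original code — fun1_free is a hypothetical helper name) is to have the module that allocates the buffer also export the function that frees it, so new[] and delete[] always run in the same runtime:

```cpp
struct DataWrap
{
    wchar_t* data;
    int size;
};

// Exported alongside fun1 from the same DLL: frees the buffer that
// fun1 allocated with new wchar_t[...], inside the same C runtime.
extern "C" void fun1_free(DataWrap* buff)
{
    delete [] buff->data;
    buff->data = 0;
    buff->size = 0;
}
```

The freearg typemap would then call fun1_free($1) instead of deleting the buffer directly in the wrapper.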

Here's the code I was playing with; it may help you. It also includes a typemap that eliminates passing the string and its size separately from Python, and a typemap that generates a temporary DataWrap object so it doesn't have to be passed either. In the implementation file x.cpp I faked the UTF-16 conversion with one that only handles simple ASCII strings, just to have something to test with.

Also note I got it to work without using the freearg typemap, although using the typemap worked for me as well.

makefile

_x.pyd: x_wrap.cxx x.dll
    cl /MD /nologo /LD /Zi /EHsc /W4 x_wrap.cxx /Id:\dev\python27\include -link /LIBPATH:d:\dev\python27\libs /OUT:_x.pyd x.lib

x.dll: x.cpp x.h
    cl /MD /nologo /Zi /LD /EHsc /W4 x.cpp

x_wrap.cxx: x.h x.i
    swig -python -c++ x.i

x.i

%module x

%begin %{
#pragma warning(disable:4127 4211 4706)
%}

%{
    #include "x.h"
%}

%include <windows.i>

%typemap(in) (const char *szInBuffer,unsigned int inSize) {
   if (!PyString_Check($input)) {
       PyErr_SetString(PyExc_ValueError, "Expecting a string");
       return NULL;
   }
   $1 = PyString_AsString($input);
   $2 = PyString_Size($input);
}
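Note the PyString_* API above is Python 2 only. Under Python 3, a sketch of the same typemap using the bytes API (untested against your build) would look like:

```
%typemap(in) (const char *szInBuffer, unsigned int inSize) {
   if (!PyBytes_Check($input)) {
       PyErr_SetString(PyExc_ValueError, "Expecting a bytes object");
       return NULL;
   }
   $1 = PyBytes_AsString($input);
   $2 = (unsigned int)PyBytes_Size($input);
}
```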

%typemap(in, numinputs=0) DataWrap* buff (DataWrap temp) {
   $1 = &temp;
}

%typemap(argout) DataWrap* buff {
    int byteorder = -1;
    $result = PyUnicode_DecodeUTF16((const char*)($1->data), ($1->size)*sizeof(wchar_t), NULL, &byteorder);
    delete [] $1->data;
}

%include "x.h"

x.h

#ifdef API_EXPORTS
#   define API __declspec(dllexport)
#else
#   define API __declspec(dllimport)
#endif

struct DataWrap
{
    wchar_t* data;
    int size;
};

extern "C" API void fun1(const char* szInBuffer, unsigned int inSize, DataWrap* buff);

x.cpp

#include <stdlib.h>

#define API_EXPORTS
#include "x.h"

API void fun1(const char* szInBuffer, unsigned int inSize, DataWrap* buff)
{
    unsigned int i;
    buff->size = inSize;
    buff->data = new wchar_t[inSize];
    for(i = 0; i < inSize; i++)
        buff->data[i] = szInBuffer[i];
}

Output

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import x
>>> x.fun1('abc')
u'abc'

Upvotes: 2
