michael

Reputation: 3945

Python: Cast SwigPythonObject to Python Object

I'm using a closed-source Python module: I can call its methods through the API, but I cannot access the implementation. I know the module basically wraps some C++ code.

One of its methods returns a SwigPythonObject. How can I work with this object, given that I have no documentation or other help from the module's distributor?

I'd like to somehow convert it to a "regular" Python object and inspect its internal member structure in the debugger.

Currently what I see in the debugger is something like:

{SwigPythonObject} _<hexa number>_p_unsigned_char

Upvotes: 1

Views: 3046

Answers (1)

The semantics of what you're asking are a little unclear, but it seems you've got a pointer to an unsigned char from SWIG that you'd like to work with. Guessing slightly, there are probably three cases in which you're likely to encounter this:

  1. The pointer really is a pointer to a single unsigned byte
  2. The pointer is a pointer to a null-terminated string. (Why isn't it just wrapped as a string though?)
  3. The pointer points to a fixed length unsigned byte array. (You'll need to know/guess the length somehow)

In this particular instance there's no packing or alignment to worry about, so for all three cases we can write something that uses ctypes to read the memory SWIG references directly into Python, sidestepping the SWIG proxy. (Note that if the type were anything more complex than a pointer to a single built-in type, or an array of them, we wouldn't be able to do much here.)

First up, some C code in test.h to exercise what we're working on:

inline unsigned char *test_str() {
  static unsigned char data[] = "HELLO WORLD";
  return data;
}

inline unsigned char *test_byte() {
  static unsigned char val = 66;
  return &val;
}

Next up is a minimal SWIG module that wraps this:

%module test

%{
#include "test.h"
%}

%include "test.h"

We can check this out in ipython and see that it is wrapped similarly to what you observed:

In [1]: import test

In [2]: test.test_byte()
Out[2]: <Swig Object of type 'unsigned char *' at 0x7fc2851cbde0>

In [3]: test.test_str()
Out[3]: <Swig Object of type 'unsigned char *' at 0x7fc2851cbe70>

In [4]: hex(int(test.test_str()))
Out[4]: '0x7f905b0e72cd'

The thing we use in each case is the fact that calling int(x), where x is our unknown SWIG unsigned char pointer, gives us the address the pointer points at as an integer. Combining that with ctypes' from_address class method, we can construct ctypes instances that access the memory SWIG knows about directly. (NB: the address returned by int() doesn't match the address shown in the string representation, because the former is the real address of the data pointed at, whereas the latter is the address of the SWIG proxy object.)
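
To see the from_address idea in isolation, here's a minimal sketch that doesn't need the SWIG module at all. It assumes Python 3, and the buffer `buf` is only a stand-in I made up for memory owned by the C library; with a real SWIG pointer, `address` would come from int(swig_ptr) instead of ctypes.addressof.

```python
import ctypes

# Stand-in for memory owned by the C library (hypothetical, for illustration):
# a NUL-terminated byte buffer.
buf = ctypes.create_string_buffer(b"HELLO WORLD")

# With a real SWIG pointer this would be int(swig_ptr).
address = ctypes.addressof(buf)

# from_address() builds a ctypes object on top of existing memory; no copy is made.
first = ctypes.c_ubyte.from_address(address)
first_value = first.value
print(first_value)  # 72, i.e. ord('H')

# Because the memory is shared, a write through the view is visible in buf.
first.value = ord("J")
print(buf.value)  # b'JELLO WORLD'
```

Since nothing is copied, the ctypes object is only valid for as long as the underlying C memory is.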

Probably the simplest to wrap is the fixed-length case: we can create a ctypes array type of the right size by using the * operator on c_ubyte, and then call from_address on it.
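
As a rough Python 3 sketch of the fixed-length case: the helper name `read_fixed_len` and the stand-in buffer are my own, not part of SWIG or the module in question. With a real SWIG pointer you'd pass int(swig_ptr) as the address.

```python
import ctypes

def read_fixed_len(address, length):
    """Copy `length` bytes starting at `address` into a Python bytes object."""
    array_type = ctypes.c_ubyte * length     # array type of exactly `length` bytes
    view = array_type.from_address(address)  # view over the existing memory
    return bytes(view)                       # copy the bytes out (Python 3)

# Stand-in for C-owned memory (hypothetical, for illustration only).
buf = ctypes.create_string_buffer(b"HELLO WORLD")
fixed = read_fixed_len(ctypes.addressof(buf), 5)
print(fixed)  # b'HELLO'
```

Nothing checks the length here, so passing a value larger than the real array reads past its end.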

For the null-terminated string case we've really got two choices: either use the libc strlen function to find the string length and then construct a matching ctypes type, or just loop byte by byte from Python until we hit a null. I chose the latter in my example below as it's simpler, although I probably over-complicated it by using a generator and itertools.count() to track the position.
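
For the first of those two choices you don't necessarily have to call strlen yourself: ctypes.string_at, called without a size, treats the memory as zero-terminated and returns everything before the first null. A minimal Python 3 sketch, where `read_c_string` and the stand-in buffer are names I've made up:

```python
import ctypes

def read_c_string(address):
    """Return the bytes at `address` up to (not including) the first NUL."""
    # With no size argument, string_at assumes a zero-terminated string.
    return ctypes.string_at(address)

# Stand-in for C-owned memory (hypothetical); with SWIG, use int(swig_ptr).
buf = ctypes.create_string_buffer(b"HELLO WORLD")
text = read_c_string(ctypes.addressof(buf))
print(text)  # b'HELLO WORLD'
```

This is only safe if the memory really is null-terminated; otherwise it will keep reading until it happens to find a zero byte or crashes.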

Finally, for the pointer-to-a-single-byte case I reused the existing ctypes type to create a 1-byte array and read the value out of that. There's probably a way to construct a type from an address using ctypes.POINTER(ctypes.c_ubyte) and then .contents, but I couldn't quickly see it, so the 1-byte array trick made it trivial for me.
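
One possible way to do the POINTER version, as a sketch rather than anything I've run against a real SWIG module, is ctypes.cast, which accepts an integer address. The helper name and the stand-in value are assumptions of mine:

```python
import ctypes

def read_byte_via_pointer(address):
    """Read a single unsigned byte at `address` through a ctypes pointer."""
    # Turn the raw integer address into a typed pointer.
    ptr = ctypes.cast(address, ctypes.POINTER(ctypes.c_ubyte))
    # .contents is the c_ubyte being pointed at; .value is its Python int.
    return ptr.contents.value

# Stand-in for the C static `val` (hypothetical); with SWIG, use int(swig_ptr).
val = ctypes.c_ubyte(66)
byte = read_byte_via_pointer(ctypes.addressof(val))
print(byte)  # 66
```

The typed pointer also supports indexing, so ptr[0] gives the same value and ptr[i] can walk an array.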

All this combined to give me the following Python code:

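import ctypes
import test  # the SWIG-generated module built from test.i
import itertools

# Case 2: pointer to a null-terminated string
def swig_to_str(s):
  base = int(s)  # address of the data the SWIG pointer refers to
  ty = ctypes.c_ubyte*1
  def impl():
    # Read one byte at a time until we hit the terminating null
    for x in itertools.count():
      v = ty.from_address(base+x)[0]
      if not v: return
      yield chr(v)
  return ''.join(impl())

# Case 1: pointer to a single unsigned byte
def swig_to_byte(b):
  ty = ctypes.c_ubyte*1  # a 1-byte array placed over the pointed-to byte
  v = ty.from_address(int(b))
  return v[0]

# Case 3: pointer to a fixed-length array of l bytes
def swig_to_fixed_len(s, l):
  ty = ctypes.c_ubyte*l
  return ''.join(chr(x) for x in ty.from_address(int(s)))

t = test.test_str()
print(t)
print(swig_to_str(t))
print(swig_to_fixed_len(t, 5))

u = test.test_byte()
print(u)
print(swig_to_byte(u))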

This ran as hoped with Python 2.7 (should take minimal effort to make it correct for 3):

swig3.0 -python -Wall test.i
gcc -std=gnu99 -Wall test_wrap.c -o _test.so -shared -I/usr/include/python2.7/ -fPIC
python run.py

<Swig Object of type 'unsigned char *' at 0x7f4a57581cf0>
HELLO WORLD
HELLO
<Swig Object of type 'unsigned char *' at 0x7f4a57581de0>
66

Upvotes: 5
