Javier Mr

Reputation: 2210

Swig: convert return type std::string(binary) to java byte[]

My situation is that I have a C++ class (MyClass) with a method that has the following signature:

bool getSerialized(const std::string & name, std::string & serialized);

Here name is an input argument and serialized is an output argument.

I got it working by adding an %extend block and an ignore declaration to the '.i' file, as follows:

%extend MyClass{
    std::string getSerialized(const std::string & name){
        std::string res;
        $self->getSerialized(name, res);
        return res;
    }
};
%rename("$ignore", fullname=1) "MyClass::getSerialized";

So the method can be used from Java like:

MyClass mc = new MyClass();
String res = mc.getSerialized("test");

But now I have encountered a problem: the serialized std::string contains binary data, including the '\0' character, which marks the end of a C string. The following C++ code shows the problem:

std::string s;
s.push_back('H');
s.push_back('o');
s.push_back(0);
s.push_back('l');
s.push_back('a');
std::cout << "Length of std::string " << s.size() << std::endl;
std::cout << "CString: '" << s.c_str() << "'" << std::endl;

The code above displays:

Length of std::string 5
CString: 'Ho'

As I have seen in the wrap file generated by SWIG, the wrapper method actually calls c_str(). The wrapper code:

jstring jresult = 0 ;
std::string result;
result = (arg1)->getSerialized();
jresult = jenv->NewStringUTF((&result)->c_str());
return jresult;

So, as expected, the String received in Java gets truncated. How can I change (presumably) my %extend function wrapper so that I can return this as a byte array (byte[]) without knowing the length of the array beforehand? It would be great if the byte array could be created in the SWIG layer, so I could invoke the method from Java like:

byte[] serialized = mc.getSerialized("test");

Other considerations: the use of std::string for storing binary data is a given, as is the return type, since both come from the Google protobuf library (see C++ protobuf usage).

There is a very similar question, including the title (Swig: convert return type std::string to java byte[]), but it doesn't cover binary data, so the solution given there doesn't apply here.

Using SWIG 2.

Upvotes: 4

Views: 3454

Answers (1)

You can do what you're trying to do with a few typemaps and some JNI. I put together an example:

%module test

%include <std_string.i>

%typemap(jtype) bool foo "byte[]"
%typemap(jstype) bool foo "byte[]"
%typemap(jni) bool foo "jbyteArray"
%typemap(javaout) bool foo { return $jnicall; }
%typemap(in, numinputs=0) std::string& out (std::string temp) "$1=&temp;"
%typemap(argout) std::string& out {
  $result = JCALL1(NewByteArray, jenv, $1->size());
  JCALL4(SetByteArrayRegion, jenv, $result, 0, $1->size(), (const jbyte*)$1->c_str());
}
// Optional: return NULL if the function returned false
%typemap(out) bool foo {
  if (!$1) {
    return NULL;
  }
}

%inline %{
struct Bar {
  bool foo(std::string& out) {
    std::string s;
    s.push_back('H');
    s.push_back('o');
    s.push_back(0);
    s.push_back('l');
    s.push_back('a');
    out = s;
    return true;
  }
};
%}

It states that the C++ wrapper will return a Java byte array for the functions that match bool foo. It also sets up a temporary std::string to pass to the real implementation of foo, which hides that parameter from the Java interface itself.

Once the call has been made, it creates and returns a byte array, provided the function didn't return false.
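To make the difference from the NewStringUTF path concrete, here is a minimal standalone sketch with no JNI involved. It assumes std::vector<std::int8_t> as a stand-in for a Java byte[]; the names ByteArray, copyBySize and copyAsCString are hypothetical and not part of SWIG or JNI. The first function copies by size(), as the argout typemap does, and the second stops at the first '\0', as any C-string API would.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a Java byte[] (jbyte is a signed 8-bit type).
using ByteArray = std::vector<std::int8_t>;

// Copies every byte of s, driven by size() rather than a terminator.
// This mirrors the NewByteArray + SetByteArrayRegion pair in the typemap.
ByteArray copyBySize(const std::string& s) {
    ByteArray out(s.size());
    if (!s.empty()) {
        std::memcpy(out.data(), s.data(), s.size());
    }
    return out;
}

// Copies only up to the first '\0', which is how a C-string API
// such as NewStringUTF sees the same buffer.
ByteArray copyAsCString(const std::string& s) {
    const std::size_t n = std::strlen(s.c_str());
    ByteArray out(n);
    if (n != 0) {
        std::memcpy(out.data(), s.c_str(), n);
    }
    return out;
}
```

For the 5-byte string from the question (H, o, 0, l, a), copyBySize keeps all five bytes, while copyAsCString keeps only the first two.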

I checked that it all worked as expected with:

public class run { 
  public static void main(String[] argv) {
    String s = "ho\0la";
    System.out.println(s.getBytes().length);

    System.loadLibrary("test");
    Bar b = new Bar();
    byte[] bytes = b.foo();
    s = new String(bytes);
    System.out.println(s + " - " + s.length());
    assert(s.charAt(2) == 0);
  }
}

You should be aware of the implications of the cast to const jbyte* from the return type of c_str(): jbyte is a signed 8-bit type, so bytes of 0x80 and above arrive in Java as negative values, which may not always be what you wanted.
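As an illustration of that signedness point, here is a small sketch, again without JNI. It assumes std::int8_t as a stand-in for jbyte, and the helper names asSignedBytes and toUnsigned are hypothetical. It shows how high bytes look once they are viewed as signed 8-bit values, and how masking with 0xFF recovers the 0..255 value (the same idiom works on the Java side).

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Views the raw bytes of s as signed 8-bit values, which is how
// they appear in a Java byte[] after the cast to const jbyte*.
std::vector<std::int8_t> asSignedBytes(const std::string& s) {
    std::vector<std::int8_t> out(s.size());
    if (!s.empty()) {
        std::memcpy(out.data(), s.data(), s.size());
    }
    return out;
}

// Recovers the unsigned value in the range 0..255 from a signed byte.
int toUnsigned(std::int8_t b) {
    return b & 0xFF;
}
```

With the input bytes 0xFF, 0x80 and 0x7F, the signed view is -1, -128 and 127, and toUnsigned maps the first two back to 255 and 128.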

As an alternative, if the size of the output byte array were actually fixed or trivially predictable, you could pass it in pre-allocated as an input to begin with. This would work because arrays are effectively passed by reference into functions in the first place.

Upvotes: 5
