GabrielT

Reputation: 33

C++ string to python limitation using swig

I recently ran into a SWIG limitation related to the size of a C++ std::string.

I have some C++ code that returns a pair<int, string>. When the string in the pair is smaller than 2*1024*1024*1024-64 bytes (just under 2 GB), the pair is returned properly and the string is mapped to a native Python string. If the string is larger than 2 GB, however, it is no longer mapped to a native Python string. The code below, wrapped for Python with SWIG, reproduces the problem.

Environment: SWIG 3.0.8, Ubuntu 16.04.3 LTS, g++ 5.4.0, Python 2.7.12

/////////// bridge.h
#include <vector>
#include <utility>
#include <string>
#include <iostream>
#include <fstream>
using namespace std;
pair<int, string> large_string(long sz);
long size_pstring(pair<int,string>& p);
void print_pstring(pair<int,string>& p);
string save_pstring(pair<int,string>& p);

//////////bridge.cc
#include "bridge.h"

pair<int, string> large_string(long sz){
 pair<int, string> pis;
 pis.first=20;
 pis.second=string(sz,'A');
 return pis;
}

long size_pstring(pair<int,string>& p){
 return p.second.size();
}

void print_pstring(pair<int,string>& p){
  cout<<"PSTRING: first="<<p.first<<" second.SZ="<<p.second.size()<<"\n";
  cout<<"First 100 chars: \n"<<p.second.substr(0,100)<<"\n";
}

string save_pstring(pair<int,string>& p){
  string fname="aloe.txt";
  std::ofstream ofile(fname.c_str());
  ofile<<p.second;
  ofile.close();
  return fname;
}
////////// bridge.i
%module graphdb

%include stl.i
%include "std_vector.i"

%{
#include "bridge.h"
%}
%include "bridge.h"
namespace std {
  %template(p_string)           pair<int,string>;
};
//////// makefile
all:
    swig -c++ -python bridge.i
    g++ -std=c++11 -fpic -c bridge.cc bridge_wrap.cxx -I/usr/include/python2.7/
    g++ -shared *.o -o  _graphdb.so

Below is a Python session suggesting that this is just a matter of how the string is mapped: most probably an int rather than a long is used to represent the string size in the SWIG bridge code.

>>> s=graphdb.large_string(12)
>>> print s
(20, 'AAAAAAAAAAAA')
>>> s=graphdb.large_string(2*1024*1024*1024)
>>> print s
(20, <Swig Object of type 'char *' at 0x7fd4205a6090>)
>>> l=graphdb.size_pstring(s)
>>> print l
2147483648
>>> fname = graphdb.save_pstring(s)

Saving the string to a file works correctly, and I can then load the file into a Python string without problems.
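A possible way to avoid the round trip through a file is to hand the string over in pieces that stay under the threshold. The sketch below assumes that strings shorter than 2 GB keep mapping to native Python strings, as observed in the session above. `pstring_chunk` is a hypothetical helper, not part of the original bridge.h; it would have to be declared there so that SWIG wraps it.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not in the original bridge.h): returns at most
// max_len characters of p.second, starting at offset. An offset at or
// past the end yields an empty string instead of throwing.
std::string pstring_chunk(const std::pair<int, std::string>& p,
                          std::size_t offset, std::size_t max_len) {
    const std::string& s = p.second;
    if (offset >= s.size()) {
        return std::string();
    }
    // substr clamps the length to what is left in the string.
    return s.substr(offset, max_len);
}
```

On the Python side the pieces could then be fetched in steps of, for example, 1024*1024*1024 bytes up to size_pstring(s) and joined into one string. This is untested against the setup above.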

So my question: does anybody know which SWIG configuration option I should change so that large strings are properly mapped to native Python strings?

--Thx

Upvotes: 0

Views: 387

Answers (1)

Marie R

Reputation: 309

This issue is still present in SWIG 4.3.0. It appears that the problem is caused by a SWIG wrapper that tests the string length using INT_MAX rather than SIZE_MAX. See the bug report that describes the issue and proposed solution.
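As a rough illustration of that kind of check, here is a simplified stand-in. It is not SWIG's actual wrapper code, and the names `Mapping` and `classify` are made up for this sketch. It only shows how comparing a size_t length against INT_MAX splits strings into the two outcomes seen in the question.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical names for illustration only; this is not SWIG's generated code.
enum class Mapping { NativeString, OpaquePointer };

// Mimics a wrapper that builds a native Python string only when the
// length fits in an int, and otherwise hands back a raw 'char *' object.
constexpr Mapping classify(std::size_t size) {
    return size > static_cast<std::size_t>(INT_MAX) ? Mapping::OpaquePointer
                                                    : Mapping::NativeString;
}
```

Under this rule, classify(12) gives NativeString, while classify(2ULL * 1024 * 1024 * 1024) gives OpaquePointer, which matches the two results in the Python session from the question.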

Regards - Marie

Upvotes: 1
