user3758232

Reputation: 862

C `FILE` stream from Python BufferedIO object

I am writing a Python binding for a C library function that requires a FILE * handle as an input.

I want the Python caller to pass an open io.BufferedReader object to the function, so as to retain control of the handle, e.g.:

with open(fname, 'rb') as fh:
    my_c_function(fh)

Therefore, I don't want to pass a file name and open the handle inside the C function.

My C wrapper would roughly look like this:

PyObject *my_c_function (PyObject *self, PyObject *args)
{
    FILE *fh;
    if (! PyArg_ParseTuple (args, "?", &fh)) return NULL;
    my_c_lib_function (fh);
    // [...]
}

Obviously, I can't figure out what format code I should use for "?", or whether I should use a different method than PyArg_ParseTuple. The Python C API documentation does not seem to provide any example of how to deal with buffered IO objects (from what I understand, the Buffer protocol applies to bytes objects and co.... right?)

It seems like I could look into the file descriptor of the Python handle object within my C wrapper (as if calling fileno()) and create a C file handle from that using fdopen().

A couple of questions:

  1. Is this the most convenient way? Or is there a built-in method in the Python C API that I did not see?
  2. The fileno() documentation mentions: "Return the underlying file descriptor (an integer) of the stream if it exists. An OSError is raised if the IO object does not use a file descriptor." In which cases would that happen? What if I pass a file handle created in Python by something other than open()?
  3. It seems pretty safe to open a read-only C handle on a read-only fd opened by Python, which should be guaranteed to close the handle after the C function; however, can anybody think of any pitfalls to this approach?

Upvotes: 0

Views: 406

Answers (1)

user3758232

Reputation: 862

I am not sure this is the most reasonable way, but I solved it on Linux as follows:

static PyObject *
get_fh_from_python_fh (PyObject *self, PyObject *args)
{
    PyObject *buf, *fileno_fn, *fileno_obj;
    if (! PyArg_ParseTuple (args, "O", &buf)) return NULL;

    // Get the file descriptor from the Python BufferedIO object.
    // FIXME This is not sure to be reliable. See
    // https://docs.python.org/3/library/io.html#io.IOBase.fileno
    if (! (fileno_fn = PyObject_GetAttrString (buf, "fileno"))) {
        PyErr_SetString (PyExc_TypeError, "Object has no fileno function.");
        return NULL;
    }
    // Call fileno() with no arguments. On failure, leave the exception
    // raised by Python (e.g. OSError) in place instead of replacing it.
    fileno_obj = PyObject_CallObject (fileno_fn, NULL);
    Py_DECREF (fileno_fn);
    if (! fileno_obj) return NULL;

    // A file descriptor is an int, so convert with PyLong_AsLong.
    int fd = (int) PyLong_AsLong (fileno_obj);
    Py_DECREF (fileno_obj);
    if (fd == -1 && PyErr_Occurred ()) return NULL;

    /*
     * From the Linux man page:
     *
     * > The file descriptor is not dup'ed, and will be closed when the stream
     * > created by fdopen() is closed. The result of applying fdopen() to a
     * > shared memory object is undefined.
     *
     * This handle must not be closed. Leave open for the Python caller to
     * handle it.
     */
    FILE *fh = fdopen (fd, "rb");
    if (! fh) return PyErr_SetFromErrno (PyExc_OSError);

    // rest of the code...
}

This was written with only Linux in mind, but so far it does what it needs to do. A better approach would be to look inside the BufferedReader object and maybe even find a FILE * in there; but if that is not part of the public Python API, it could break in future versions.
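
One pitfall of the code above is that the stream returned by fdopen() can never be passed to fclose() without also closing the descriptor that Python still owns, so each call leaves a FILE structure behind. A possible way around this, sketched below under POSIX assumptions, is to dup() the descriptor first and wrap the duplicate. The helper name stream_from_borrowed_fd is made up for this illustration and is not part of Python or libc. In the wrapper, the fd argument could come from the fileno() call shown above, or from PyObject_AsFileDescriptor(), which the C API provides for exactly this lookup.

```c
#define _POSIX_C_SOURCE 200809L  /* for dup() and fdopen() */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): wrap a *duplicate* of `fd`
 * in a FILE stream. The caller owns the returned stream and must fclose()
 * it; that closes only the duplicate, so the descriptor owned by the Python
 * file object stays open. Returns NULL with errno set on failure.
 */
static FILE *
stream_from_borrowed_fd (int fd, const char *mode)
{
    assert (mode != NULL);

    int copy = dup (fd);
    if (copy == -1) return NULL;

    FILE *fh = fdopen (copy, mode);
    if (! fh) {
        int saved_errno = errno;
        close (copy);  /* fdopen() failed, so the duplicate is still ours. */
        errno = saved_errno;
        return NULL;
    }
    return fh;
}
```

Note that the duplicate shares its file offset with the original descriptor, and that both Python's BufferedReader and C stdio read ahead. If the Python caller has already read from the object, or reads from it again afterwards, the position seen on either side may not be what one expects, so it is probably safest to hand over a freshly opened or explicitly repositioned file.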

Upvotes: 1
