yeputons

Reputation: 9238

How to specify a C++ pointer to a C function with unknown number of arguments?

I'm writing a C library which I would like to be usable from both C and C++. At some point, it should take a callback with 0-3 arguments from the user, which will be called at some point later. Like this (a copy of the code is available as GitHub Gist too):

// app_c.c
#include <stdio.h>
#include "lib.h"

double f0(void) {
    return 123;
}

double f2(double a, double b) {
    return a + b;
}

int main() {
    cb_arity = 0;
    cb_func = f0;
    printf("%f\n", cb_call());

    cb_arity = 2;
    cb_func = f2;
    printf("%f\n", cb_call());
}

I was able to declare a pointer to a C function which takes an unknown (but still fixed) number of arguments. Note that it's double (*cb_func)(), not double (*cb_func)(void):

// lib.h
#ifndef LIB_H_
#define LIB_H_

#ifdef __cplusplus
extern "C" {
#endif

extern int cb_arity;
extern double (*cb_func)();
double cb_call(void);

#ifdef __cplusplus
}
#endif

#endif  // LIB_H_

// lib.c
#include "lib.h"
#include <stdlib.h>

int cb_arity;
double (*cb_func)();

double cb_call(void) {
    switch (cb_arity) {
        case 0:
            return cb_func();
        case 1:
            return cb_func(10.0);
        case 2:
            return cb_func(10.0, 20.0);
        case 3:
            return cb_func(10.0, 20.0, 30.0);
        default:
            abort();
    }
}

It compiles and runs successfully both on my machine and Wandbox. As far as I understand, no UB is invoked.

Now I would like to make it work in C++ as well. Unfortunately, it looks like I now need reinterpret_cast because () means "no arguments" in C++, not "unknown number of arguments":

// app_cpp.cpp
#include <stdio.h>
#include "lib.h"

int main() {
    cb_arity = 0;
    cb_func = []() { return 123.0; };
    printf("%f\n", cb_call());

    cb_arity = 2;
    cb_func = reinterpret_cast<double(*)()>(static_cast<double(*)(double, double)>(
        [](double a, double b) { return a + b; }
    ));
    printf("%f\n", cb_call());
}

As far as I understand, no UB is invoked here either: although I convert the function pointer from double(*)(double, double) to double(*)(void) in C++, it's converted back to double(*)(double, double) in the C code right before the call.
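The C++ half of that reasoning relies on the rule that a function pointer converted to another function pointer type and back yields the original pointer. A minimal sketch of that round trip follows; the names erased_fn, binary_fn, erase and call_binary are hypothetical and are not part of the library.

```cpp
// Hypothetical aliases: erased_fn stands in for the type of cb_func as C++ sees it.
using erased_fn = double (*)();
using binary_fn = double (*)(double, double);

double add(double a, double b) { return a + b; }

// Erase the signature, as the application does before storing the pointer.
erased_fn erase(binary_fn f) { return reinterpret_cast<erased_fn>(f); }

// Restore the original signature before calling. Calling through
// erased_fn directly with two arguments would not even compile in C++.
double call_binary(erased_fn f, double a, double b) {
    return reinterpret_cast<binary_fn>(f)(a, b);
}
```

This only illustrates the C++ side; whether the call made inside the C library is well defined depends on the C rules for functions declared without a prototype.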

Is there any way to get rid of these ugly casts in the C++ code? I've tried declaring cb_func as void(*)(...), but C++ still won't implicitly convert double(*)(double, double) to it.

Upvotes: 4

Views: 244

Answers (2)

Caleth

Reputation: 63019

Rather than erase the number of arguments from the callback, you could retain it.

// lib.h
#ifndef LIB_H_
#define LIB_H_

#ifdef __cplusplus
extern "C" {
#endif

typedef struct {
    int arity;
    /* Anonymous union: members are accessed as cb.zero, cb.one, ... */
    union {
        double(*zero)(void);
        double(*one)(double);
        double(*two)(double, double);
        double(*three)(double, double, double);
    };
} cb_type;
extern cb_type cb;

double cb_call(void);

#ifdef __cplusplus
}
#endif

#endif  // LIB_H_

// lib.c
#include "lib.h"
#include <stdlib.h>

cb_type cb;

double cb_call(void) {
    switch (cb.arity) {
        case 0:
            return cb.zero();
        case 1:
            return cb.one(10.0);
        case 2:
            return cb.two(10.0, 20.0);
        case 3:
            return cb.three(10.0, 20.0, 30.0);
        default:
            abort();
    }
}

If you don't expose cb, you can't mismatch the arity and union member:

// lib.h
#ifndef LIB_H_
#define LIB_H_

#ifdef __cplusplus
extern "C" {
#endif

void register_zero(double(*)(void));
void register_one(double(*)(double));
void register_two(double(*)(double, double));
void register_three(double(*)(double, double, double));

double cb_call(void);

#ifdef __cplusplus
}
#endif

#endif  // LIB_H_
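One possible implementation behind such a header is sketched below. It assumes the callbacks return double, as cb_call does. The static names cb_arity and cb_func are hypothetical private state; only register_* and cb_call come from the header. Apart from the extern "C" wrapper, the code sticks to constructs that C accepts too.

```cpp
#include <stdlib.h>

extern "C" {

/* Private to the library: callers can only go through register_*. */
static int cb_arity = -1;  /* -1 means "nothing registered yet" */
static union {
    double (*zero)(void);
    double (*one)(double);
    double (*two)(double, double);
    double (*three)(double, double, double);
} cb_func;

/* Each setter writes the union member and the matching tag together. */
void register_zero(double (*f)(void)) { cb_func.zero = f; cb_arity = 0; }
void register_one(double (*f)(double)) { cb_func.one = f; cb_arity = 1; }
void register_two(double (*f)(double, double)) { cb_func.two = f; cb_arity = 2; }
void register_three(double (*f)(double, double, double)) { cb_func.three = f; cb_arity = 3; }

double cb_call(void) {
    switch (cb_arity) {
        case 0: return cb_func.zero();
        case 1: return cb_func.one(10.0);
        case 2: return cb_func.two(10.0, 20.0);
        case 3: return cb_func.three(10.0, 20.0, 30.0);
        default: abort();  /* no callback registered */
    }
}

}  /* extern "C" */
```

From C++, a captureless lambda can then be passed directly, for example register_two([](double a, double b) { return a + b; }), because it converts implicitly to the matching function pointer type.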

Upvotes: 4

yeputons

Reputation: 9238

Function declarations without a parameter list are an obsolescent C feature (C23 removes them, so () means (void) there). I would suggest avoiding them.

Essentially, you're trying to circumvent type checking: you want to store an almost arbitrary function in the cb_func variable, and then be able to call it freely without any signs of danger like explicit casts. However, this code is inherently dangerous: if you mess up cb_arity, the behavior is undefined. Moreover, you can even store a double(*)(int, char*) in cb_func without any warnings, which is never OK in your example.

On a deeper level, your cb_arity/cb_func looks a lot like a tagged union. The canonical way to implement it in C++ is std::variant (or something function-specific like unique_pseudofunction). The canonical way to implement it in C is a struct with a "tag" field and an anonymous union, like this:

struct cb_s {
    int arity;
    union {
        double (*func0)(void);
        double (*func1)(double);
        double (*func2)(double, double);
        double (*func3)(double, double, double);
    };
};

extern struct cb_s cb;

Now instead of cb_func = f0 you write cb.func0 = f0, and instead of cb_func = f2 you write cb.func2 = f2. The same works in C++, and all casts from lambdas are gone. The only remaining sign of danger is the underlying union.
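A minimal C++ sketch of the cast-free assignments, assuming the struct above. The struct and a cb_call are repeated here only to keep the sketch self-contained, and install_constant and install_sum are hypothetical helper names.

```cpp
#include <cstdlib>

struct cb_s {
    int arity;
    union {
        double (*func0)(void);
        double (*func1)(double);
        double (*func2)(double, double);
        double (*func3)(double, double, double);
    };
};

cb_s cb{};  // defined here for the sketch; the library would own it

double cb_call() {
    switch (cb.arity) {
        case 0: return cb.func0();
        case 1: return cb.func1(10.0);
        case 2: return cb.func2(10.0, 20.0);
        case 3: return cb.func3(10.0, 20.0, 30.0);
        default: std::abort();
    }
}

// Captureless lambdas convert implicitly to the member's pointer type,
// so neither static_cast nor reinterpret_cast is needed.
void install_constant() {
    cb.arity = 0;
    cb.func0 = []() { return 123.0; };
}

void install_sum() {
    cb.arity = 2;
    cb.func2 = [](double a, double b) { return a + b; };
}
```

A lambda with the wrong signature, such as one taking (int, char*), is now rejected at compile time because no union member has a matching type.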

You will have to change your code in two places:

  1. The library. Here you already know which of func0/func1/... to call, so it's not a big deal.
  2. The user code which wrote to cb_func. Now it has to know which union member to write into. Presumably, the same code would also write a compile-time constant to cb_arity, so it's not a big deal again. If cb_arity is received from someone else, that someone should then also use a tagged union instead of passing cb_arity and cb_func separately, for better type safety.

Sidenote: C++ (as opposed to C) prohibits access to a non-active member of a union. This should not affect the correctness of the code, because calling a function stored as func0 through the func1 member is UB in itself.
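For callers that are C++ only, the std::variant route mentioned earlier avoids the raw union altogether. The following is a rough sketch assuming C++17; cb_variant and cb_caller are hypothetical names, and cb_call takes the variant as a parameter here instead of reading a global.

```cpp
#include <variant>

// One alternative per supported arity; the active index plays the role of cb_arity.
using cb_variant = std::variant<
    double (*)(),
    double (*)(double),
    double (*)(double, double),
    double (*)(double, double, double)>;

// Overload set for std::visit: each alternative receives the matching arguments.
struct cb_caller {
    double operator()(double (*f)()) const { return f(); }
    double operator()(double (*f)(double)) const { return f(10.0); }
    double operator()(double (*f)(double, double)) const { return f(10.0, 20.0); }
    double operator()(double (*f)(double, double, double)) const {
        return f(10.0, 20.0, 30.0);
    }
};

double cb_call(const cb_variant& cb) { return std::visit(cb_caller{}, cb); }
```

A callback can be stored with cb_variant cb = +[](double a, double b) { return a + b; }; where the unary plus turns the lambda into a plain function pointer. A default-constructed cb_variant holds a null zero-argument pointer, so it must be assigned before cb_call is used. This type cannot cross the C boundary, so it complements the struct above and does not replace it.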

Upvotes: 0
