Reputation: 6858
Both GCC and Clang support compile-time checks on the arguments of variadic functions such as printf. These compilers accept syntax like:
extern void dprintf(int dlevel, const char *format, ...)
__attribute__((format(printf, 2, 3))); /* 2=format 3=params */
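For readers who have not used the attribute, here is a minimal sketch of a function that carries it. The name format_string and the 256-byte buffer are assumptions made for this illustration, not part of the framework described in the question.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper: formats its arguments into a std::string.
// format(printf, 1, 2): argument 1 is the printf-style format string and
// the variadic arguments to check start at argument 2.
std::string format_string(const char *format, ...)
    __attribute__((format(printf, 1, 2)));

std::string format_string(const char *format, ...)
{
    char buffer[256];  // arbitrary size for the sketch; longer output is truncated
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    return std::string(buffer);
}

// format_string("%d items, %s", 3, "ok");  // accepted
// format_string("%d items", "three");      // diagnosed when -Wformat (part of -Wall) is enabled
```

Note that for non-static C++ member functions the implicit this pointer counts as argument 1, so the indices passed to the attribute start at 2.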
On OS X, the Cocoa framework also uses an extension of this for NSString:
#define NS_FORMAT_FUNCTION(F,A) __attribute__((format(__NSString__, F, A)))
In our company, we have a custom C++ framework with a bunch of classes like BaseString, all deriving from BaseObject. In BaseString there are a few variadic methods similar to sprintf, but with some extensions. For example, "%S" expects an argument of type BaseString*, and "%@" expects a BaseObject* argument.
I would like to perform a compile-time check of the arguments in our projects, but because of the extensions, __attribute__((format(printf))) gives lots of false-positive warnings.
Is there a way to customize the support of __attribute__((format)) for one of the two compilers? If this requires a patch to the compiler source, is it doable in a reasonable amount of time? Alternatively, are there other lint-like tools that could perform the check?
Upvotes: 7
Views: 3376
Reputation: 1
With a recent version of GCC (I recommend 4.7 or newer, but you could try GCC 4.6), you can add your own variable and function attributes through a GCC plugin (with the PLUGIN_ATTRIBUTES hook), or a MELT extension.
MELT is a domain specific language to extend GCC (implemented as a [meta-]plugin).
If you use a plugin (e.g. MELT), you won't need to recompile the source code of GCC. But you need a plugin-enabled GCC (check with gcc -v).
As of 2020, MELT is no longer updated (for lack of funding); however, you could write your own GCC plugin in C++ for GCC 10 that performs such checks.
Some Linux distributions don't enable plugins in their gcc (please complain to your distribution vendor); others provide a package for GCC plugin development, e.g. gcc-4.7-plugin-dev for Debian or Ubuntu.
Upvotes: 5
Reputation: 2398
With C++11, it is possible to solve this problem by replacing __attribute__((format)) with a clever combination of constexpr, decltype, and variadic parameter packs. Pass the format string into a constexpr function that extracts all the % specifiers at compile time, and validate that the n'th specifier matches the decltype of the (n+1)'st argument.
Here is a sketch of the solution...
If you have:
int x = 3;
Foo foo;
my_printf("%d %Q\n", x, foo);
You will need a macro wrapper for my_printf, using the trick described here, to get something like this:
#define my_printf(fmt, ...) \
    do { \
        static_assert(FmtValidator<decltype(makeTypeHolder(__VA_ARGS__))>::check(fmt), \
                      "one or more format specifiers do not match their arguments"); \
        my_printf_impl(fmt, ## __VA_ARGS__); \
    } while (0) /* do/while keeps the macro safe inside if/else */
You'll need to write FmtValidator and makeTypeHolder().
makeTypeHolder will look something like this:
template<typename... Ts> struct TypeHolder {};
template<typename... Ts>
TypeHolder<Ts...> makeTypeHolder(const Ts&... args)
{
return TypeHolder<Ts...>();
}
Its purpose is to create a type uniquely determined by the types of the arguments passed into my_printf(). The FmtValidator then needs to validate that these types are consistent with the % specifiers found in fmt.
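As a quick sanity check of what makeTypeHolder produces, here is a small sketch. Foo and the function holderTypesAreAsExpected are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

template<typename... Ts> struct TypeHolder {};

template<typename... Ts>
TypeHolder<Ts...> makeTypeHolder(const Ts&...)
{
    return TypeHolder<Ts...>();
}

struct Foo {};  // hypothetical user-defined type

// decltype never evaluates its operand, so ordinary (non-constant)
// variables can be used to compute the holder type.
inline bool holderTypesAreAsExpected()
{
    int x = 3;
    Foo foo;
    (void)x;
    (void)foo;
    return std::is_same<decltype(makeTypeHolder(x, foo)),
                        TypeHolder<int, Foo>>::value;
}

// A string literal is deduced as an array type, not as const char*,
// because the parameters are taken by const reference.
static_assert(std::is_same<decltype(makeTypeHolder("abc")),
                           TypeHolder<char[4]>>::value,
              "string literals arrive as char arrays");
```

The last assertion matters in practice: a validator for %s has to accept char arrays as well as const char*, or the arguments have to be decayed before the holder type is built.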
Next, FmtValidator<T>::check() needs to be written to extract the % specifiers at compile time (i.e., as a constexpr function). This requires some compile-time recursion and looks like this:
template<typename... Ts>
struct FmtValidator;
// recursion base case: no arguments left, so only literal text and %% may remain
template<>
struct FmtValidator<TypeHolder<>>
{
static constexpr bool check(const char* fmt)
{
return *fmt == '\0' ? true :
*fmt != '%' ? check(fmt + 1) :
fmt[1] == '%' ? check(fmt + 2) : false;
}
};
// recursion
template<typename T, typename... Ts>
struct FmtValidator<TypeHolder<T, Ts...>>
{
static constexpr bool check(const char* fmt)
{
// find the first % specifier in fmt, validate it against T,
// and then recursively dispatch with Ts... and the remainder of fmt
...
}
};
You can validate individual types against individual % specifiers with something like this:
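// primary template; specialized below for each supported argument type
template<typename T>
struct specmatch;
// strmatches is a constexpr string-comparison helper that you also need to write
template<>
struct specmatch<int>
{
    static constexpr bool match(const char* c, const char* cend)
    {
        return strmatches(c, cend, "d") ||
               strmatches(c, cend, "i");
    }
};
// add other specmatch specializations for float, const char*, etc.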
And then, you are free to write your own validators with your own custom types.
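To make the pieces above concrete, here is a deliberately simplified, self-contained sketch. It is not the full solution: as an assumption for brevity, specmatch::match takes only a single conversion character (no flags, widths or length modifiers), Foo is a stand-in user type with a made-up %Q specifier, and CHECK_FORMAT is a hypothetical macro that performs only the static check.

```cpp
template<typename... Ts> struct TypeHolder {};

template<typename... Ts>
TypeHolder<Ts...> makeTypeHolder(const Ts&...) { return TypeHolder<Ts...>(); }

// Per-type rule: does conversion character c accept an argument of type T?
// Unknown types match nothing.
template<typename T> struct specmatch {
    static constexpr bool match(char) { return false; }
};
template<> struct specmatch<int> {
    static constexpr bool match(char c) { return c == 'd' || c == 'i'; }
};
template<> struct specmatch<double> {
    static constexpr bool match(char c) { return c == 'f' || c == 'e' || c == 'g'; }
};
template<> struct specmatch<const char*> {
    static constexpr bool match(char c) { return c == 's'; }
};

struct Foo {};  // hypothetical user-defined type printed with %Q
template<> struct specmatch<Foo> {
    static constexpr bool match(char c) { return c == 'Q'; }
};

template<typename Holder> struct FmtValidator;

// Base case: no arguments left, so only literal text and %% may remain.
template<> struct FmtValidator<TypeHolder<>> {
    static constexpr bool check(const char* fmt) {
        return *fmt == '\0' ? true :
               *fmt != '%'  ? check(fmt + 1) :
               fmt[1] == '%' ? check(fmt + 2) : false;
    }
};

// Recursive case: find the next specifier, match it against T, then
// continue with the remaining types and the rest of the string.
template<typename T, typename... Ts>
struct FmtValidator<TypeHolder<T, Ts...>> {
    static constexpr bool check(const char* fmt) {
        return *fmt == '\0' ? false :            // arguments left over
               *fmt != '%'  ? check(fmt + 1) :   // literal character
               fmt[1] == '%' ? check(fmt + 2) :  // escaped percent sign
               // a lone trailing '%' fails here because no rule matches '\0',
               // and && short-circuits before reading past the terminator
               specmatch<T>::match(fmt[1]) &&
                   FmtValidator<TypeHolder<Ts...>>::check(fmt + 2);
    }
};

#define CHECK_FORMAT(fmt, ...) \
    static_assert(FmtValidator<decltype(makeTypeHolder(__VA_ARGS__))>::check(fmt), \
                  "one or more format specifiers do not match their arguments")

static_assert(FmtValidator<TypeHolder<int, Foo>>::check("%d %Q\n"), "matching call");
static_assert(!FmtValidator<TypeHolder<int, Foo>>::check("%Q %d\n"), "swapped arguments");
static_assert(!FmtValidator<TypeHolder<int>>::check("%d %d"), "too few arguments");
static_assert(!FmtValidator<TypeHolder<int, int>>::check("%d"), "too many arguments");
```

Inside a function, CHECK_FORMAT("%d %Q\n", x, foo) with an int x and a Foo foo compiles, while swapping the two arguments triggers the static_assert. Everything is written as single return statements so that it stays within the C++11 constexpr rules.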
Upvotes: 2
Reputation: 6858
A year and a half after asking this question, I came up with a totally different approach to solve the real problem: Is there any way to statically check the types of custom variadic formatting statements?
For completeness and because it can help other people, here is the solution I have finally implemented. It has two advantages over the original question:
A Perl script parses the source code, finds the formatting strings and decodes the percent modifiers inside them. It then wraps all arguments with a call to a template identity function CheckFormat<>. Example:
str->appendFormat("%hhu items (%.2f %%) from %S processed",
nbItems,
nbItems * 100. / totalItems,
subject);
Becomes:
str->appendFormat("%hhu items (%.2f %%) from %S processed",
CheckFormat<CFL::u, CFM::hh>(nbItems ),
CheckFormat<CFL::f, CFM::_>(nbItems * 100. / totalItems ),
CheckFormat<CFL::S, CFM::_, const BaseString*>(subject ));
The enumerations CFL and CFM and the template function CheckFormat must be defined in a common header file like this (this is an extract; there are around 24 overloads):
enum class CFL
{
c, d, i=d, star=i, u, o=u, x=u, X=u, f, F=f, e=f, E=f, g=f, G=f, p, s, S, P=S, at
};
enum class CFM
{
hh, h, l, z, ll, L=ll, _
};
template<CFL letter, CFM modifier, typename T> inline T CheckFormat(T value) { CFL test= value; (void)test; return value; }
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }
template<> inline const BaseObject* CheckFormat<CFL::at, CFM::_, const BaseObject*>(const BaseObject* value) { return value; }
template<> inline const char* CheckFormat<CFL::s, CFM::_, const char*>(const char* value) { return value; }
template<> inline const void* CheckFormat<CFL::p, CFM::_, const void*>(const void* value) { return value; }
template<> inline char CheckFormat<CFL::c, CFM::_, char>(char value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
template<> inline float CheckFormat<CFL::f, CFM::_, float>(float value) { return value; }
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }
...
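For anyone wondering why this catches mismatches, here is a reduced, self-contained sketch of the mechanism. The enumerations are cut down and BaseObject/BaseString are empty stand-ins, so treat the details as assumptions made for illustration only.

```cpp
struct BaseObject {};               // stand-in for the framework class
struct BaseString : BaseObject {};  // stand-in for the framework class

enum class CFL { c, d, u, f, s, S, at };  // reduced list of conversion letters
enum class CFM { hh, h, l, ll, _ };       // reduced list of length modifiers

// Fallback: any (letter, modifier, type) combination without an explicit
// specialization fails to compile, because a T cannot initialize a CFL.
template<CFL letter, CFM modifier, typename T>
inline T CheckFormat(T value) { CFL test = value; (void)test; return value; }

// Accepted combinations are plain identity functions.
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
template<> inline unsigned char CheckFormat<CFL::u, CFM::hh, unsigned char>(unsigned char value) { return value; }
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }

// T is deduced from the argument, so the specialization is chosen by the
// actual argument type:
//   CheckFormat<CFL::d, CFM::_>(42);      // int with %d: compiles
//   CheckFormat<CFL::d, CFM::_>("text");  // const char* with %d: compile error
```

Passing the third template argument explicitly, as the script does for %S, additionally lets a non-const BaseString* convert to const BaseString* before the check.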
After fixing the compilation errors, it is easy to recover the original form: replace every match of the regular expression CheckFormat<[^<]*>\((.*?) \) with its capture.
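The same substitution can be sketched with std::regex, in case you prefer to stay in C++. The function name stripCheckFormat is an assumption for this example; the pattern is the one given above and relies on the space the script leaves before the closing parenthesis.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes the CheckFormat<...>( ... ) wrappers again,
// keeping only the wrapped argument expression (capture group 1).
inline std::string stripCheckFormat(const std::string& source)
{
    static const std::regex wrapper(R"re(CheckFormat<[^<]*>\((.*?) \))re");
    return std::regex_replace(source, wrapper, "$1");
}
```

For instance, stripCheckFormat("CheckFormat<CFL::u, CFM::hh>(nbItems ),") gives back "nbItems,".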
Upvotes: 2
Reputation: 157334
It's doable, but it's certainly not easy; part of the problem is that BaseString and BaseObject are user-defined types, so you need to define the format specifiers dynamically. Fortunately, gcc at least has support for this, but it would still require patching the compiler.
The magic is in the handle_format_attribute function in gcc/c-family/c-format.c, which calls initialization functions for format specifiers that refer to user-defined types. A good example to base your support on would be the gcc_gfc format type, because it defines a format specifier %L for locus *:
/* This will require a "locus" at runtime. */
{ "L", 0, STD_C89, { T89_V, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "", "R", NULL },
Obviously, though, you'd want to base your format_char_info array on print_char_table, as that defines the standard printf specifiers; gcc_gfc is substantially cut down in comparison.
The patch that added gcc_gfc is http://gcc.gnu.org/ml/fortran/2005-07/msg00018.html; it should be fairly obvious from that patch how and where you'd need to make your additions.
Upvotes: 2