user2891462
user2891462

Reputation: 3333

Format strings safely when vsnprintf is not available

I am writing code that needs to format a string, and I want to avoid buffer overruns.

I know that if vsnprintf is available (C99 onwards) we can do:

char* formatString(const char *format, ...)
{
    char* result = NULL;
    va_list ap;
    va_start(ap, format);

    /* Get the size of the formatted string by getting vsnprintf return the
     * number of remaining characters if we ask it to write 0 characters */
    int size = vsnprintf(NULL, 0, format, ap);

    if (size > 0)
    {
        /* String formatted just fine */
        result = (char *) calloc(size + 1, sizeof(char));
        vsnprintf(result, size + 1, format, ap);
    }

    va_end(ap);
    return result;
}

I can't figure out a way of doing something similar in C90 (without vsnprintf). If it turns out to not be possible without writing extremely complex logic I'd be happy to set a maximum length for the result, but I'm not sure how that could be achieved either without risking a buffer overrun.

Upvotes: 4

Views: 2440

Answers (2)

Jonathan Leffler
Jonathan Leffler

Reputation: 753385

Transferring comments to answer.

The main reason vsnprintf() was added to C99 was that it is hard to protect vsprintf() or similar. One workaround is to open /dev/null, use vfprintf() to format the data to it, note how big a result was needed, and then decide whether it is safe to proceed. Icky, especially if you open the device on each call.

That means your code might become:

#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

extern char *formatString(const char *format, ...);

char *formatString(const char *format, ...)
{
    static FILE *fp_null = NULL;

    if (fp_null == NULL)
    {
        fp_null = fopen("/dev/null", "w");
        if (fp_null == NULL)
            return NULL;
    }

    va_list ap;

    va_start(ap, format);
    int size = vfprintf(fp_null, format, ap);
    va_end(ap);

    if (size < 0)
        return NULL;

    char *result = (char *) malloc(size + 1);
    if (result == NULL)
        return NULL;

    va_start(ap, format);
    int check = vsprintf(result, format, ap);
    va_end(ap);

    assert(check == size);

    return result;
}

int main(void)
{
    char *r1 = formatString("%d Dancing Pigs = %4.2f%% of annual GDP (grandiose dancing pigs!)\n",
                            34241562, 21.2963);
    char *r2 = formatString("%s [%-13.10s] %s is %d%% %s\n", "Peripheral",
                            "sub-atomic hyperdrive", "status", 99, "of normality");

    if (r1 != NULL)
        printf("r1 = %s", r1);

    if (r2 != NULL)
        printf("r2 = %s", r2);

    free(r1);
    free(r2);
    return 0;
}

As written with fp_null a static variable inside the function, the file stream cannot be closed. If that's a bother, make it a variable inside the file and provide a function to if (fp_null != NULL) { fclose(fp_null); fp_null = NULL; }.

I'm unapologetically assuming a Unix-like environment with /dev/null; you can translate that to NUL: if you're working on Windows.

Note that the original code in the question did not use va_start() and va_end() twice (unlike this code); that would lead to disaster. In my opinion, it is a good idea to put the va_end() as soon after the va_start() as possible — as shown in this code. Clearly, if your function is itself stepping through the va_list, then there will be a bigger gap than shown here, but when you're simply relaying the variable arguments to another function as here, there should be just the one line in between.

The code compiles cleanly on a Mac running macOS 10.14 Mojave using GCC 8.2.0 (compiled on macOS 10.13 High Sierra) with the command line:

$ gcc -O3 -g -std=c90 -Wall -Wextra -Werror -Wmissing-prototypes \
>     -Wstrict-prototypes vsnp37.c -o vsnp37
$

When run, it produces:

r1 = 34241562 Dancing Pigs = 21.30% of annual GDP (grandiose dancing pigs!)
r2 = Peripheral [sub-atomic   ] status is 99% of normality

Upvotes: 2

chux
chux

Reputation: 153283

Pre-C99 affords no simply solution to format strings with a high degree of safety of preventing buffer overruns.

It is those pesky "%s", "%[]", "%f" format specifiers that require so much careful consideration with their potential long output. Thus the need for such a function. @Jonathan Leffler

To do so with those early compilers obliges code to analyze format and the arguments to find the required size. At that point, code is nearly there to making you own complete my_vsnprintf(). I'd seek existing solutions for that. @user694733.


Even with C99, there are environmental limits for *printf().

The number of characters that can be produced by any single conversion shall be at least 4095. C11dr §7.21.6.1 15

So any code that tries to char buf[10000]; snprintf(buf, sizeof buf, "%s", long_string); risks problems even with a sufficient buf[] yet with strlen(long_string) > 4095.

This implies that a quick and dirty code could count the % and the format length and make the reasonable assumption that the size needed does not exceed:

size_t sz = 4095*percent_count + strlen(format) + 1;

Of course further analysis of the specifiers could lead to a more conservative sz. Continuing down this path we end at writing our own my_vsnprintf().


Even with your own my_vsnprintf() the safety is only so good. There is no run-time check that the format (which may be dynamic) matches the following arguments. To do so requires a new approach.

Cheeky self advertisement for a C99 solution to insure matching specifiers and arguments: Formatted print without the need to specify type matching specifiers using _Generic.

Upvotes: 2

Related Questions