Which lookup method is most efficient for simple integer to string lookup

Question

I need to lookup string identifiers in some C code and was thinking about how to code the lookup. The identifiers and strings are fixed at compile time and are not likely to change. I thought that an index into an array of strings would be the most efficient - that is lookup1.

Sometimes in the code the identifier does not start from 0 or there are gaps in the numbering so chose lookup2 for those cases. lookup2 uses a switch statement.

Another option is lookup3 which uses a struct with an integer to string mapping.

Some pros and cons I thought about.

lookup2 is more flexible if identifiers do not start from zero or if there are gaps

If identifiers start from zero and there are no gaps then lookup1 is better? If not then go for lookup2 method?

How about lookup3?

This is legacy code and the defines are already there. For new code, enums would be better for this?

Generally there would be say 5-20 defines in a category. There can be over 100.

Here is the code.

#include 

#define RINGING     0x0
#define DIALING     0x1
#define IDLE        0x2
#define ENGAGED     0x3
#define CONNECTED   0x4

static const char* const lookup1(int id) {
    static const char* const identifiers[] = {
        "RINGING",
        "DIALING",
        "IDLE",
        "ENGAGED",
        "CONNECTED" };

    int size = sizeof(identifiers) / sizeof(identifiers[0]);
    if (id >= 0 && id < size) {
        return identifiers[id];
    }
    return "Unknown identifier";
}


static const char* const lookup2(int id) {
    switch (id) {
    case RINGING: return "RINGING";
    case DIALING: return "DIALING";
    case IDLE: return "IDLE";
    case ENGAGED: return "ENGAGED";
    case CONNECTED: return "CONNECTED";
    default: return "unknown";
    }
}

static const char* const lookup3(int id) {
    struct id2name {
        int id;
        const char* const name;
    };

    static struct id2name pairings[] = {
        { RINGING, "RINGING" },
        { DIALING, "DIALING" },
        { IDLE, "IDLE" },
        { ENGAGED, "ENGAGED" },
        { CONNECTED, "CONNECTED" } };

    int size = sizeof(pairings) / sizeof(pairings[0]);
    if (id >= 0 && id < size) {
        return pairings[id].name;
    }

    return "Unknown identifier";
}


int main() {
    const int identifiers[] = { RINGING, DIALING, IDLE, ENGAGED, CONNECTED };
    const int size = sizeof(identifiers) / sizeof(identifiers[0]);
    for (int i = 0; i < size; ++i) {
        printf("using lookup1 id %d is: %s
", i, lookup1(i));
        printf("using lookup2 id %d is: %s
", i, lookup2(i));
        printf("using lookup3 id %d is: %s
", i, lookup3(i));
    }
}

John Bollinger · Accepted Answer

It's hard to beat a table lookup such as your lookup1() for clarity, concision, and speed. That's not to say, however, that other approaches might not compete at least on speed. For relative performance questions, you really need to benchmark.

If the maximum index number is large or if any of the index numbers are less than zero, or if you cannot rely on at least C99, then a direct array-based table lookup is problematic, but otherwise, gaps between the indexes are not a particular problem, including between the start of the array and the lowest index used. Consider this:

#define INITIALIZER(x) [x] = #x,

const char *lookup4(int x) {
    static const char * const table[] = {
        INITIALIZER(RINGING)
        INITIALIZER(DIALING)
        INITIALIZER(IDLE)
        INITIALIZER(ENGAGED)
        INITIALIZER(CONNECTED)
        // initializers have the form:
        // [MACRO] = "MACRO",
    };

    const char *result = ((x < 0 | x >= (sizeof(table) / sizeof(table[0]))) 
        ? NULL 
        : table[x];

    return result ? result : "unknown";
}

That uses designated initializers (produced by the INITIALIZER() macro) to initialize those elements of the lookup table that correspond to valid strings; the others will be NULL. This ends up being pretty closely analogous to your lookup1().

There's nothing particularly wrong with your lookup2(). It's clean and clear, and I imagine most compilers will produce very efficient code for it.

As lookup3() is presented, however, I don't see any reason to prefer it over any of the others. You don't use the message numbers, so why store them in your struct? Without them, however, you don't need a struct, so you basically have a more complex implementation of lookup1(). If you did use the numbers -- e.g. by performing a search through the array for the requested message number -- then that would be more expensive on average than the other approaches.

Which lookup method is most efficient for simple integer to string lookup

Answers (2)

Related Questions