user2209254
user2209254

Reputation: 63

Storing key-value pairs in plain C

I'm trying to find a way of storing "key, value" pairs in C in an efficient manner for quick data retrieval. I've been looking online and there doesn't seem to be a quick and easy way of storing them such as in Java. I'll need to be able to access and update the value frequently and also being able to add new keys in and sort them into order. I've read about using qsort() and bsearch() to accomplish those, but I'm not sure what data structure to use to store it all.

Upvotes: 5

Views: 8725

Answers (4)

Paul Schutte
Paul Schutte

Reputation: 386

A very fast and memory efficient way is to use Judy arrays. It is easy to use as long as you are not scared of pointers.

http://judy.sourceforge.net/

It is licensed under the LGPL

Can be installed on Debian/Ubuntu with: sudo apt-get install libjudy-dev

One caveat is that a word is the length of the native CPU word. This makes it fast, but might have portability implications between 32/64 bit machines when using Judy1 or JudyL.

The following types are available:

Judy1  - maps an Index (word) to a bit
JudyL  - maps an Index (word) to a Value (word/pointer)
JudySL - maps an Index (null terminated string) to a Value
JudyHS - maps an Index (array-of-bytes) of Length to a Value

Sample code using strings as keys (JudySL):

#include <stdio.h>
#include <Judy.h>

#define DIE(x) { fprintf(stderr,"%s\n",x); exit(-1); }

int main() {

    Pvoid_t   PJArray = (PWord_t)NULL;  // Judy array.
    PWord_t   PValue;                   // Judy array element.
    Word_t    Bytes;                    // size of JudySL array.

    uint8_t   key[100]; //max len for key is 100
    const char *value1="Value One";
    const char *value2="Value Two";

    JSLI(PValue, PJArray, "key1");          // Insert key
    if (PValue == PJERR) DIE("Out of memory\n");
    *PValue=(Word_t)value1;                 // Set pointer to value

    JSLI(PValue, PJArray, "key2");          // Insert key
    if (PValue == PJERR) DIE("Out of memory\n");
    *PValue=(Word_t)value2;                 // Set pointer to value

    key[0]='\0';                            // start with smallest string.

    JSLF(PValue, PJArray, key);             // get first key/value
    while (PValue != NULL) {
            printf("key=%s, value=%s\n",key,(char*)*PValue);
            JSLN(PValue, PJArray, key);     // get next key/value
    }

    JSLG(PValue, PJArray, "key2");             // lookup a key
    printf("key2:%s\n",(char*)*PValue);

    JSLFA(Bytes, PJArray);              // free array

    return 0;
}

compile with: gcc judy_sample.c -o judy_sample -lJudy

Upvotes: 0

Daniel Glasser
Daniel Glasser

Reputation: 161

I realize this is an old thread, but I may have something to contribute that might be useful to others looking for a solution that isn't very complex.

I've done this several times in different ways. How it's done depends on a few factors:

  1. Do you know the maximum number of key/value pairs you will need to track?
  2. Are all the values of the same type?
  3. How fast does this need to be?

If the answers to 1 and 2 are "yes", then it can be quite straight forward. When the answer to 3 is "doesn't matter that much", or when the maximum number of pairs is not too high, I use arrays or blocks of dynamically allocated memory treated as arrays.

In this scheme, there are two arrays: - an Index array (not the key) - an array of key/value pair structs containing the key name and the value

You also have a structure that tracks the key/value list that contains (minimally) pointers to the index and key/value structure arrays, the number of key/value pairs currently defined, and the maximum number of key/value pairs that can be stored.

Initially, the count of key/value pairs is 0, every array element in the index array contains an initial value (can be zero, but is usually something that indicates it's not in use, like -1), and all of the elements of the key/value pair struct array are zeroed out (no name, no value).

The index array is maintained so that the index values reference the key/value pair structures in the other array in the correct order. Insertions and deletions do not move any existing pair structures around, just the indexes. When a key/value pair is deleted, zero out the struct that contained it.

When using "qsort()" or its brethren, your compare function uses the indexes in the index array to access the names of the corresponding key/value pairs, and your exchange function swaps the index values in the index array. Insert does an overlapped in-place copy (from end to insertion point) to shuffle the indexes of the keys that come after the new key down one position in the index array, and deletion does a similar upward shuffle to close the gap where the deleted key was.

A slightly faster version of this that uses no more memory for storage uses a C union to allow a forward chain index to be stored in unused key/value pair elements, and initialization chains them all together with a "next free" index in the list context. This prevents having to search through the list for free elements when inserting a new pair. When you need a free key/value pair object, use the index stored in "next free" for the new element, and set "next free" to the stored chain index in the free object just claimed. When you discard a pair, simply copy the "next free" value into the chain index in the freed object and set the index of the freed object as the new value of "next free".

The index array may also be implemented using pointers to the key/value structures in memory. In this case, the "next free" and chain links in the free object list become pointers as well.

The above scheme works well for small key/value set sizes and simple value types.

Upvotes: 3

CuriousSid
CuriousSid

Reputation: 544

As Baltasarq said, C doesn't have a data structure for this purpose. However you may use an implementation based on struct's that must support: initialisation, get, add and delete operations. Some good designs are proposed here.

Upvotes: 1

Baltasarq
Baltasarq

Reputation: 12212

You are looking for an associative container. There is no "direct" way in C, since the standard library does not provide any data structure. You can try to look for a third party library that provides the functionality, or roll your own solution.

Upvotes: 3

Related Questions