Tim Haggard

Reputation: 105

sscanf variable length string parsing

I have a problem that is probably fairly common and likely has a beautiful hack that I am not aware of. I would greatly appreciate it if someone would enlighten me!

I am using C's sscanf() function to parse input and the format is "%d %d %d %s %d %s %d ..." where the first two %d are random ID integers (insignificant) for the string and the third is a count of the number of %d %s combinations to follow.

For instance, "12 34 2 3 yes 2 no" could be a string, where 12 and 34 are random IDs (unimportant to the problem) and 2 specifies that two combinations follow: '3 yes' and '2 no'. The 3 preceding 'yes' specifies the length of the string that follows it, and the same is true of the 2 before 'no'. There can be a variable number of these combinations, and we want to catch them all with sscanf.

Does anyone know of any way to do this with sscanf?

Thanks a lot!

Upvotes: 4

Views: 5576

Answers (4)

nneonneo

Reputation: 179422

Just parse the string in two (or more) passes. This uses the %n format specifier to write the number of bytes processed, so we know where to pick up in subsequent passes:

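#include <stdio.h>

int main(void) {
    int a, b, n, pos;
    const char *buf = "12 34 2 3 yes 2 no";

    /* Call sscanf outside assert(): with NDEBUG defined, assert() and its argument are compiled out. */
    /* %n stores the number of bytes consumed so far and is not counted in the return value. */
    if (sscanf(buf, "%d %d %d %n", &a, &b, &n, &pos) != 3)
        return 1;
    for (int i = 0; i < n; i++) {
        int cur;
        int x;
        char y[20];

        if (sscanf(buf + pos, "%d %19s %n", &x, y, &cur) != 2)
            return 1;
        printf("%d %s\n", x, y);
        pos += cur;   /* resume where this pass stopped */
    }
    return 0;
}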

outputs

3 yes
2 no
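Building on the same %n loop, here is a minimal sketch of how the pairs could be stored and checked against their length prefixes, as the question describes. The names `struct item`, `parse_items`, and `max_items`, and the 20-byte buffer, are assumptions made for illustration; they are not part of the original answer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type: one "length word" pair. */
struct item { int len; char text[20]; };

/* Parses "id id count (len word)*" from buf.
   Returns the number of items stored, or -1 if the line is malformed
   or a word's actual length disagrees with its length prefix. */
int parse_items(const char *buf, struct item *out, int max_items) {
    int id1, id2, count, pos = 0;

    /* First pass: the three header integers; %n records where to resume. */
    if (sscanf(buf, "%d %d %d %n", &id1, &id2, &count, &pos) != 3)
        return -1;
    if (count < 0 || count > max_items)
        return -1;

    /* Later passes: one "len word" pair per sscanf call. */
    for (int i = 0; i < count; i++) {
        int used = 0;
        if (sscanf(buf + pos, "%d %19s %n", &out[i].len, out[i].text, &used) != 2)
            return -1;
        if ((int)strlen(out[i].text) != out[i].len)
            return -1;
        pos += used;
    }
    return count;
}
```

Called with a caller-supplied array such as `struct item items[4]`, this would return 2 for the example line and -1 for a line like "12 34 2 3 yes 5 no", where the prefix 5 does not match the length of "no".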

Upvotes: 4

user1896333

Reputation: 1

Is there a maximum value for the number of %d %s pairs that follow the initial "%d %d %d" header? There must be since you're putting the values somewhere, perhaps in a struct { int i; char s[j]; } a[n]; with appropriate j and n sizes.

(BTW, your example uses "%d %d %d %s %d %s %d ..." when the description indicates it should be "%d %d %d %d %s %d %s %d %s ...", 3 decimals, followed by decimal/string pairs)

If there is a maximum, just create a maximized template and test the return value from sscanf, which is the number of input items successfully matched and assigned. It should equal the 3 header items plus two items for each pair counted by the 3rd value. If the return value doesn't agree with the 3rd int, report a malformed line.

I once made something like this into a function, so the a[n] array was automatic on the stack; the function then allocated a linked list from the heap for the items read by sscanf and returned a pointer to the first item.
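As a rough illustration of the maximized-template idea, the sketch below assumes a hypothetical upper bound of three pairs. `MAX_PAIRS`, `struct pair`, and `parse_fixed` are invented names, and the template has to be written out by hand to match the bound.

```c
#include <stdio.h>

#define MAX_PAIRS 3   /* assumed upper bound; the template below must match it */

/* Hypothetical record type for one "%d %s" pair. */
struct pair { int len; char s[20]; };

/* Parses one line with a single sscanf call using the maximal template.
   Returns the number of pairs, or -1 if the line is malformed. */
int parse_fixed(const char *line, struct pair out[MAX_PAIRS]) {
    int id1, id2, count = 0;

    int got = sscanf(line, "%d %d %d %d %19s %d %19s %d %19s",
                     &id1, &id2, &count,
                     &out[0].len, out[0].s,
                     &out[1].len, out[1].s,
                     &out[2].len, out[2].s);

    /* The header must be complete and the count within the template's range. */
    if (got < 3 || count < 0 || count > MAX_PAIRS)
        return -1;
    /* 3 header items plus 2 items per pair must have been assigned. */
    if (got != 3 + 2 * count)
        return -1;
    return count;
}
```

The strict `!=` comparison also rejects lines that carry more pairs than the count announces; relaxing it to `<` would tolerate trailing data instead.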

Upvotes: 0

Sheng

Reputation: 3555

First of all, ssprintf() is used to generate a string, not to parse one. You should use sscanf. I do not know how to do it in a single sscanf() call, but you can do it in a loop:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void){
    int choice_id, count;
    char choice[20],id_1[20],id_2[20], count_str[20], choice_id_str[20];
    int index;
    const char *input = "12 34 2 3 yes 2 no";

    /* Field widths keep each %s within its 20-byte buffer. */
    if (sscanf(input, "%19s %19s %19s", id_1, id_2, count_str) != 3)
        return 1;
    /* Advancing by strlen() assumes the fields are separated by exactly one space. */
    input += strlen(id_1)+strlen(id_2)+strlen(count_str)+2;
    count = atoi(count_str);
    for(index = 0; index< count; ++index){
        if (sscanf(input, " %19s %19s", choice_id_str, choice) != 2)
            return 1;
        choice_id = atoi(choice_id_str);
        // Process or store the record
        printf("%d: %s\n",choice_id, choice);
        input += strlen(choice_id_str) + strlen(choice) + 2;
    }
    return 0;
}

Compiled with gcc (GCC) 4.1.2 and run on Linux, the output is:

-bash-3.2$ ./a.out
3: yes
2: no

Upvotes: 0

Charles Salvia

Reputation: 53289

There's no convenient way to do this with just sscanf. You'd need to dynamically generate the format string itself, before passing it to sscanf.

You might want to consider writing a specialized parsing routine for this where you call sscanf in a loop, or preferably (since you specify the C++ tag) using a std::istringstream in a loop.

Upvotes: 1
