Reputation: 2846
Assume that we have a C string
text = "0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#<...>#20,25,30"
The goal is to split that string first on # and then on , because I'm after the values in each group between the # characters, and each group holds three values separated by , .
The string text contains 17 groups of 3 numbers each, with , between the numbers and 16 # separators between the groups.
I tried to solve this with the following code.
char *min_max_bias_char;
float min_max_bias_float[3*17]; /* 3 values per each analog input channel */
for(uint8_t i = 0; i <= 16; i++) {
if(i == 0)
min_max_bias_char = strtok(text, DELIMITER);
else
min_max_bias_char = strtok(NULL, DELIMITER);
min_max_bias_float[0 + i*3] = atoff(strtok(min_max_bias_char, ",")); /* Min value */
min_max_bias_float[1 + i*3] = atoff(strtok(NULL, ",")); /* Max value */
min_max_bias_float[2 + i*3] = atoff(strtok(NULL, ",")); /* Bias value */
}
Here I first split the string text on # and then take the resulting token min_max_bias_char and split it on the delimiter , .
This did not work out very well, because as soon as I call strtok(min_max_bias_char, ",") strtok forgets about the outer min_max_bias_char = strtok(NULL, DELIMITER); scan.
Now I got the array min_max_bias_float, which holds the values {0.4,0.1,-4.1,100,200,300,-32.13,23.41,100,<...>,20,25,30}
This is the output. So how can I solve this issue? I'm trying to split a string twice.
Upvotes: 0
Views: 689
Reputation: 1765
You have useful tips in comments, and useful answers already.
Anyway, I would suggest using a state machine. It is a common and perhaps easy way to express this kind of problem.
In this case it is a minimal one, with only 2 states.
Below is a complete C program, after some discussion :)
If I understand correctly, you have a number of fields, 3 doubles in this case, separated by , and forming a group. Each group is surrounded, or at least terminated, by # . The number of groups is not fixed.
It would be good to have a function that takes a line, parses it and returns the values in a useful, ready-to-use form. So first I would look at the data:
typedef struct { double field[3]; } Group;
typedef struct
{
unsigned n_groups; // # of 3-doubles groups
unsigned n_incr; // size of increment block
int n_size; // # of pointers to Group. Error code is <0
Group* g; // the groups
} Set;
The Set contains an array of Group. Each Group has the 3 doubles. The array should be created dynamically, since the number of groups is not known. The array is allocated in blocks of n_incr groups, and the currently allocated size is kept in n_size. Fairly common.
And it seems convenient, since you can iterate over the results with ease, or save them for future reference. See the code to show a set on-screen:
void print_set(Set* set)
{
// n_groups and i are unsigned, so they are printed with %u
printf("set: %u groups:\n", set->n_groups);
for (unsigned i = 0; i < set->n_groups; i += 1)
printf("%3u: %.2f, %.2f, %.2f\n", 1 + i,
set->g[i].field[0],
set->g[i].field[1],
set->g[i].field[2]);
}
That shows, for the line
"0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30",
after parse:
set: 4 groups:
1: 0.40, 0.10, -4.10
2: 100.00, 200.00, 300.00
3: -32.13, 23.41, 100.00
4: 20.00, 25.00, 30.00
int parse(const char*,Set*); // parse string into set
You pass a string and a Set as above and get the parsed values back in the set, with a return code of 0 on success.
To keep things simple, since it is an example, the program uses these functions:
Set* build_set(unsigned);
Set* free_set(Set*);
Set* insert(Group*, Set*); // insert group into set
int parse(const char*,Set*); // parse string into set
void print_set(Set*);
with the (I believe) obvious effects. The parameter of build_set() is the size of the block of groups allocated initially, and of each extension if one is needed. free_set() releases memory in the correct order, insert() inserts a group into the result set, print_set() shows the set on-screen, and parse() is the actual parser.
main() for a test
The example code takes an array of strings and parses them all, using the functions above:
int main(void)
{
// a few tests
const char* test[] = {
"0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30",
"#0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30#",
"1.1,-2.2,3.3",
"#1,2,3,4#", NULL};
// parse all tests
for (int i = 0; test[i] != NULL; i += 1)
{
printf("About to parse \"%s\"\n", test[i]);
Set* values = build_set(10);
int res = parse(test[i], values);
printf("\nparse() returned %d, found %u groups\n",
res, values->n_groups);
print_set(values);
values = free_set(values);
printf("\n\tAnswer set free()'d\n\n");
}; // for()
return 0;
}
The logic is simple: for each line, build a Set, parse the line into it, print the result and free the Set.
You can edit the array test[] and try other strings. Just keep the NULL at the end. The strings in the tests are from your code, in fact, plus an invalid line with 4 doubles at the end. Here is the output:
About to parse "0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30"
parse() returned 0, found 4 groups
set: 4 groups:
1: 0.40, 0.10, -4.10
2: 100.00, 200.00, 300.00
3: -32.13, 23.41, 100.00
4: 20.00, 25.00, 30.00
Answer set free()'d
About to parse "#0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30#"
parse() returned 0, found 4 groups
set: 4 groups:
1: 0.40, 0.10, -4.10
2: 100.00, 200.00, 300.00
3: -32.13, 23.41, 100.00
4: 20.00, 25.00, 30.00
Answer set free()'d
About to parse "1.1,-2.2,3.3"
parse() returned 0, found 1 groups
set: 1 groups:
1: 1.10, -2.20, 3.30
Answer set free()'d
About to parse "#1,2,3,4#"
parse() returned -4, found 0 groups
set: 0 groups:
Answer set free()'d
#define ST_INIT 0
#define ST_INFIELD 1
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct { double field[3]; } Group;
typedef struct
{
unsigned n_groups; // # of 3-doubles groups
unsigned n_incr; // size of increment block
int n_size; // # of pointers to Group. Error code is <0
Group* g; // the groups
} Set;
Set* build_set(unsigned);
Set* free_set(Set*);
Set* insert(Group*, Set*); // insert group into set
int parse(const char*,Set*); // parse string into set
void print_set(Set*);
int main(void)
{
// a few tests
const char* test[] = {
"0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30",
"#0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30#",
"1.1,-2.2,3.3",
"#1,2,3,4#", NULL};
// parse all tests
for (int i = 0; test[i] != NULL; i += 1)
{
printf("About to parse \"%s\"\n", test[i]);
Set* values = build_set(10);
int res = parse(test[i], values);
printf("\nparse() returned %d, found %u groups\n",
res, values->n_groups);
print_set(values);
values = free_set(values);
printf("\n\tAnswer set free()'d\n\n");
}; // for()
return 0;
}
Set* build_set(unsigned block)
{ // block is # of groups
// allocated each time
Set* set = (Set*)malloc(sizeof(Set));
set->n_groups = 0;
set->n_incr = block;
set->n_size = block;
set->g = (Group*)malloc(block * sizeof(Group));
return set;
}
Set* free_set(Set* set)
{
if (set == NULL) return NULL;
free(set->g);
free(set);
return NULL;
};
Set* insert(Group* g, Set* s)
{
// check for need of extension
if (s->n_groups >= (unsigned)s->n_size)
{ // Set if full: adds 1 block
unsigned sz = s->n_size + s->n_incr;
Group* temp = (Group*)realloc( s->g, sz * sizeof(Group));
if (temp == NULL) return NULL;
s->g = temp; // extended
s->n_size = sz;
}; // if()
s->g[s->n_groups].field[0] = g->field[0];
s->g[s->n_groups].field[1] = g->field[1];
s->g[s->n_groups].field[2] = g->field[2];
s->n_groups += 1;
return s;
};
int parse(const char* text, Set* set)
{
if (text == NULL) return -1;
char line[30];
char state = ST_INIT;
unsigned ix = 0;
unsigned i_f = 0; // inside field
unsigned n_f = 0; // # of fields in the group
Group grp;
while (1)
{
switch (state)
{
case ST_INIT:
switch (text[ix])
{
case 0:
return -2; // empty
break;
case ',':
return -30;
break;
case '#': // start at #
state = ST_INFIELD;
break;
default:
line[i_f++] = text[ix];
state = ST_INFIELD;
break;
}; // switch()
ix += 1; // no break: falls through to ST_INFIELD, which is the state set above
case ST_INFIELD:
switch (text[ix])
{
case 0: // end of text: should have 0 or 3 fields
if (i_f == 0) return 0; // normal end
if (n_f != 2) return -3;
line[i_f] = 0; // terminate string
grp.field[n_f] = atof(line);
//printf("Field: %d, from \"%s\" = %f\n", n_f,
// line, grp.field[n_f]);
insert(&grp, set);
return 0;
break;
case ',': // end of field
if (n_f > 1) return -4; // misplaced
// must have 3 fields
line[i_f] = 0;
grp.field[n_f] = atof(line);
//printf("Field: %d, from \"%s\" = %f\n", n_f,
// line, grp.field[n_f]);
n_f += 1;
i_f = 0;
if (n_f == 3)
{
insert(&grp, set);
n_f = 0;
i_f = 0;
}
break;
case '#': // group terminator #
if (n_f != 2) return -5; // must have 3 fields
line[i_f] = 0; // terminate string
grp.field[n_f] = atof(line);
//printf("Field: %d, from \"%s\" = %f\n", n_f,
// line, grp.field[n_f]);
n_f += 1;
i_f = 0;
if (n_f == 3)
{
n_f = 0;
i_f = 0;
insert(&grp, set);
}
break;
default:
if (i_f >= sizeof(line) - 1) return -6; // added check: field too long for line[]
line[i_f++] = text[ix];
break;
}; // switch()
ix += 1;
}; // switch()
}; // while()
return 0;
}
void print_set(Set* set)
{
// n_groups and i are unsigned, so they are printed with %u
printf("set: %u groups:\n", set->n_groups);
for (unsigned i = 0; i < set->n_groups; i += 1)
printf("%3u: %.2f, %.2f, %.2f\n", 1 + i,
set->g[i].field[0],
set->g[i].field[1],
set->g[i].field[2]);
}
/*
https://stackoverflow.com/questions/68584131/
how-can-i-split-a-c-string-twice-with-strtok-in-c
*/
Upvotes: 1
Reputation: 56965
strtok accepts multiple delimiters, and since your data structure seems to not care whether the current delimiter is a ',' or a '#' character (in other words, you're not building a 2d structure requiring nested looping), you can just provide a delimiter string and make one call to strtok in the loop.
Here's a minimal example you can adapt to your environment:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char delimiters[] = "#,";
char text[] = "0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30";
int size = 3 * 4; // or 3 * 17;
float res[size];
res[0] = atof(strtok(text, delimiters));
for (int i = 1; i < size; i++) {
res[i] = atof(strtok(NULL, delimiters));
}
for (int i = 0; i < size; i++) {
printf("%.2f ", res[i]);
}
puts("");
return 0;
}
Output:
0.40 0.10 -4.10 100.00 200.00 300.00 -32.13 23.41 100.00 20.00 25.00 30.00
It's a good idea to check the return value of strtok in the above code.
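As a rough sketch of what such a check could look like: parse_flat below is a hypothetical helper, not part of the code above. It also swaps atof for strtof, which is my own choice, so that a malformed token can be detected through the end pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): splits `text` in
   place on '#' and ',' and stores at most `max` floats in `out`.
   Returns the number of values stored, or -1 if a token is not a number. */
int parse_flat(char *text, float *out, int max)
{
    assert(text != NULL && out != NULL);
    int count = 0;
    /* strtok() returns NULL once no tokens are left, which ends the loop */
    for (char *tok = strtok(text, "#,"); tok != NULL && count < max;
         tok = strtok(NULL, "#,")) {
        char *end;
        float value = strtof(tok, &end);
        if (end == tok || *end != '\0') {
            return -1; /* nothing parsed, or trailing junk such as "2x" */
        }
        out[count++] = value;
    }
    return count;
}
```

The caller can then compare the return value with the expected count (3 * 4 or 3 * 17 here) instead of assuming every token exists.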
If you want to avoid strtok (there are good reasons to), there's strtok_r, or you can write it by hand with a loop:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char delimiters[] = "#,";
char text[] = "0.4,0.1,-4.1#100,200,300#-32.13,23.41,100#20,25,30";
int size = 3 * 4; // or 3 * 17;
float res[size];
int res_size = 0;
int last_index = 0;
for (int i = 0, len = strlen(text); i <= len; i++) { // i == len reaches the '\0', which strchr also matches, so the last token is stored too
if (!strchr(delimiters, text[i])) {
continue;
}
else if (i - last_index >= 32 || res_size >= size) {
fprintf(stderr, "buffer size exceeded\n");
return 1;
}
char buf[32] = {0};
strncpy(buf, text + last_index, i - last_index);
res[res_size++] = atof(buf);
last_index = i + 1;
}
for (int i = 0; i < res_size; i++) {
printf("%.2f ", res[i]);
}
puts("");
return 0;
}
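Another hand-written variant, sketched under the assumption that the input has no leading separator and exactly one separator between numbers: strtof reports where it stopped, so the string can be walked without copying or modifying it. parse_no_copy is a hypothetical name, not something from the code above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration): reads numbers
   separated by single '#' or ',' characters without modifying `text`.
   Returns the number of values stored in `out`, or -1 on malformed input
   or when more than `max` values are present. */
int parse_no_copy(const char *text, float *out, int max)
{
    assert(text != NULL && out != NULL);
    int count = 0;
    const char *p = text;
    while (*p != '\0') {
        char *end;
        float value = strtof(p, &end);
        if (end == p || count >= max) {
            return -1; /* no number at this position, or out[] is full */
        }
        out[count++] = value;
        p = end;
        if (*p == '#' || *p == ',') {
            p++; /* skip exactly one separator */
        } else if (*p != '\0') {
            return -1; /* unexpected character after the number */
        }
    }
    return count;
}
```

Because the input is never written to, this version also works on string literals and const buffers, which strtok cannot handle.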
Upvotes: 2
Reputation: 2846
This works, thanks to strtok_r:
/* Collect */
char *min_max_bias_char;
char *text_pointer = text;
float min_max_bias_float[3*17]; /* 3 values per each analog input channel */
for(uint8_t i = 0; i <= 16; i++) {
if(i == 0)
min_max_bias_char = strtok_r(text, DELIMITER, &text_pointer);
else
min_max_bias_char = strtok_r(NULL, DELIMITER, &text_pointer);
min_max_bias_float[0 + i*3] = atoff(strtok(min_max_bias_char, ",")); /* Min value */
min_max_bias_float[1 + i*3] = atoff(strtok(NULL, ",")); /* Max value */
min_max_bias_float[2 + i*3] = atoff(strtok(NULL, ",")); /* Bias value */
}
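Note that the inner calls above still use plain strtok. That is fine here because the outer scan keeps its position in text_pointer. For comparison, here is a sketch that uses strtok_r at both levels with two separate save pointers and checks that every group has exactly three values. parse_groups and its parameters are hypothetical names, and strtok_r is POSIX rather than ISO C, so it may not exist on every toolchain.

```c
#define _POSIX_C_SOURCE 200809L /* strtok_r() is POSIX, not ISO C */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): splits `text` in
   place into '#'-separated groups of exactly three ','-separated numbers,
   stored as out[3*g + 0..2].
   Returns the number of groups, or -1 if a group does not have exactly
   three values or more than `max_groups` groups are present. */
int parse_groups(char *text, float *out, int max_groups)
{
    assert(text != NULL && out != NULL);
    int groups = 0;
    char *outer_state; /* position of the '#' scan */
    char *inner_state; /* position of the ',' scan */
    for (char *grp = strtok_r(text, "#", &outer_state); grp != NULL;
         grp = strtok_r(NULL, "#", &outer_state)) {
        if (groups >= max_groups) {
            return -1;
        }
        int fields = 0;
        for (char *tok = strtok_r(grp, ",", &inner_state); tok != NULL;
             tok = strtok_r(NULL, ",", &inner_state)) {
            if (fields >= 3) {
                return -1; /* a fourth value in this group */
            }
            out[3 * groups + fields++] = strtof(tok, NULL);
        }
        if (fields != 3) {
            return -1; /* fewer than three values in this group */
        }
        groups++;
    }
    return groups;
}
```

With max_groups set to 17 and an out array of 3*17 floats, this would fill the same layout as min_max_bias_float.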
Upvotes: 0
Reputation: 781096
You don't need nested uses of strtok(). Just alternate your delimiters: 2 commas followed by 1 hash each time through the main loop.
char *curptr = text;
for(uint8_t i = 0; i < 17; i++) {
min_max_bias_float[0 + i*3] = atoff(strtok(curptr, ","));
min_max_bias_float[1 + i*3] = atoff(strtok(NULL, ","));
min_max_bias_float[2 + i*3] = atoff(strtok(NULL, DELIMITER));
curptr = NULL; // so subsequent loops will continue using the same string
}
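The loop above assumes every strtok call finds a token. If the input may be shorter than expected, the same alternating-delimiter idea can be sketched with NULL checks. parse_alternating is a hypothetical helper, and strtof stands in for atoff as an assumption, since atoff is not available everywhere.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): reads groups of
   "min,max,bias" separated by '#', alternating the strtok() delimiter.
   Returns the number of complete groups stored (at most `max_groups`). */
int parse_alternating(char *text, float *out, int max_groups)
{
    assert(text != NULL && out != NULL);
    static const char *const delim[3] = { ",", ",", "#" };
    char *cur = text; /* non-NULL only for the very first strtok() call */
    int groups = 0;
    while (groups < max_groups) {
        float tmp[3];
        int k;
        for (k = 0; k < 3; k++) {
            char *tok = strtok(cur, delim[k]);
            cur = NULL; /* later calls continue in the same string */
            if (tok == NULL) {
                break; /* ran out of input */
            }
            tmp[k] = strtof(tok, NULL);
        }
        if (k < 3) {
            break; /* incomplete group: stop without storing it */
        }
        memcpy(&out[3 * groups], tmp, sizeof tmp);
        groups++;
    }
    return groups;
}
```

One limitation of this sketch: a group with a fourth value, such as 1,2,3,4, is not reported as an error, because the third token is cut only at the next # and strtof stops reading at the comma.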
Upvotes: 2