Reputation: 1442
I am trying to split a char array in C using strtok. This works at the moment, but I have now realised that when there are two consecutive delimiters, the whole scheme gets offset.
I am parsing the char array into a structure (I cannot post the exact code because it is for an assignment, but I will post similar code with the assignment specifics changed) based on their index, e.g.:
struct test_struct {
    char *index_1;   /* each field holds a pointer to one token */
    char *index_2;
    char *index_3;
    char *index_4;
    char *index_5;
} test_struct;
I use a counter to populate this information: every time a delimiter is reached, I increment the counter and assign the data to the corresponding index, e.g.:
char c_array[50] = "hello,this,is,an,example";
int counter = 0;
char *token = strtok(c_array, ",");
while (token != NULL) {
    switch (counter) {
    case 0:
        test_struct.index_1 = token;
        break;
    case 1:
        test_struct.index_2 = token;
        break;
    // repeat this step for the other indexes
    }
    counter++;
    token = strtok(NULL, ",");
}
I know a switch statement is probably a poor design choice in this situation, but aside from that, can somebody help me find a solution to this problem:
The problem is that when a char array (basically a C string) contains consecutive delimiters, strtok "skips" that index, throwing everything out of line. Take the above example:
If the char array is formatted properly, then when the last case is hit, the token is the 5th "split string"; so for the above example, when counter == 4, test_struct.index_5 will have the value "example".
Now, given the above code, if c_array[50] = "hello,this,,an,example", data is missing from the array and this messes up the indexing: strtok "skips" the next index because ",," has no string in between, so instead of the intended behaviour I get this:
test_struct.index_1 = "hello"
test_struct.index_2 = "this"
test_struct.index_3 = "an"
test_struct.index_4 = "example"
test_struct.index_5 = "example"
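The offset can be reproduced in isolation, because strtok treats a run of consecutive delimiters as a single separator and never returns an empty token. The following is only a minimal sketch, not the assignment code; `count_tokens` is a hypothetical helper introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Counts the tokens strtok() yields for `line`, split on `delims`.
 * Hypothetical helper for illustration only; it modifies `line` in place,
 * so it must be given a writable array, not a string literal. */
static size_t count_tokens(char *line, const char *delims)
{
    size_t count = 0;
    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;
    return count;
}
```

Called on a writable copy of "hello,this,is,an,example" with "," as the delimiter, this returns 5; called on "hello,this,,an,example" it returns 4, which is why every field after the empty one lands one index too early.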
So is there a way to detect an empty field ("") and set the token to a default value, e.g. "missing data", so that I can at least handle it separately after I have read my data into the correct indexes?
I hope you understand what I mean.
Cheers, Chris.
Upvotes: 0
Views: 1566
Reputation: 755026
NB: this code still modifies the input string, but recognizes empty tokens quite happily.
#include <stdio.h>
#include <string.h>

static void split(char *string)
{
    enum { MAX_STRINGS = 5 };
    struct test_struct
    {
        char *index[MAX_STRINGS];
    } test_struct;

    printf("Splitting: [%s]\n", string);
    int i = 0;
    char *bgn = string;
    char *end;
    while (i < MAX_STRINGS && (end = strpbrk(bgn, ",")) != 0)
    {
        test_struct.index[i++] = bgn;
        *end = '\0';
        bgn = end + 1;
    }
    if (i >= MAX_STRINGS)
        fprintf(stderr, "Too many strings!\n");
    else
        test_struct.index[i++] = bgn;
    for (int j = 0; j < i; j++)
        printf("index[%d] = [%s]\n", j, test_struct.index[j]);
}

int main(void)
{
    char c_array[][30] =
    {
        "hello,this,is,an,example",
        "hello,this,,an,example",
        "hello,,bad,,example,input",
        "hello,world",
        ",,,,",
        ",,",
        "",
    };
    enum { C_SIZE = sizeof(c_array) / sizeof(c_array[0]) };
    for (int i = 0; i < C_SIZE; i++)
        split(c_array[i]);
    return 0;
}
Splitting: [hello,this,is,an,example]
index[0] = [hello]
index[1] = [this]
index[2] = [is]
index[3] = [an]
index[4] = [example]
Splitting: [hello,this,,an,example]
index[0] = [hello]
index[1] = [this]
index[2] = []
index[3] = [an]
index[4] = [example]
Splitting: [hello,,bad,,example,input]
Too many strings!
index[0] = [hello]
index[1] = []
index[2] = [bad]
index[3] = []
index[4] = [example]
Splitting: [hello,world]
index[0] = [hello]
index[1] = [world]
Splitting: [,,,,]
index[0] = []
index[1] = []
index[2] = []
index[3] = []
index[4] = []
Splitting: [,,]
index[0] = []
index[1] = []
index[2] = []
Splitting: []
index[0] = []
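The question also asked for a default value such as "missing data" in place of an empty field. One way to get that, sketched here under assumptions, is to test each field for emptiness as it is stored. The names `split_with_default` and `MISSING` are hypothetical and not part of the code above; this variant uses strchr with a single delimiter character and, like the code above, modifies the input string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical placeholder text; the question suggested "missing data". */
static const char MISSING[] = "missing data";

/* Splits `line` in place on `delim`, storing up to `max` field pointers
 * in `fields`. An empty field is stored as a pointer to MISSING instead
 * of "". Returns the number of fields stored; fields beyond `max` are
 * not stored. */
static size_t split_with_default(char *line, char delim,
                                 const char *fields[], size_t max)
{
    size_t n = 0;
    char *bgn = line;
    while (n < max) {
        char *end = strchr(bgn, delim);
        if (end != NULL)
            *end = '\0';            /* terminate the current field */
        fields[n++] = (*bgn != '\0') ? bgn : MISSING;
        if (end == NULL)
            break;                  /* that was the last field */
        bgn = end + 1;
    }
    return n;
}
```

With a writable copy of "hello,this,,an,example" and room for 5 fields, this stores 5 pointers, and the third one points at "missing data" while the others keep their original positions. Comparing a stored pointer against `MISSING` afterwards distinguishes a substituted field from real data.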
Upvotes: 2