Reputation: 408
I'm trying to implement a token parser that doesn't use C library functions such as strtok(), but I keep running into access violations. I've read several similar questions on here and still haven't got it nailed down. Anyone willing to offer some pointers?
int main(int argc, char* argv[])
{
    int maxTokens = 10;
    char* tokens[10];
    int i;
    for(i = 0; i < maxTokens; i++)
    {
        tokens[i] = NULL;
    }
    char* str = "This,is,a,test,string";
    int result = parseLine(str, ',', tokens, maxTokens);
    printf("%d tokens were found!", result);
    system("PAUSE");
    return 0;
}
int parseLine(char* str, char delimeter, char* tokens[], int maxTokens)
{
    char* srcStr = str;
    int strlen = 0;
    int tokenCount = 0;
    if(srcStr[strlen] != delimeter && srcStr[strlen] != '\0')
    {
        tokens[tokenCount] = (char*) malloc(sizeof(char)*strlen+1);
        tokens[tokenCount] = &srcStr[strlen];
        tokenCount++;
    }
    while(srcStr[strlen] != '\0')
    {
        if(srcStr[strlen] == delimeter)
        {
            tokens[tokenCount-1][strlen] = '\0';
            if(srcStr[strlen+1] != '\0')
            {
                tokens[tokenCount] = (char*) malloc(sizeof(char)*strlen+1);
                tokens[tokenCount] = &srcStr[++strlen];
                tokenCount++;
            }
        }
        else
        {
            strlen++;
        }
    }
    return tokenCount;
}
Upvotes: 2
Views: 1144
Reputation: 27105
"Anyone willing to offer some pointers?"
In all seriousness, though, consider using a debugger (I recommend Visual Studio if you're on Windows) or Valgrind (if you're on Linux) to catch access violations.
Even without reading your code at all, I was able to get the line number where the segfault occurs by running the program under Valgrind:
==8272== Memcheck, a memory error detector
==8272== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==8272== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==8272== Command: ./a.out
==8272==
==8272==
==8272== Process terminating with default action of signal 11 (SIGSEGV)
==8272== Bad permissions for mapped region at address 0x400868
==8272== at 0x40069E: parseLine (asdf.c:22)
==8272== by 0x400790: main (asdf.c:52)
==8272==
==8272== HEAP SUMMARY:
==8272== in use at exit: 1 bytes in 1 blocks
==8272== total heap usage: 1 allocs, 0 frees, 1 bytes allocated
==8272==
==8272== LEAK SUMMARY:
==8272== definitely lost: 1 bytes in 1 blocks
==8272== indirectly lost: 0 bytes in 0 blocks
==8272== possibly lost: 0 bytes in 0 blocks
==8272== still reachable: 0 bytes in 0 blocks
==8272== suppressed: 0 bytes in 0 blocks
==8272== Rerun with --leak-check=full to see details of leaked memory
==8272==
==8272== For counts of detected and suppressed errors, rerun with: -v
==8272== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
Segmentation fault
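In your code, this corresponds to the line tokens[tokenCount-1][strlen] = '\0';. At that point tokens[tokenCount-1] points into the string literal passed in from main, because the malloc result on the line before was overwritten immediately (that is the 1 byte Valgrind reports as lost). String literals usually live in read-only memory, hence "Bad permissions for mapped region".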
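To make the read-only point concrete, here is a minimal sketch of an in-place cut that is only safe on writable memory. The helper name cutAtFirst is made up for illustration and is not part of the question's code.

```c
#include <stddef.h>

/* Hypothetical helper: terminates `s` at the first `delim` by overwriting
   that character with '\0'. `s` must point to writable memory, for example
   a char array or a malloc'd buffer, never a string literal.
   Returns the index that was overwritten, or -1 if `delim` was not found. */
int cutAtFirst(char *s, char delim)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == delim) {
            s[i] = '\0';   /* this write is what faults on a literal */
            return (int)i;
        }
    }
    return -1;
}
```

Called on an array such as char buf[] = "This,is,a,test,string"; it returns 4 and leaves buf holding "This". Called on a char* that points at a literal, the same write is undefined behaviour and typically crashes the way the trace above shows.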
Upvotes: 2
Reputation: 1215
1) The hard way:
char* tokens[10];
int i;
for(i = 0; i < maxTokens; i++)
{
    tokens[i] = NULL;
}
The easy way (the elements you don't list are zero-initialized, so every pointer ends up NULL):
char* tokens[10] = { NULL };
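2) This line doesn't copy the string (it just creates another pointer to the same characters):
char* srcStr = str;
This does (note that inside parseLine you would first have to rename your local int strlen, because it hides the library function of the same name):
char* srcStr = (char*) malloc ( strlen(str) + 1 );
strcpy( srcStr , str );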
3) Don't reinvent the wheel unless you really have to. I learned this the hard way, believe me. Your function has a lot of mistakes in it. If you really want to do this for "educational" purposes or something similar, read up on pointers and strings first.
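If you do want to write it by hand, the points above can be put together roughly like this. It is a sketch, not a drop-in fix: the name parseLineCopy and the decisions to skip empty fields and to return -1 on allocation failure are my assumptions, not something from your code. It copies each token into its own malloc'd buffer instead of writing into the source string, and it uses no string library functions.

```c
#include <stdlib.h>

/* Sketch: splits `str` on `delimiter` and stores up to `maxTokens`
   heap-allocated copies in `tokens`. Empty fields are skipped.
   Returns the number of tokens stored, or -1 if malloc fails (tokens
   stored before the failure stay in the array).
   The caller must free() every stored token. */
int parseLineCopy(const char *str, char delimiter, char *tokens[], int maxTokens)
{
    int tokenCount = 0;
    size_t start = 0;   /* index of the first character of the current token */
    size_t pos = 0;     /* index of the character being examined */

    for (;;) {
        char c = str[pos];
        if (c == delimiter || c == '\0') {
            size_t len = pos - start;
            if (len > 0 && tokenCount < maxTokens) {
                char *copy = malloc(len + 1);   /* +1 for the terminator */
                if (copy == NULL) {
                    return -1;
                }
                for (size_t i = 0; i < len; i++) {
                    copy[i] = str[start + i];
                }
                copy[len] = '\0';
                tokens[tokenCount++] = copy;
            }
            if (c == '\0') {
                break;
            }
            start = pos + 1;   /* next token begins after the delimiter */
        }
        pos++;
    }
    return tokenCount;
}
```

Because the source is only read, passing a string literal such as "This,is,a,test,string" is fine here, and with ',' as the delimiter the call stores five tokens.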
Upvotes: 6