Ryan Barker

Reputation: 113

C Programming - Functionality of strlen

I'm working to try and understand some string functions so I can more effectively use them in later coding projects, so I set up the simple program below:

#include <stdio.h>
#include <string.h>

int main (void)
{
// Declare variables:
char test_string[5];
char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};
int init; 
int length = 0;
int match;

// Initialize array:
for (init = 0; init < strlen(test_string); init++)
{    test_string[init] = '\0';
}

// Fill array:
test_string[0] = 'T';
test_string[1] = 'E';
test_string[2] = 'S';
test_string[3] = 'T';

// Get Length:
length = strlen(test_string);

// Get number of characters from string 1 in string 2:
match = strspn(test_string, test_string2);

printf("\nstrlen return = %d", length);
printf("\nstrspn return = %d\n\n", match);

return 0;
}

I expect to see a return of:

strlen return = 4
strspn return = 4

However, I see strlen return = 6 and strspn return = 4. From what I understand, char test_string[5] should allocate 5 bytes of memory and place hex 00 into the fifth byte. The for loop (which should not even be necessary) should then set all the bytes of memory for test_string to hex 00. Then, the lines immediately following should fill test_string bytes 1 through 4 (or test_string[0] through test_string[3]) with what I have specified. Calling strlen at this point should return 4, because it should start at the address of test_string[0] and count upward until it hits the first null character, which is at test_string[4]. Yet strlen returns 6. Can anyone explain this? Thanks!

Upvotes: 4

Views: 9483

Answers (3)

Keith Thompson

Reputation: 263197

char test_string[5];

test_string is an array of 5 uninitialized char objects.

for (init = 0; init < strlen(test_string); init++)

Kaboom. strlen scans for the first '\0' null character. Since the contents of test_string are garbage, the behavior is undefined. It might return a small value if there happens to be a null character, or a large value or program crash if there don't happen to be any zero bytes in test_string.

Even if that weren't the case, evaluating strlen() in the header of a for loop is inefficient. Each strlen() call has to re-scan the entire string (assuming you've given it a valid string), so if your loop worked it would be O(N^2).
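
As a rough sketch of the usual fix, the length can be computed once before the loop and reused in the condition. The helper name count_char is hypothetical and is not part of the question's program; it only assumes that s points to a valid, null-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts how often ch occurs in the string s.
   strlen is called once, before the loop, so the scan stays O(N)
   instead of re-measuring the string on every iteration. */
size_t count_char(const char *s, char ch)
{
    size_t len = strlen(s);   /* single pass to find the length */
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if (s[i] == ch) {
            count++;
        }
    }
    return count;
}
```

Note that the loop index is a size_t, which matches the return type of strlen and avoids a signed/unsigned comparison.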

If you want test_string to contain just zero bytes, you can initialize it that way:

char test_string[5] = "";

or, since you initialize the first 4 bytes later:

char test_string[5] = "TEST";

or just:

char test_string[] = "TEST";

(The latter lets the compiler figure out that it needs 5 bytes.)

Going back to your declarations:

char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};

This causes test_string2 to be 7 bytes long, without a trailing '\0' character. That means that passing test_string2 to any function that expects a pointer to a string will cause undefined behavior. You probably want something like:

char test_string2[] = "GO_TEST";
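
Putting these suggestions together, a minimal sketch of the two calls from the question might look like the following. The wrapper functions test_length and test_match are hypothetical names added only for illustration; the array names come from the question.

```c
#include <stddef.h>
#include <string.h>

/* Both arrays are initialized from string literals, so each one ends
   with a terminating '\0' and is safe to pass to strlen and strspn. */
static const char test_string[]  = "TEST";     /* 5 bytes, including '\0' */
static const char test_string2[] = "GO_TEST";  /* 8 bytes, including '\0' */

/* Hypothetical wrapper: number of characters before the terminator. */
size_t test_length(void)
{
    return strlen(test_string);
}

/* Hypothetical wrapper: length of the leading run of test_string made
   up only of characters that also appear in test_string2. */
size_t test_match(void)
{
    return strspn(test_string, test_string2);
}
```

With these declarations both functions return 4. Since strlen and strspn return size_t, printing the results directly would call for the %zu conversion rather than %d.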

Upvotes: 7

Graeme Perrow

Reputation: 57238

There are a few problems here. First of all, char test_string[5]; simply sets aside 5 bytes for that string, but does not set the bytes to anything. In particular, when you say "char test_string[5] should allocate 5 bytes of memory and place hex 00 into the fifth byte", the second part is wrong.

Secondly, your array initialization loop uses strlen(test_string), but since the bytes of test_string are uninitialized, there's no way to know what's there, so the result of strlen(test_string) is undefined. A better way to clear the array would be memset( test_string, 0, sizeof(test_string) );.

You fill the array with "TEST" but don't set the null byte ('\0') at the end, so the last byte is still uninitialized. If you do the memset above this will be fixed, or you can manually do test_string[4] = '\0'.
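
A small sketch of that sequence, wrapped in a hypothetical function named fill_and_measure so that it can be called and checked on its own:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clears a 5-byte buffer with memset, writes
   "TEST" into the first four bytes, and returns what strlen reports. */
size_t fill_and_measure(void)
{
    char test_string[5];

    /* sizeof gives the full array size (5), independent of contents. */
    memset(test_string, 0, sizeof(test_string));

    test_string[0] = 'T';
    test_string[1] = 'E';
    test_string[2] = 'S';
    test_string[3] = 'T';
    /* test_string[4] is still '\0' from the memset above. */

    return strlen(test_string);
}
```

Using sizeof(test_string) works here because test_string is a real array in this scope; if the buffer were passed to another function as a pointer, the size would have to be passed along separately.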

Upvotes: 2

Eric Fortin

Reputation: 7603

strlen counts characters until it reaches a '\0' character. In your test_string there is none, so it keeps reading past the end of the array until it finds one, which happens to be 6 bytes from the start of your array, since the array is uninitialized.

The compiler does not generate code to initialize the array so you don't have to pay to run that code if you fill it later.

To initialize it to 0 and skip the loop, you can use

char test_string[5] = {0};

This way, all characters will be initialized to 0 and your strlen will work after you have filled the array with "TEST".
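
To illustrate, here is a short sketch that measures the buffer before and after it is filled. The function name measure_zero_initialized and its two output parameters are hypothetical and exist only to make the effect observable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports strlen of a zero-initialized buffer
   before and after "TEST" is written into it. */
void measure_zero_initialized(size_t *before, size_t *after)
{
    char test_string[5] = {0};       /* all 5 elements start as '\0' */

    *before = strlen(test_string);   /* buffer holds an empty string */

    test_string[0] = 'T';
    test_string[1] = 'E';
    test_string[2] = 'S';
    test_string[3] = 'T';
    /* test_string[4] was never written, so it is still '\0'. */

    *after = strlen(test_string);
}
```

Here *before ends up as 0 and *after as 4, because the untouched fifth element acts as the terminator.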

Upvotes: 4
