RodolfoAP

Reputation: 885

String array length C++ issue?

I've understood that string arrays end with a '\0' symbol. So, the following code should print 0, 1, 2 and 3. (Notice I'm using a range-based for() loop).

$ cat app.cpp
    #include <iostream>
    int main(){
        char s[]="0123\0abc";
        for(char c: s) std::cerr<<"-->"<<c<<std::endl;
        return 0;
    }

But it does print the whole array, including '\0's.

$ ./app
-->0
-->1
-->2
-->3
-->
-->a
-->b
-->c
-->

$ _

What is happening here? Why is the string not considered to end with '\0'? Do C++ collections consider (I imagine C++11) strings differently than in classical C++?

Moreover, the number of characters in "0123\0abc" is 8. Notice the printout makes 9 lines!

(I know that std::cout<< runs fine, as well as strlen(), as well as for(int i=0; s[i]; i++), etc. I know about the end terminator, that's not the question!)

Upvotes: 1

Views: 266

Answers (4)

Aconcagua

Reputation: 25518

Be aware that a char does not necessarily hold a character; it can store any arbitrary 8-bit value. (On some machines char is wider; I have already encountered one with a 16-bit char, and then there is no int8_t available.) For plain numeric data, signed char or unsigned char, according to specific needs, should be preferred, as the signedness of char is implementation-defined. Even better are int8_t or uint8_t from the cstdint header, provided they are available.

So your string literal actually is just an array of nine integral values (just as if you had created an int array, only the element type usually is narrower). A range-based for loop iterates over all nine of these 8-bit integers, which gives the output in your example.

These integral values only get a special meaning in specific contexts (functions), such as printf, puts or even operator>>, where they are then interpreted as characters. When used as C-strings, a 0 value inside such an array marks the end of the string – but this 0-character still is part of that string. For illustration: puts might look like this:

int puts(char const* str)
{
    while(*str) // loops until the null character is encountered
    {
        char c = *str;

        // + get pixel representation of c for console, e. g 'B' for 66
        // + print this pixel representation to current console position
        // + advance by one position on console

        ++str;
    }
    return 0; // non-negative for success, could return number of
              // characters output as well...
}

Upvotes: 2

dqthe

Reputation: 793

  • Here s is an array of char, so it includes the \0 characters too. When you use for(char c: s), the loop visits every char in the array. But in C, the definition tells us:

    A string is a contiguous sequence of characters terminated by and including the first null character.

    And

    [...] The length of a string is the number of bytes preceding the null character and the value of a string is the sequence of the values of the contained characters...

    So, when you use C standard functions to print the array s as a string, you will see the result that you wanted. Example: printf("%s", s);

  • "the number of characters in "0123\0abc" is 8. Notice the printout makes 9 lines!"

    Again, printf("%s; Len = %zu", s, strlen(s)); runs fine!

Upvotes: 0

songyuanyao

Reputation: 172864

s is of type char[9], i.e. an array containing 9 chars (including the null terminator char '\0'). The range-based for loop just iterates over all 9 elements; the null terminator char '\0' is not treated specially.

Executes a for loop over a range.

Used as a more readable equivalent to the traditional for loop operating over a range of values, such as all elements in a container.

for(char c: s) std::cerr<<"-->"<<c<<std::endl; produces code equivalent to the following prototype:

{
  auto && __range = s ;
  auto __begin = __range ;         // get the pointer to the beginning of the array 
  auto __end = __range + __bound ; // get the pointer to the end of the array ( __bound is the number of elements in the array, i.e. 9 )
  for ( ; __begin != __end; ++__begin) {
    char c = *__begin;
    std::cerr<<"-->"<<c<<std::endl;
  }
}

Upvotes: 3

Ted Lyngmo

Reputation: 117178

When you declare a char[] as char s[] = "0123\0abc" (a string literal), s becomes a char[9]. The \0 is included because it needs space too.

The range-based for loop you use treats the char[9] as nothing more than an array of char with extent 9 and will happily hand every element of the array to the body of your loop. The \0 is just one of the chars in this context.

Upvotes: 2
