Reputation: 885
I've understood that string arrays end with a '\0' symbol. So, the following code should print 0, 1, 2 and 3. (Notice I'm using a range-based for() loop).
$ cat app.cpp
#include <iostream>
int main(){
    char s[]="0123\0abc";
    for(char c: s) std::cerr<<"-->"<<c<<std::endl;
    return 0;
}
But it does print the whole array, including '\0's.
$ ./app
-->0
-->1
-->2
-->3
-->
-->a
-->b
-->c
-->
$ _
What is happening here? Why is the string not considered to end with '\0'? Do C++ collections consider (I imagine C++11) strings differently than in classical C++?
Moreover, the number of characters in "0123\0abc" is 8, yet the printout has 9 lines!
(I know that std::cout<< runs fine, as do strlen(), for(int i=0; s[i]; i++), etc. I know about the end terminator; that's not the question!)
Upvotes: 1
Views: 266
Reputation: 25518
Be aware that char does not necessarily hold a character; it can store any arbitrary 8-bit value (on some machines char is wider: I have already encountered one with a 16-bit char, and then there is no int8_t available). For plain numeric data, though, signed char or unsigned char, according to specific needs, should be preferred, as the signedness of char is implementation-defined (or, even better, int8_t or uint8_t from the cstdint header, provided they are available).
So your string literal actually is just an array of nine integral values (just as if you had created an int-array, only the type usually is narrower). A range based for loop will iterate over all of these nine 8-bit integers, and you get the output in your example.
These integral values only get a special meaning in specific contexts (functions), such as printf, puts or even operator>>, where they are then interpreted as characters. When used as C-strings, a 0 value inside such an array marks the end of the string, but this 0-character is still part of that string. For illustration, puts might look like this:
int puts(char const* str)
{
    while(*str) // loops until the null character is encountered
    {
        char c = *str;
        // + get pixel representation of c for console, e. g. 'B' for 66
        // + print this pixel representation to current console position
        // + advance by one position on console
        ++str;
    }
    return 0; // non-negative for success, could return number of
              // characters output as well...
}
Upvotes: 2
Reputation: 793
Here s is an array of char, so it includes the \0 too.
When you use for(char c: s), the loop visits every char in the array.
But in C, the definition tells us:
A string is a contiguous sequence of characters terminated by and including the first null character.
And
[...] The length of a string is the number of bytes preceding the null character and the value of a string is the sequence of the values of the contained characters...
So, when you use C standard functions to print the array s as a string, you will see the result that you wanted. Example: printf("%s", s);
"the number of characters in "0123\0abc" is 8. Notice the printout makes 9 lines!"
Again, printf("%s; Len = %zu", s, strlen(s)); runs fine! (strlen returns a size_t, so the matching format specifier is %zu rather than %d.)
Upvotes: 0
Reputation: 172864
s is of type char[9], i.e. an array containing 9 chars (including the null terminator char '\0'). The range-based for loop just iterates over all 9 elements; the null terminator char '\0' is not treated specially.
Executes a for loop over a range.
Used as a more readable equivalent to the traditional for loop operating over a range of values, such as all elements in a container.
for(char c: s) std::cerr<<"-->"<<c<<std::endl;
produces code prototype equivalent to
{
    auto && __range = s ;
    auto __begin = __range ; // get the pointer to the beginning of the array
    auto __end = __range + __bound ; // get the pointer to the end of the array ( __bound is the number of elements in the array, i.e. 9 )
    for ( ; __begin != __end; ++__begin) {
        char c = *__begin;
        std::cerr<<"-->"<<c<<std::endl;
    }
}
Upvotes: 3
Reputation: 117178
When you declare a char[] as char s[] = "0123\0abc" (initialized from a string literal), s becomes a char[9]. The \0 is included because it needs space too.
The range-based for loop you use does not consider the char[9] as anything other than an array of char with extent 9, and will happily provide every element in the array to the inner workings of your loop. The \0 is just one of the chars in this context.
Upvotes: 2