Chris Smith

Reputation: 764

0x00 and char arrays

Why do char arrays stop right before a 0x00 byte is detected and how can this problem be avoided (perhaps by using another datatype (which one and why) or a "trick" with char)?

For example in the following code, the output is "a" only, the other bytes are not displayed:

unsigned char cbuffer[]={0x61,0x00,0x62,0x63,0x0};
std::string sbuffer=reinterpret_cast<const char*>(cbuffer);

cout << sbuffer << endl;

Similarly in the following code, the output is "ab":

unsigned char cbuffer[]={0x61,0x62,0x00,0x63,0x0};
std::string sbuffer=reinterpret_cast<const char*>(cbuffer);

Straightforward and effective workarounds to the problem (where 0x00 is kept in the array as a normal byte) would be appreciated.

Upvotes: 0

Views: 19994

Answers (6)

meolic

Reputation: 1207

What do you want to be substituted (and printed) for 0x00 in the resulting string?

The constructor is responsible for converting the char[] into a string. As others pointed out, you must use a different constructor. The code below works for me, but it is not very robust. The first parameter must be a pointer to the array (you are free to use a safer cast) and the second parameter is the length of the array (you are free to calculate this in a more sophisticated way).

#include <iostream>
#include <string>
int main() {
  unsigned char cbuffer[]={0x61,0x00,0x62,0x63,0x00};
  // Pointer plus explicit length: all 5 bytes are copied, 0x00 included
  std::string sbuffer((char *)cbuffer,5);
  std::cout << sbuffer << std::endl;
}

Upvotes: 0

Mooing Duck

Reputation: 66952

It's common in C to pass strings around as pointers to null-terminated char arrays, where the terminator is a byte with the value 0x00. To make conversion easy, std::string is constructible from a pointer to a null-terminated char array, which is what happens in your code: when the constructor finds the null, it treats that as the end of the string. If you send a char array to cout directly, you'll find it makes the same assumption, because neither has any other way to determine the end of a string pointed to by a char*. (They could theoretically tell the length in your case if they understood char (&)[], but sadly almost nothing in the standard library does.)

The intended workarounds are to use this constructor instead:

int len = sizeof(cbuffer)/sizeof(cbuffer[0]);
std::string sbuffer(reinterpret_cast<const char*>(cbuffer), len); //5 characters in cbuffer, 1 byte each

or

int len = sizeof(cbuffer)/sizeof(cbuffer[0]);
std::cout.write(reinterpret_cast<const char*>(cbuffer), len); //5 characters in cbuffer, 1 byte each

However, you have to be careful with sizeof(cbuffer). If cbuffer is a char* (pointer) instead of a char(&)[] (array), then sizeof(cbuffer) returns the size of the pointer rather than the length of the data, and if the string is not null-terminated there is no way to recover the correct length at that point.
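One way to guard against the pointer pitfall is to let the compiler deduce the array length from a reference-to-array parameter. The following is only a sketch: bytes_to_string and demo_array_string are hypothetical names, not part of the standard library or of the answers here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration): because the parameter
// is a reference to an array, N is deduced at compile time, and passing a
// plain pointer fails to compile instead of silently giving a wrong length.
template <std::size_t N>
std::string bytes_to_string(const unsigned char (&buf)[N]) {
    return std::string(reinterpret_cast<const char*>(buf), N);
}

// Builds the question's five-byte buffer and converts it with the helper.
std::string demo_array_string() {
    const unsigned char cbuffer[] = {0x61, 0x00, 0x62, 0x63, 0x00};
    return bytes_to_string(cbuffer);
}
```

Calling bytes_to_string with an unsigned char* instead of an array is rejected by the compiler, which turns the sizeof mistake into a build error.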

Upvotes: 6

shibumi

Reputation: 378

The 0x00 byte is used as a sentinel to mark the end of a string in C. The entire array, however, remains in memory. You can use an alternate constructor for std::string if you want the string to contain the entire character array. Streaming that std::string with operator<< then writes every byte, embedded 0x00 included, although a terminal typically displays nothing for the null; only routes that go through a const char*, such as printing c_str(), still stop after "ab". The decision to represent C strings in this manner is one of those arbitrary decisions that we are stuck with.
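A small sketch for checking what a length-constructed string holds and how many bytes each printing route writes. The function names are hypothetical, and std::ostringstream stands in for std::cout so that the byte counts can be inspected.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// The question's second buffer: 'a', 'b', 0x00, 'c', 0x00.
std::string make_full_string() {
    const unsigned char cbuffer[] = {0x61, 0x62, 0x00, 0x63, 0x00};
    return std::string(reinterpret_cast<const char*>(cbuffer), sizeof(cbuffer));
}

// Number of bytes operator<< writes when given the std::string itself.
std::size_t bytes_written_as_string(const std::string& s) {
    std::ostringstream out;
    out << s;
    return out.str().size();
}

// Number of bytes operator<< writes when given only a const char*.
std::size_t bytes_written_as_c_string(const std::string& s) {
    std::ostringstream out;
    out << s.c_str();
    return out.str().size();
}
```

For this buffer the first route writes all five bytes, while the c_str() route stops at the first 0x00 and writes two.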

Upvotes: 1

Loki Astari

Reputation: 264561

Try:

#include <iostream>
#include <string>

int main()
{

    unsigned char cbuffer[]={0x61,0x62,0x00,0x63,0x0};

    // Here s1 is treating cbuffer as a C-String
    // Thus it will only read up to the first '\0' character
    std::string s1(reinterpret_cast<const char*>(cbuffer));
    std::cout << s1 << "\n";

    // Here s2 is treating the cBuffer as an array.
    // It reads the specified length into the string.
    std::string s2(reinterpret_cast<const char*>(cbuffer), sizeof(cbuffer)/sizeof(cbuffer[0]));

    // Note: the '\0' byte is written to the stream, but the terminal may display nothing for it.
    std::cout << s2 << "\n";

}

Upvotes: 1

Martin Beckett

Reputation: 96139

char arrays don't do anything

The C string functions use 0 to mark the end of a string.
operator<< on std::cout is overloaded for char pointers and prints them as C strings. If you want to print individual values you need to loop over them; you might also want to output them with std::hex.

In this case you are creating a std::string from a C char array, so the std::string constructor assumes that the C string ends at '\0'. Since it's only passed an address in memory, how else can it know where the string ends?

P.S. If you want to store an array of bytes, you should probably be using std::vector.
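As a rough illustration of both suggestions, the sketch below stores the bytes in a std::vector<unsigned char> and loops over them, formatting each one with std::hex. The helper names to_hex and demo_hex are assumptions made for this example.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats every byte as two hex digits separated by spaces.
std::string to_hex(const std::vector<unsigned char>& bytes) {
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out << ' ';
        // Cast to int so the stream prints a number rather than a character.
        out << std::setw(2) << static_cast<int>(bytes[i]);
    }
    return out.str();
}

// The question's first buffer, held in a vector that knows its own size.
std::string demo_hex() {
    const std::vector<unsigned char> bytes = {0x61, 0x00, 0x62, 0x63, 0x00};
    return to_hex(bytes);
}
```

Because the vector carries its length, no byte value has to be reserved as a terminator, and 0x00 shows up in the dump like any other value.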

Upvotes: 2

John

Reputation: 6668

0x00 is a non-printing character; 0x00 through 0x1F are all non-printing control characters, although some serve as line breaks. In C strings, 0x00 serves as the terminator.
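If the goal is to make such bytes visible when printing, one option is to substitute an escape sequence for anything std::isprint rejects. This is a sketch only: escape_nonprintable is a hypothetical helper, and the \xNN notation is just one possible choice of substitute.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: keeps printable characters and rewrites every other
// byte as a \xNN escape so that nothing is hidden when the text is displayed.
std::string escape_nonprintable(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        if (std::isprint(c)) {
            out += static_cast<char>(c);
        } else {
            char buf[5];  // backslash, 'x', two hex digits, terminator
            std::snprintf(buf, sizeof(buf), "\\x%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

The input must be a std::string built with an explicit length, otherwise the embedded 0x00 is already lost before the helper sees it.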

Upvotes: 0
