Xlea

Reputation: 526

Difference between strlen(str.c_str()) and str.length() for std::string

I have always implicitly assumed that every implementation of std::string must satisfy strlen(str.c_str()) == str.length() for every string str.

What does the C++ standard say about this? (Does it?)

Background: At least the implementations shipped with Visual C++ and gcc do not have this property. Consider this example:

// Output:
// string says its length is: 13
// strlen says: 5
#include <iostream>
#include <cstring>
#include <string>

int main() {
  std::string str = "Hello, world!";
  str[5] = 0;
  std::cout << "string says its length is: " << str.length() << std::endl;
  std::cout << "strlen says: " << std::strlen(str.c_str()) << std::endl;
  return 0;
}

Of course, writing a zero into the string without str noticing is what causes "the problem". But that's not my question. I want to know what the standard has to say about this behavior.

Upvotes: 11

Views: 20247

Answers (3)

Vlad from Moscow

Reputation: 311126

The standard C function std::strlen calculates the length of a character array based on the position of the first terminating zero in the array. On the other hand, objects of class std::string may contain embedded zeroes. Thus strlen applied to c_str() can yield a result that differs from the value returned by the member function length.

Consider a simple example

std::string s( 10, '\0' );

std::cout << s.length() << std::endl;
std::cout << std::strlen( s.c_str() ) << std::endl;

In this case the first output statement will output 10 while the second output statement will output 0.

Moreover, if you have a string such as

std::string s( "Hello" );

and then call member function resize

s.resize( 10 );

then the function appends five value-initialized characters char(), that is, zeroes, to the original string. The member function s.length() then returns 10, while std::strlen( s.c_str() ) still returns 5.

Upvotes: 4

Lightness Races in Orbit

Reputation: 385385

Your understanding is incorrect. Sort of.

std::string may contain chars with the value '\0'; when you extract a C-string, you have no way of knowing how long it is other than to scan for \0s, which by necessity cannot account for "binary data".

This is a limitation of strlen, and one that std::string "fixes" by actually remembering this metadata as a count of chars that it knows are encapsulated.

The standard doesn't really need to "say" anything about it, except that std::string::length gives you the string length, no matter what the values of the chars you inserted into the string are, and that it is not prohibited to insert a '\0'. By contrast, strlen is defined to tell you how many chars exist up to the next \0, which is a fundamentally different definition.

There is no explicit wording about this, because there does not need to be. If there were an exception to the very simple rules ("there is a string; it has chars; it can tell you how many chars it has") then that would be stated explicitly… and it's not.

Upvotes: 16

NathanOliver

Reputation: 181047

The standard has this to say about std::string's length():

Returns: size().

And size() is defined as

Returns: A count of the number of char-like objects currently in the string.

So, as you can see, you will get the number of char-like objects in the string even if the value of some of those objects is '\0'.

Upvotes: 1
