Reputation: 3763

Efficiency of C string vs C++ strings

The book C++ Primer says

For most applications, in addition to being safer, it is also more efficient to use library strings rather than C-style strings

Safety is understood. Why is the C++ strings library more efficient? After all, underneath it all, aren't strings still represented as character arrays?

To clarify, is the author talking about programmer efficiency (understood) or processing efficiency?

Upvotes: 37

Answers (8)

Intrepidd

Reputation: 20948

Here is a short point of view.

First of all, C++ strings are objects, so it is more consistent to use them in an object-oriented language.

Then, the standard library comes with a lot of useful functions for strings, iterators, etc. You won't have to write any of this yourself, so you save time and you can be sure the code is (almost) bug-free.

Finally, C strings are pointers that are kind of difficult to understand when you're new, and they bring complexity. Since references are preferred over pointers in C++, it makes more sense to use std::string instead of C strings.

Upvotes: 1

Coding Mash

Reputation: 3346

Strings are objects that contain a character array along with its size and other functionality. It is better to use strings from a string library because they save you from allocating and deallocating memory, watching for memory leaks, and other pointer hazards. But because strings are objects, they take extra space in memory.

C strings are simply character arrays. They are worth using when you are working in real time, or when you do not know exactly how much memory you have available. If you are using C strings, you have to take care of memory allocation, then copy data in via strcpy or character by character, then deallocate after use, and so on.

So better use strings from a string library if you want to avoid a bunch of headaches.

Strings increase programmer efficiency but may reduce processing efficiency (though not necessarily). The reverse holds for C strings.

Upvotes: 3

Leonid Volnitsky

Reputation: 9154

C-strings are usually faster, because they do not call malloc/new. But there are cases where std::string is faster. strlen() is O(N), but std::string::size() is O(1).

Also, when you search for a substring in a C string you need to check for '\0' on every iteration; in a std::string you don't. In a naive substring search algorithm this doesn't matter much, because instead of checking for '\0' you need to check for i<s.size(). But modern high-performance substring search algorithms traverse strings in multibyte steps, and the need for a '\0' check at every byte slows them down. This is the reason why GLIBC memmem is twice as fast as strstr. I did a lot of benchmarking of substring algorithms.

This is true not only for substring search algorithms. Many other string processing algorithms are slower for zero-terminated strings.
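The same split between "terminated" and "sized" interfaces exists for single-character search in the standard library: std::strchr has to stop at the terminator, while std::memchr is told the length up front. A minimal sketch of the two interfaces (the function names find_terminated and find_sized are made up for illustration, and nothing here measures speed):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Index of the first `c` in a zero-terminated string, or npos.
// strchr has no length to go by, so it must watch for '\0' as it scans.
std::size_t find_terminated(const char* s, char c) {
    const char* p = std::strchr(s, c);
    return p ? static_cast<std::size_t>(p - s) : std::string::npos;
}

// Index of the first `c` in a std::string, or npos.
// memchr is given the length, so no terminator check is involved and
// embedded '\0' bytes are treated as ordinary data.
std::size_t find_sized(const std::string& s, char c) {
    const void* p = std::memchr(s.data(), c, s.size());
    return p ? static_cast<std::size_t>(static_cast<const char*>(p) - s.data())
             : std::string::npos;
}
```

A side effect of the sized interface is that find_sized can see past an embedded '\0', which find_terminated cannot.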

Upvotes: 29

Sarfaraz Nawaz

Reputation: 361812

Why is C++ strings library more efficient? After all, underneath it all, aren't strings still represented as character arrays?

Because code which uses char* or char[] is more likely to be inefficient if not written carefully. For example, have you seen a loop like this:

char *get_data();

char const *s = get_data(); 

for(size_t i = 0 ; i < strlen(s) ; ++i) //Is it an efficient loop? No.
{
   //do something
}

Is that efficient? No. The time-complexity of strlen() is O(N), and furthermore, it is computed in each iteration, in the above code.

Now you may say "I can make it efficient if I call strlen() just once." Of course, you can. But you have to do all that sort of optimization yourself, and consciously. If you miss something, you lose CPU cycles. With std::string, many such optimizations are done by the class itself. So you can write this:

std::string get_data();

std::string const & s = get_data(); //avoid copy if you don't need it

for(size_t i = 0 ; i < s.size() ; ++i) //Is it an efficient loop? Yes.
{
   //do something
}

Is that efficient? Yes. The time-complexity of size() is O(1). No need to optimize it manually which often makes code look ugly and hard to read. The resulting code with std::string is almost always neat and clean in comparison to char*.
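For comparison, here is roughly what the manual "call strlen() just once" version looks like next to the std::string version. This is only a sketch: the function names and the space-counting loop body are invented for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// C-style version with the length hoisted out of the loop by hand.
std::size_t count_spaces_cstr(const char* s) {
    const std::size_t len = std::strlen(s);  // one O(N) scan, done once
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (s[i] == ' ') ++count;
    }
    return count;
}

// std::string version: size() is O(1), so nothing needs hoisting.
std::size_t count_spaces_string(const std::string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == ' ') ++count;
    }
    return count;
}
```

Both do the same work per character; the difference is that the first one is only efficient because someone remembered to cache the length.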

Also note that std::string not only makes your code efficient in terms of CPU cycles, but it also increases programmer efficiency!

Upvotes: 24

supercat

Reputation: 81347

The difficulty with C-style strings is that one really can't do much with them unless one knows about the data structures in which they are contained. For example, when using "strcpy", one must know that the destination buffer is writable, and has enough space to accommodate everything up to the first zero byte in the source (of course, in all too many cases, one doesn't really know that for certain...). Very few library routines provide any support for allocating space on demand, and I think all those that do work by allocating it unconditionally (so if one had a buffer with space for 1000 bytes, and one wants to copy a 900-byte string, code using those methods would have to abandon the 1000-byte buffer and then create a new 900-byte buffer, even though it might be better to simply reuse the 1000-byte buffer).

Working with an object-oriented string type would in many cases not be as efficient as working with standard C strings and figuring out the optimal ways to allocate and reuse things. On the other hand, code which is written to optimally allocate and reuse strings may be very brittle, and slight changes to requirements could require making lots of tricky little tweaks to the code; failing to tweak the code perfectly would likely result in bugs which may be obvious and severe, or subtle but even more severe. The most practical way to avoid brittleness in code which uses standard C strings is to design it very conservatively. Document maximum input-data sizes, truncate anything which is too big, and use big buffers for everything. Workable, but not terribly efficient.

By contrast, if one uses the object-oriented string types, the allocation patterns they use will likely not be optimal, but will likely be better than the 'allocate everything big' approach. They thus combine much of the run-time efficiency of the hand-optimized-code approach with safety that's better than the 'allocate everything big' approach.
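As a small illustration of the buffer-reuse case from the first paragraph (1000 bytes of space, 900-byte string), a std::string that already has enough capacity will, on the mainstream implementations I am aware of, keep its existing buffer on assignment. That is typical behaviour rather than something I would call a formal guarantee, and the function name below is hypothetical:

```cpp
#include <cstddef>
#include <string>

// Returns true if assigning `n` characters into a string that already has
// at least `reserved` bytes of capacity kept the same underlying buffer.
// Assumes n <= reserved; buffer reuse is typical, not promised here.
bool assign_reuses_buffer(std::size_t reserved, std::size_t n) {
    std::string buf;
    buf.reserve(reserved);            // capacity() is now >= reserved
    const char* before = buf.data();  // remember where the buffer lives
    buf.assign(n, 'x');               // fill with n characters
    return buf.data() == before && buf.capacity() >= reserved;
}
```

Calling assign_reuses_buffer(1000, 900) is the scenario described above: the 1000-byte buffer is reused instead of being abandoned for a fresh 900-byte one.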

Upvotes: 1

Jonathan Wakely

Reputation: 171501

A std::string knows its length, which makes many operations quicker.

For example, given:

const char* c1 = "Hello, world!";
const char* c2 = "Hello, world plus dog!";
std::string s1 = c1;
std::string s2 = c2;

strlen(c1) is slower than s1.length(). For comparisons, strcmp(c1, c2) has to compare several characters to determine the strings are not equal, but s1 == s2 can tell the lengths are not the same and return false immediately.
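A sketch of the length-first comparison idea, written out by hand. Whether a particular standard library's operator== takes exactly this shortcut is an implementation detail; equal_sized is a made-up name used only to show the principle.

```cpp
#include <cstring>
#include <string>

// Equality that rejects on length before looking at any characters.
bool equal_sized(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;  // O(1) rejection
    return std::memcmp(a.data(), b.data(), a.size()) == 0;
}
```

With the strings above, equal_sized(s1, s2) returns false after comparing two integers, whereas strcmp(c1, c2) has to walk the common prefix "Hello, world" first.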

Other operations also benefit from knowing the length in advance, e.g. strcat(buf, c1) has to find the null terminator in buf to find where to append data, but s1 += s2 knows the length of s1 already and can append the new characters at the right place immediately.

When it comes to memory management, std::string typically allocates more space than it immediately needs whenever it grows, which means many future append operations don't need to reallocate.
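To observe this growth behaviour on a given implementation, one can count how often capacity() changes while appending one character at a time. The exact growth policy is implementation-specific, so the count will differ between standard libraries; count_reallocations is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <string>

// Appends `appends` characters one at a time and counts how many times
// the string's capacity changed, i.e. how many times it had to grow.
std::size_t count_reallocations(std::size_t appends) {
    std::string s;
    std::size_t reallocations = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < appends; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {  // capacity changed: it grew
            ++reallocations;
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

On common implementations the result for, say, 10000 appends is a few dozen at most, far fewer than the number of appends.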

Upvotes: 9

Christian Rau

Reputation: 45968

Well, one obvious and simple way they can be more efficient in practice (regarding runtime) is that they store the string's length along with the data (or at least their size method has to be O(1), which amounts to practically the same thing).

So whenever you would need to find the NUL character in a C string (and thus walk the whole string once) you can just get the size in constant time. And this happens quite a lot, e.g. when copying or concatenating strings and thus allocating a new one beforehand, whose size you need to know.

I don't know if this is what the author meant, or if it makes a huge difference in practice, but it is still a valid point.

Upvotes: 3

John Calsbeek

Reputation: 36547

There are some cases in which a std::string might beat a char[]. For example, C-style strings typically don't have an explicit length passed around—instead, the NUL terminator implicitly defines the length.

This means that a loop which continually strcats onto a char[] is actually performing O(n²) work, because each strcat has to process the entire string in order to determine the insertion point. In contrast, the only work that a std::string needs to perform to concatenate onto the end of a string is to copy the new characters (and possibly reallocate storage—but for the comparison to be fair, you have to know the maximum size beforehand and reserve() it).
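A sketch of the two loops being compared. The function names are invented for illustration, and the strcat version assumes the caller has made the buffer large enough, which is exactly the kind of thing one has to get right by hand:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds `count` copies of `piece` in buf using strcat. Every call rescans
// buf from the start to find the terminator, so total work is O(n^2).
// Assumes buf has room for count * strlen(piece) + 1 bytes.
void repeat_strcat(char* buf, const char* piece, std::size_t count) {
    buf[0] = '\0';
    for (std::size_t i = 0; i < count; ++i) {
        std::strcat(buf, piece);
    }
}

// Same result with std::string. The end position is already known, and
// reserving the final size up front keeps the comparison fair.
std::string repeat_append(const std::string& piece, std::size_t count) {
    std::string out;
    out.reserve(piece.size() * count);
    for (std::size_t i = 0; i < count; ++i) {
        out += piece;
    }
    return out;
}
```

The C version can be made linear too by keeping a pointer to the current end, but then you are maintaining the length yourself, which is what std::string already does.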

Upvotes: 7
