Reputation: 10539
It seems that std::string - because it doesn't use expression templates - has a O(n^2) complexity instead of a possibly O(n) complexity for some operations like concatenation. Same thing with a std::stringstream class when you have to insert many elements.
I would like to understand this, at least if someone could have some good links about this point that would be great.
Upvotes: 3
Views: 468
Reputation: 58800
Concatenating multiple strings together has different complexities in C++ depending how it is done. I believe the situation that you're thinking of is:
string result = string("Hello, ") + username + "! " +
"Welcome to " + software_product + ".";
which concatenates 6 strings. The first string is copied 5 times, the second is copied 4 times, and so forth. As Leonid Volnitsky notes in his answer, the exact bound for this Θ(NM), where M is the number of concatenation operations, and N is the total length of the strings being concatenated. We can also call this O(N^2) when M <= N. Note that it's not guaranteed that M <= N, because you can increase M without increasing N by trying to concatenate the empty string.
Expression templates could help speed up this use case, though it would cause problems with auto
and decltype
type inference in C++11, as well as with template type inference in C++98. All of these would deduce the type of
auto result = string("Hello, ") + username + "! " +
"Welcome to " + software_product + ".";
to be the lazily-evaluated string template type that was used to make the expression template magic happen. Other reasons why expression templates are not a great idea include Leonid Volnitsky's answer about this slowing compilation time. It would probably also increase the size of your compiled binary.
Instead, there are other solutions in C++ could be used to get Θ(N) concatenation:
string result = "Hello, ";
result += username;
result += "! ";
result += "Welcome to ";
result += software_product;
result += ".";
In this version, the string is modified in place, and while data that has already been copied into result
sometimes needs to be recopied, C++ strings are typically implemented as dynamic arrays that allocate new space exponentially, so that the insertion of each new character takes amortized constant time, leading to overall Θ(N) behavior for repeated concatenation.
The following is a way of doing the same thing, almost on one line. It uses the same principle internally, but also supports converting non-string types to strings using << overloading.
stringstream result;
result << "Hello, " << username << "! " << "Welcome to " << software_product
<< ".";
// do something with result.str()
Lastly, the C++ standard library doesn't include this, but one could define the following function with some stringstream magic inside it. The implementation is left as an exercise for the reader.
template <typename... Items>
std::string concat(std::string const& a, std::string const& b, Items&&... args)
You can then call concat
for repeated concatenation on one line in O(N) time:
string result = concat("Hello, ", username, "! ", "Welcome to ",
software_product, ".");
Presumably, all of these are better solutions than messing up type inference by creating an expression template type.
Upvotes: 3
Reputation: 9144
If we have total length of strings expression N
, and M
concatenation operations then complexity should be O(NM)
.
It is definitely possible to speed up it up with expression templates. It was not done probably because of their complexity. Compilation speed would be slower too - it will need M-recursive types.
Upvotes: 2
Reputation: 126897
How can it be O(N^2)? String concatenation is:
I don't see how templates can have anything to do with this.
Upvotes: 1
Reputation: 14778
I find hard to believe that a concatenation would go up to O(n^2)
, but here goes some answer related to your question:
The C++ standard doesn't specify implementation details, and only specifies complexity requirements in some cases. The only complexity requirements on std::string operations are that size(), max_size(), operator[], swap(), c_str() and data() are all constant time. The complexity of anything else depends on the choices made by whoever implemented the library you're using.
See reference: C++ string::find complexity
Upvotes: 1