Reputation: 1483
The Quick C++ Benchmarks example:
static void StringCopyFromLiteral(benchmark::State& state) {
  // Code inside this loop is measured repeatedly
  for (auto _ : state) {
    std::string from_literal("hello");
    // Make sure the variable is not optimized away by the compiler
    benchmark::DoNotOptimize(from_literal);
  }
}
// Register the function as a benchmark
BENCHMARK(StringCopyFromLiteral);
static void StringCopyFromString(benchmark::State& state) {
  // Code before the loop is not measured
  std::string x = "hello";
  for (auto _ : state) {
    std::string from_string(x);
  }
}
// Register the function as a benchmark
BENCHMARK(StringCopyFromString);
http://quick-bench.com/IcZllt_14hTeMaB_sBZ0CQ8x2Ro
What if I understand assembly...
More results:
http://quick-bench.com/39fLTvRdpR5zdapKSj2ZzE3asCI
Upvotes: 0
Views: 148
Reputation: 169143
The answer is simple. When you construct a std::string from a small string literal, the compiler can populate the contents of the string object directly, using constants in the assembly. This avoids expensive looping as well as the tests to see whether small string optimization (SSO) can be applied. Here the compiler knows SSO applies, so the generated code simply writes the string directly into the SSO buffer.
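To see SSO from the outside, here is a minimal sketch that checks whether a string's characters live inside the std::string object itself. The helper name uses_inline_buffer is made up for illustration, and the result depends on the standard library implementation; I'm assuming a current libstdc++, libc++ or MSVC build, where "hello" is short enough to be stored inline.

```cpp
#include <functional>
#include <string>

// Hypothetical helper (not part of the standard library): returns true when
// the string's character data is stored inside the std::string object,
// which is what small string optimization does for short strings.
bool uses_inline_buffer(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a well-defined ordering even for unrelated pointers
    std::less<const char*> before;
    return !before(data, obj_begin) && before(data, obj_end);
}
```

Calling uses_inline_buffer(std::string("hello")) should report true on those implementations, while a 100-character string has to live on the heap and reports false.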
Note this assembly code in the StringCreation case:
// Populate SSO buffer (each set of 4 characters is backwards since
// x86 is little-endian)
19.63% movb $0x6f,0x4(%r15) // "o"
19.35% movl $0x6c6c6568,(%r15) // "lleh"
// Set size
20.26% movq $0x5,0x10(%rsp) // size = 5
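// Most likely the terminating '\0': a one-byte store cannot set a pointer,
// and with libstdc++'s layout the SSO buffer would start at 0x18(%rsp),
// which makes 0x1d the byte right after "hello"
20.07% movb $0x0,0x1d(%rsp)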
You're looking at the constant values right there. That's not very much code, and no loop is required. In fact, the std::string constructor doesn't even have to be invoked! The compiler is just putting stuff in memory in the same places where the std::string constructor would.
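If the 0x6c6c6568 constant looks opaque, this small sketch reproduces it. The helper pack_le is hypothetical; it just packs four characters the way a little-endian 32-bit store lays them out in memory, with the first character in the lowest byte.

```cpp
#include <cstdint>

// Hypothetical helper: packs the first four characters of a 4-character
// literal into a 32-bit value, first character in the least significant byte
// (the order a little-endian movl writes them to memory).
constexpr std::uint32_t pack_le(const char (&s)[5]) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(s[0]))
         | static_cast<std::uint32_t>(static_cast<unsigned char>(s[1])) << 8
         | static_cast<std::uint32_t>(static_cast<unsigned char>(s[2])) << 16
         | static_cast<std::uint32_t>(static_cast<unsigned char>(s[3])) << 24;
}

// 'h' = 0x68, 'e' = 0x65, 'l' = 0x6c, so "hell" becomes the movl immediate
static_assert(pack_le("hell") == 0x6c6c6568u, "matches the constant in the listing");
```

The remaining 'o' (0x6f) is the separate movb to offset 4.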
If the compiler cannot apply this optimization, the results are quite different -- in particular, if we "hide" the fact that the source is a string literal by first copying the literal into a char array, the results flip:
char x[] = "hello";
for (auto _ : state) {
  std::string created_string(x);
  benchmark::DoNotOptimize(created_string);
}
Now the "from-char-pointer" case takes twice as long! Why?
I suspect that this is because the "copy from char pointer" case cannot simply read the length of the string from a stored value. It first has to find that length in order to know whether small string optimization can be performed. There are a few ways it could go about this, but each of them involves walking the characters to find the terminating null byte.
Contrast this to the case when it's copying from another string object: it can simply look at the size() of the other string and immediately know whether it can perform SSO, and if it can't perform SSO then it also knows exactly how much memory to allocate for the new string.
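The difference between the two paths can be sketched roughly like this. It is only an illustration of the reasoning above, not how any real standard library is written: CopyPlan, plan_from_c_string, plan_from_string and the 15-character kSsoCapacity (the inline capacity I'm assuming for libstdc++ on 64-bit) are all names and values I made up for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Assumption: 15 characters fit inline, as in libstdc++ on 64-bit targets.
constexpr std::size_t kSsoCapacity = 15;

// Hypothetical summary of what a constructor must learn before copying.
struct CopyPlan {
    std::size_t length;         // number of characters to copy
    bool fits_sso;              // true if no heap allocation is needed
    std::size_t bytes_scanned;  // bytes read just to discover the length
};

// From a char pointer: the length is unknown until the terminator is found.
CopyPlan plan_from_c_string(const char* s) {
    std::size_t len = std::strlen(s);  // linear scan up to the '\0'
    return {len, len <= kSsoCapacity, len + 1};
}

// From another std::string: the length is already stored in the object.
CopyPlan plan_from_string(const std::string& s) {
    return {s.size(), s.size() <= kSsoCapacity, 0};
}
```

For "hello" both plans agree on the length and on SSO, but only the char pointer path had to read through the characters first.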
Upvotes: 4