Reputation: 22074
When I did some performance test in my app I noticed a difference in the following code (Visual Studio 2010).
Slower version
while(heavyloop)
{
if(path+node+"/" == curNode)
{
do something
}
}
This will cause some extra mallocs for the resulting string to be generated.
In order to avoid these mallocs, I changed it in the following way:
std::string buffer;
buffer.reserve(500); // Big enough to hold all combinations without the need of malloc
while(heavyloop)
{
buffer = path;
buffer += node;
buffer += "/";
if(buffer == curNode)
{
do something
}
}
While the second version looks a bit more awkward compared to the first version it's still readable enough. What I was wondering though is, wether this kind of optimization is an oversight on part of the compiler, or if this always has to be done manually. Since it only changes the order of allocations I would expect that the compiler could also figure it out on it's own. On the other hand, certain conditions have to be met, to really make it an optimization, which may not neccessarily be fullfilled, but if the conditions are not, the code would at least perform as good as the first version. Are newer versions of Visual Studio better in this regard?
A more complete version which shows the difference (SSCE):
std::string gen_random(std::string &oString, const int len)
{
static const char alphanum[] =
"0123456789"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz";
oString = "";
for (int i = 0; i < len; ++i)
{
oString += alphanum[rand() % (sizeof(alphanum) - 1)];
}
return oString;
}
int main(int argc, char *argv[])
{
clock_t start = clock();
std::string s = "/";
size_t adds = 0;
size_t subs = 0;
size_t max_len = 0;
s.reserve(100000);
for(size_t i = 0; i < 1000000; i++)
{
std::string t1;
std::string t2;
if(rand() % 2)
{
// Slow version
//s += gen_random(t1, (rand() % 15)+3) + "/" + gen_random(t2, (rand() % 15)+3);
// Fast version
s += gen_random(t1, (rand() % 15)+3);
s += "/";
s += gen_random(t2, (rand() % 15)+3);
adds++;
}
else
{
subs++;
size_t pos = s.find_last_of("/", s.length()-1);
if(pos != std::string::npos)
s.resize(pos);
if(s.length() == 0)
s = "/";
}
if(max_len < s.length())
max_len = s.length();
}
std::cout << "Elapsed: " << clock() - start << std::endl;
std::cout << "Added: " << adds << std::endl;
std::cout << "Subtracted: " << subs << std::endl;
std::cout << "Max: " << max_len << std::endl;
return 0;
}
On my system I get about 1 second difference between the two (tested with gcc this time but there doesn't seem to be any notable difference to Visual Studio there):
Elapsed: 2669
Added: 500339
Subtracted: 499661
Max: 47197
Elapsed: 3417
Added: 500339
Subtracted: 499661
Max: 47367
Upvotes: 3
Views: 364
Reputation: 36597
For class types (like std::string
) there is no requirement that the conventional relationship between operator +
and operator +=
be honoured like you expect. There is certainly no requirement that a = a + b
and a += b
have the same net effect, since operator=()
, operator+()
and operator+=()
can all potentially be implemented individually, and not work together in tandem.
As such, a compiler would be semantically incorrect if it replaced
if(path+node+"/" == curNode)
with
std::string buffer = path;
buffer += node;
buffer += "/";
if (buffer == curNode)
If there was some constraint in the standard, for example a fixed relationship between overloaded operator+()
and overloaded operator+=()
then the two fragments of code would have the same net effect. However, there is no such constraint, so the compiler is not permitted to do such substitutions. The result would be changing meaning of the code.
Upvotes: 2
Reputation: 11761
Your slow version may be rewritten as
while(heavyloop)
{
std::string tempA = path + node;
std::string tempB = tempA + "/";
if(tempB == curNode)
{
do something
}
}
Yes, it is not a full analog, but makes temporary objects more visible.
See two temporary objects: tempA
and tempB
. They are created because std::string::operator+ always generates new std::string
object. This is how std::string
is designed. A compiler won't be able to optimize this code.
There is a technique in C++ called expression templates to address this issue, but again, it it done on library level.
Upvotes: 2
Reputation: 116
path+node+"/" will allocate a temp variable string to compare with curNode,it's the c++ implement.
Upvotes: 1