Devolus
Devolus

Reputation: 22074

Performance optimization for std::string

When I did some performance test in my app I noticed a difference in the following code (Visual Studio 2010).

Slower version

while(heavyloop)
{
   if(path+node+"/" == curNode)
   {
        do something
   }
}

This will cause some extra mallocs for the resulting string to be generated.

In order to avoid these mallocs, I changed it in the following way:

std::string buffer;
buffer.reserve(500);   // Big enough to hold all combinations without the need of malloc

while(heavyloop)
{
   buffer = path;
   buffer += node;
   buffer += "/";

   if(buffer == curNode)
   {
        do something
   }
}

While the second version looks a bit more awkward compared to the first version it's still readable enough. What I was wondering though is, wether this kind of optimization is an oversight on part of the compiler, or if this always has to be done manually. Since it only changes the order of allocations I would expect that the compiler could also figure it out on it's own. On the other hand, certain conditions have to be met, to really make it an optimization, which may not neccessarily be fullfilled, but if the conditions are not, the code would at least perform as good as the first version. Are newer versions of Visual Studio better in this regard?

A more complete version which shows the difference (SSCE):

std::string gen_random(std::string &oString, const int len)
{
    static const char alphanum[] =
        "0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";

    oString = "";

    for (int i = 0; i < len; ++i)
    {
        oString += alphanum[rand() % (sizeof(alphanum) - 1)];
    }

    return oString;
}

int main(int argc, char *argv[])
{
    clock_t start = clock();
    std::string s = "/";
    size_t adds = 0;
    size_t subs = 0;
    size_t max_len = 0;

    s.reserve(100000);

    for(size_t i = 0; i < 1000000; i++)
    {
        std::string t1;
        std::string t2;
        if(rand() % 2)
        {
            // Slow version
            //s += gen_random(t1, (rand() % 15)+3) + "/" + gen_random(t2, (rand() % 15)+3);

            // Fast version
            s += gen_random(t1, (rand() % 15)+3);
            s += "/";
            s += gen_random(t2, (rand() % 15)+3);
            adds++;
        }
        else
        {
            subs++;
            size_t pos = s.find_last_of("/", s.length()-1);
            if(pos != std::string::npos)
                s.resize(pos);

            if(s.length() == 0)
                s = "/";
        }

        if(max_len < s.length())
            max_len = s.length();
    }
    std::cout << "Elapsed: " << clock() - start << std::endl;
    std::cout << "Added: " << adds << std::endl;
    std::cout << "Subtracted: " << subs << std::endl;
    std::cout << "Max: " << max_len << std::endl;

    return 0;
}

On my system I get about 1 second difference between the two (tested with gcc this time but there doesn't seem to be any notable difference to Visual Studio there):

Elapsed: 2669
Added: 500339
Subtracted: 499661
Max: 47197

Elapsed: 3417
Added: 500339
Subtracted: 499661
Max: 47367

Upvotes: 3

Views: 364

Answers (3)

Peter
Peter

Reputation: 36597

For class types (like std::string) there is no requirement that the conventional relationship between operator + and operator += be honoured like you expect. There is certainly no requirement that a = a + b and a += b have the same net effect, since operator=(), operator+() and operator+=() can all potentially be implemented individually, and not work together in tandem.

As such, a compiler would be semantically incorrect if it replaced

if(path+node+"/" == curNode)

with

std::string buffer = path;
buffer += node;
buffer += "/";
if (buffer == curNode)

If there was some constraint in the standard, for example a fixed relationship between overloaded operator+() and overloaded operator+=() then the two fragments of code would have the same net effect. However, there is no such constraint, so the compiler is not permitted to do such substitutions. The result would be changing meaning of the code.

Upvotes: 2

Stas
Stas

Reputation: 11761

Your slow version may be rewritten as

while(heavyloop)
{
   std::string tempA = path + node;
   std::string tempB = tempA + "/";

   if(tempB == curNode)
   {
        do something
   }
}

Yes, it is not a full analog, but makes temporary objects more visible.

See two temporary objects: tempA and tempB. They are created because std::string::operator+ always generates new std::string object. This is how std::string is designed. A compiler won't be able to optimize this code.

There is a technique in C++ called expression templates to address this issue, but again, it it done on library level.

Upvotes: 2

Jay Zhou
Jay Zhou

Reputation: 116

path+node+"/" will allocate a temp variable string to compare with curNode,it's the c++ implement.

Upvotes: 1

Related Questions