Daniel K.
Daniel K.

Reputation: 987

Allocation in making copy of std::string

Which implementation do you think is better?

std::string ToUpper( const std::string& source )
{
    std::string result;
    result.resize( source.length() );
    std::transform( source.begin(), source.end(), result.begin(), 
        std::ptr_fun<int, int>( std::toupper ) );
    return result;
}

and...

std::string ToUpper( const std::string& source )
{
    std::string result( source.length(), '\0' );
    std::transform( source.begin(), source.end(), result.begin(), 
        std::ptr_fun<int, int>( std::toupper ) );
    return result;
}

Difference is that the first one uses reserve method after the default constructor, but the second one uses the constructor accepting the number of characters.

EDIT 1. I cannot use boost lib. 2. I just wanted to compare between the allocation-during-constructor and the allocation after-constructor.

Upvotes: 0

Views: 841

Answers (8)

Maxim Egorushkin
Maxim Egorushkin

Reputation: 136208

A straight-forward loop would be the simplest:

std::string toUpper(std::string s)
{
    for(size_t i = 0, j = s.size(); i != j; ++i)
        s[i] = toupper(s[i]);
    return s;
}

No need to use standard algorithms and functors here just for the sake of using them.

[Update]

You mentioned you can't use boost. I benchmarked just for fun the above version which writes to the result string twice against the following one that only writes once:

struct ToUpper
{
    typedef char result_type;
    char operator()(char c) const { return toupper(c); }
};

std::string toUpper2(std::string const& s)
{
    boost::transform_iterator<ToUpper, std::string::const_iterator> beg(s.begin());
    return std::string(beg, beg + s.size());
}

The results are:

500000 calls
toUpper : 414nsec/call
toUpper : 414nsec/call
toUpper : 414nsec/call
toUpper2: 301nsec/call
toUpper2: 302nsec/call
toUpper2: 302nsec/call

Upvotes: 1

Blastfurnace
Blastfurnace

Reputation: 18652

Since you're returning a transformed copy, why not pass-by-value and transform the already constructed temporary you receive?

std::string ToUpper( std::string source )
{
    std::transform( source.begin(), source.end(), source.begin(), ::toupper );
    return source;
}

Upvotes: 1

Nikko
Nikko

Reputation: 4252

what about a simple :

std::string result;
result.reserve( source.length() );

std::transform( source.begin(), source.end(), 
                std::back_inserter( result ),
                ::toupper );

?

Then personally I'd use boost instead.

Upvotes: 5

davka
davka

Reputation: 14672

The first one has a bug of copying onto an empty string (as already mentioned). Another way to correct this is to use back_inserter(result).

Given that, the first version seems marginally faster as it does not copy the initializing value to the result; however, there is a question of efficiency of regular insertion (as in 2) vs. insertion via back_inserter, i.e. push_back (as in 1).

In general, I'd suggest to prefer what you feel is a better style, and to be bothered with such such fine-grain optimization at this point, until you have a real evidence that you have a performance problem and this could be one of the bottlenecks. Pre-allocating the result is good enough. Beware Premature Optimization

Upvotes: 0

Matthieu M.
Matthieu M.

Reputation: 299730

#include <boost/algorithm/string.hpp>

std::string copy = boost::to_upper_copy(original);

Upvotes: 3

templatetypedef
templatetypedef

Reputation: 372674

The first of these two implementations is incorrect. The string::reserve function does not change the size of the string; rather it just allocates internal buffer space for it. This means that if you start writing into the string, you're overwriting characters that are physically part of the string but not logically part of the string. That is, the characters are there, but the string's length won't report them as being a part of the string.

However, I see why it is that you're trying to use reserve this way: If you use resize to make the string larger, then all the characters will have to have some value assigned to them, and this can contribute some inefficiency to the program. Therefore, you're looking for some way to avoid this cost. If you want to do this, consider using the following code:

string ToUpper(const std::string& source) {
    std::string result;
    result.reserve(source.length());

    for (string::const_iterator itr = source.begin(); itr != source.end(); ++itr)
        result += std::toupper(*itr);

    return result;
}

This first calls reserve on the result string to preallocate storage space for it. We then just walk across the elements of the first string, converting each to upper case and appending it to the resulting string using operator +=. Since space already exists in the string because of the call to reserve, there shouldn't be any new allocations involved, and the buffer will just be written to with the characters that you want.

Hope this helps!

Upvotes: 3

CB Bailey
CB Bailey

Reputation: 791421

The second one is better because it is less likely to write through an undereferencable iterator. If you've only reserved the storage, you need to append characters. The string is still empty. Consider std::back_insert_iterator.

Upvotes: 1

Tony Delroy
Tony Delroy

Reputation: 106068

The first one is better... the second is misleading in making it appear significant that the characters are set to NUL.

Upvotes: -2

Related Questions