ManuelSchneid3r
ManuelSchneid3r

Reputation: 16091

Improve std::sort performance

I have an index with about 10k items, which have to be sorted caseinsensitive lexicographically.

This is my approach:

bool lowercomp (AbstractServiceProvider::AbstractItem*  i, AbstractServiceProvider::AbstractItem* j)

{
    std::string a,b;

    // lower first string
    a.resize(i->title().length());
    std::transform(i->title().cbegin(), i->title().cend(), a.begin(),
                std::bind2nd(std::ptr_fun(&std::tolower<char>), std::locale("")));

    // lower 2nd string
    b.resize(j->title().length());
    std::transform(j->title().cbegin(), j->title().cend(), b.begin(),
                std::bind2nd(std::ptr_fun(&std::tolower<char>), std::locale("")));

    return 0 > a.compare(b);
}

Somwhere in my code:

t = new boost::timer::auto_cpu_timer;
std::sort(_index.begin(), _index.end(), lowercomp);
delete t;

But this takes about 4 seconds. Without the toLower part, it takes about 0.003 seconds. Is there a way to improve this?

Upvotes: 0

Views: 2713

Answers (3)

John Zwinck
John Zwinck

Reputation: 249123

You can definitely make it much faster. The solution is to avoid allocating memory, and instead compare the strings case-insensitively by converting one character at a time using tolower() while doing the comparison. There should be no construction of class objects in the comparison function. Something like this:

bool lowercomp(const AbstractItem* lhs, const AbstractItem* rhs)  
{
    size_t size = std::min(lhs->title().size(), rhs->title().size());
    for (size_t pos = 0; pos < size; ++pos) {
        if (tolower(lhs->title()[pos]) < tolower(rhs->title()[pos]) {
            return true;
        } else if (tolower(lhs->title()[pos]) > tolower(rhs->title()[pos]) {
            return false;
        }
    }
    return lhs->title().size() < rhs->title().size();
}

Let us know how fast that is. :)

Upvotes: 4

James Kanze
James Kanze

Reputation: 153909

Until you've seen the profiler output, to know where the slowdown is, you can't be sure, but there are a number of points which seem likely to cause a slowdown to me. The two most important are:

  • your function creates two new strings at each call. That can be very expensive, and

  • you use the two operand form of std::tolower; this function must extract the ctype facet each time it is called (and you construct a new temporary instance of the locale each time you invoke lowercomp.

My own preference is to use a functional object for the comparison. With some compilers, it's faster, but in this case, it's also a lot cleaner:

class CaseInsensitiveCompare
{
    std::locale myLocale;   //  To ensure lifetime of the facet.
    std::ctype<char> const& myCType;
public:
    CaseInsensitiveCompare( std::locale const& locale = std::locale( "" ) )
        : myLocale( locale )
        , myCType( std::use_facet<std::ctype<char>>( myLocal ) )
    {
    }
    bool operator()( AbstractItem const* lhs, AbstractItem const* rhs ) const
    {
        return (*this)( lhs->title(), rhs->title() );
    }
    bool operator()( std::string const& lhs, std::string const& rhs ) const
    {
        return std::lexicographical_compare(
            lhs.begin(), lhs.end(),
            rhs.begin(), rhs.end(),
            *this);
    }
    bool operator()( char lhs, char rhs ) const
    {
        return myCType.tolower(lhs) < myCType.tolower(rhs);
    }
};

Beyond this, there are several other points which might improve performance:

  • If you're sure of the lifetime of the locale you're using (and you usually can be), drop the myLocale member in the class; copying the locale will be the most expensive part of copying instances of this class (and the call to lexicographical_compare will copy it at least once).

  • If you don't need the localization features, consider using the tolower function in <cctype>, rather than the one in <locale>. This will avoid the need of any data members at all in the comparison.

  • Finally, although I'm not sure it's worth it for something as small as 10K items, you might consider making vectors with the canonical forms of the strings (already lower cased, etc.), sorting those using just < on the strings, and then reordering the original vectors according to that.

Also, I'm very suspicious of the `new boost::timer::auto_cpu_timer'. Do you really need dynamic allocation here? Off hand, I suspect a local variable would be more appropriate.

Upvotes: 4

John Dibling
John Dibling

Reputation: 101456

Your implementation strikes me as extraordinarily inefficient. I see several problems.

You are performing a tolower on both strings within the sort comparator. Since this comparator will be called on the order of n log n times, you are going to be tolowering two strings about 40K times (?) each.

I would not want to compare strings at all. Not only is a string comparison orders of magnitude less efficient than other methods (for example integral comparison), it's also error-prone and requires that you scrub the data -- another source of inefficiency.

If however you must compare strings, scrub them before you perform the sort. That includes tolowering them. Ideally, scrub the data upon element construction. Barring that, you could even scrub it just before you call sort. Whatever you do, don't scrub it within the comparator.

Upvotes: 0

Related Questions