gowrath
gowrath

Reputation: 3224

Using `std::search` over `string::find`

I had a question about the use of std::search vs string::find for dealing with strings. I know it is often better to use a class specific member function algorithm over the standard library algorithm because it can optimize based on the class, but I was wondering whether it is reasonable to, say for the sake of consistency, use std::search with the iterators rather than string::find with indices.

Would it be a sin for me to do something like that or should I just stick to string::find? Are there any huge advantages of one over the other in terms of performance or style?

Upvotes: 14

Views: 6751

Answers (2)

A. K.
A. K.

Reputation: 38272

string::find uses linear search but it is several times faster than Boyer Moore for some cases (with the latest patch). I submitted a patch (first-element then memcomp) to both libstdc++ and libc++ which improved string::find significantly. You can try the recent gcc (7.1) and you will get the improved performance. You can also measure the performance with the simple benchmarking suite I wrote: https://github.com/hiraditya/std-benchmark

Especially for smaller strings, by the time Boyer Moore is busy constructing internal data structure, (sub) linear string::find will be done. Also for parsing HTML etc., where most of the searches are mismatches, string::find should be faster.

commit fc7ebc4b8d9ad7e2891b7f72152e8a2b7543cd65
Author: redi <redi@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon Jan 9 13:05:58 2017 +0000

    PR66414 optimize std::string::find

    2017-01-09  Jonathan Wakely  <[email protected]>
            Aditya Kumar  <[email protected]>

        PR libstdc++/66414
        * include/bits/basic_string.tcc
        (basic_string::find(const CharT*, size_type, size_type)): Optimize.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244225 138bc75d-0d04-0410-961f-82ee72b054a4

PS: Using std::find will always be slower than the current std::string::find with the current implementation.

Upvotes: 9

Corristo
Corristo

Reputation: 5520

Right now (27th April 2017), at least GCCs libstdc++ (which is also used by clang by default), implements std::string::find with a linear search and thus is much slower than using

std::string_view substr{"whatever"};
auto it = std::search(s.cbegin(), s.cend(),
                      std::boyer_moore_searcher(substr.begin(), substr.end())); 

The problem is that the Boyer-Moore searcher allocates memory for internal data structures, and thus can fail with a std::bad_alloc exception. However, std::string::find is marked noexcept, so using the already implemented Boyer-Moore searcher within std::string::find isn't straight-forward.

Upvotes: 10

Related Questions