Reputation: 3224
I had a question about the use of std::search
vs string::find
for dealing with strings. I know it is often better to use a class specific member function algorithm over the standard library algorithm because it can optimize based on the class, but I was wondering whether it is reasonable to, say for the sake of consistency, use std::search
with the iterators rather than string::find
with indices.
Would it be a sin for me to do something like that or should I just stick to string::find? Are there any huge advantages of one over the other in terms of performance or style?
Upvotes: 14
Views: 6751
Reputation: 38272
string::find
uses linear search but it is several times faster than Boyer Moore for some cases (with the latest patch). I submitted a patch (first-element then memcomp) to both libstdc++ and libc++ which improved string::find
significantly. You can try the recent gcc (7.1) and you will get the improved performance. You can also measure the performance with the simple benchmarking suite I wrote:
https://github.com/hiraditya/std-benchmark
Especially for smaller strings, by the time Boyer Moore is busy constructing internal data structure, (sub) linear string::find will be done. Also for parsing HTML etc., where most of the searches are mismatches, string::find should be faster.
commit fc7ebc4b8d9ad7e2891b7f72152e8a2b7543cd65
Author: redi <redi@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Mon Jan 9 13:05:58 2017 +0000
PR66414 optimize std::string::find
2017-01-09 Jonathan Wakely <[email protected]>
Aditya Kumar <[email protected]>
PR libstdc++/66414
* include/bits/basic_string.tcc
(basic_string::find(const CharT*, size_type, size_type)): Optimize.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244225 138bc75d-0d04-0410-961f-82ee72b054a4
PS: Using std::find
will always be slower than the current std::string::find
with the current implementation.
Upvotes: 9
Reputation: 5520
Right now (27th April 2017), at least GCCs libstdc++
(which is also used by clang by default), implements std::string::find
with a linear search and thus is much slower than using
std::string_view substr{"whatever"};
auto it = std::search(s.cbegin(), s.cend(),
std::boyer_moore_searcher(substr.begin(), substr.end()));
The problem is that the Boyer-Moore searcher allocates memory for internal data structures, and thus can fail with a std::bad_alloc
exception. However, std::string::find
is marked noexcept
, so using the already implemented Boyer-Moore searcher within std::string::find
isn't straight-forward.
Upvotes: 10