Reputation: 2065
I have read the latest draft, where lazy_split_view has been added.
Later on, I realized that split_view was renamed to lazy_split_view, and that split_view itself was redesigned.
libstdc++ has also implemented this recently; it is available in GCC trunk: https://godbolt.org/z/9qG5T9n5h
I have a simple naive program here that shows the usage of two views, but I can't see their differences:
#include <iostream>
#include <ranges>
#include <string>

int main() {
    // Note the two spaces before "four": they produce the empty word in the output.
    std::string str { "one two three  four" };
    for (auto word : str | std::views::split(' ')) {
        for (char ch : word)
            std::cout << ch;
        std::cout << '.';
    }
    std::cout << '\n';
    for (auto word : str | std::views::lazy_split(' ')) {
        for (char ch : word)
            std::cout << ch;
        std::cout << '.';
    }
}
Output:
one.two.three..four.
one.two.three..four.
I only noticed a difference when I bound each word to a std::span<const char> in both loops.
In the first one, std::views::split:
for (std::span<const char> word : str | std::views::split(' '))
the compiler accepts my code.
In the second one, std::views::lazy_split:
for (std::span<const char> word : str | std::views::lazy_split(' '))
compilation fails.
I know there will be differences between these two, but I can't easily spot them. Is this a defect report in C++20 or a new feature in C++23 (with changes), or both?
Upvotes: 10
Views: 3051
Reputation: 9825
I've looked at the relevant paper (P2210R2 by Barry Revzin): split_view has been renamed to lazy_split_view. The new split_view is different in that its result type preserves the category of the source range.
For example, our string str is a contiguous range, so split yields contiguous subranges. Previously it would give you at most a forward range. That is a problem if you need random access, the size of a piece, or the address of the underlying storage.
From the example of the paper:
std::string str = "1.2.3.4";
auto ints = str
    | std::views::split('.')
    | std::views::transform([](auto v){
        int i = 0;
        std::from_chars(v.data(), v.data() + v.size(), i);
        return i;
    });
will work now, but
std::string str = "1.2.3.4";
auto ints = str
    | std::views::lazy_split('.')
    | std::views::transform([](auto v){
        int i = 0;
        // v.data() doesn't exist
        std::from_chars(v.data(), v.data() + v.size(), i);
        return i;
    });
won't, because the range v is only a forward range, which doesn't provide a data() member.
I was under the impression that split must be lazy as well (laziness was one of the selling points of the ranges proposal after all), so I made a little experiment:
#include <iostream>
#include <ranges>
#include <string>

// Counts how often the transform is invoked, i.e. how many chars were read.
struct CallCount {
    int i = 0;
    auto operator()(auto c) {
        i++;
        return c;
    }
    ~CallCount() {
        if (i > 0) // there are a lot of copies made when the range is constructed
            std::cout << "number of calls: " << i << "\n";
    }
};

int main() {
    std::string str = "1 3 5 7 9 1";
    std::cout << "split_view:\n";
    for (auto word : str | std::views::transform(CallCount{}) | std::views::split(' ') | std::views::take(2)) {
    }
    std::cout << "lazy_split_view:\n";
    for (auto word : str | std::views::transform(CallCount{}) | std::views::lazy_split(' ') | std::views::take(2)) {
    }
}
This code prints (note that the transform operates on each char in the string):
split_view:
number of calls: 6
lazy_split_view:
number of calls: 4
So what happens?
Indeed, both views are lazy, but they differ in how lazy they are. The transform that I put in front of split just counts how many times it has been called. As it turns out, split computes the next item eagerly, while lazy_split stops as soon as it hits the whitespace after the current item.
You can see that the string str consists of numbers that also mark their char index (starting at 1). The take(2) should stop the loop after we've seen '3' in str. And indeed lazy_split stops at the whitespace after '3', but split stops at the whitespace after '5'.
This essentially means that split fetches its next item eagerly instead of lazily. This difference probably won't matter most of the time, but it can affect performance-critical code.
I don't know whether this was a reason for the change (I haven't read the paper in full detail).
Upvotes: 13