Reputation: 48198
It can be useful to iterate over multiple variables at once, overlapping (slice::windows
), or not (slice::chunks
).
This only works for slices; is it possible to do this for iterators, using tuples for convenience?
Something like the following could be written:
for (prev, next) in some_iter.windows(2) {
...
}
If not, could it be implemented as a trait on existing iterators?
Upvotes: 57
Views: 47236
Reputation: 431809
It's possible to take chunks of an iterator using Itertools::tuples
, up to a 12-tuple:
use itertools::Itertools; // 0.10.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuples() {
println!("{}--{}", prev, next);
}
}
1--2
3--4
5--6
If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer
to access any leftovers:
use itertools::Itertools; // 0.10.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5].into_iter();
let mut t = some_iter.tuples();
for (prev, next) in t.by_ref() {
println!("{}--{}", prev, next);
}
for leftover in t.into_buffer() {
println!("{}", leftover);
}
}
1--2
3--4
5
It's also possible to take up to 12-tuple windows with Itertools::tuple_windows
:
use itertools::Itertools; // 0.10.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuple_windows() {
println!("{}--{}", prev, next);
}
}
1--2
2--3
3--4
4--5
5--6
If you need to get partial chunks / windows, you can get
Upvotes: 51
Reputation: 71440
This is possible using the standard library only with arrays (not tuples but similar and even better convenience), using the Iterator::array_chunks()
and Iterator::map_windows()
functions. Note there isn't a general array_windows()
function, because such function won't be zero-cost (it will have to clone the elements). Instead, you are provided with a reference to an array of elements, and if you want the behavior of array_windows()
you can just clone it.
Upvotes: 0
Reputation: 5459
Since Rust 1.51 this is possible with const generics where the iterator yields constant size arrays [T; N]
for any N
.
I built the itermore
crate which implements both of these, providing the array_chunks()
and array_windows()
methods under separate extension traits.
use itermore::IterArrayChunks; // 0.7
for [a, b, c] in some_iter.by_ref().array_chunks() {
...
}
let rem = some_iter.into_remainder();
use itermore::IterArrayWindows; // 0.7
for [prev, next] in some_iter.array_windows() {
...
}
Using the example given in the Itertools
answer:
use itermore::IterArrayChunks; // 0.7
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for [prev, next] in some_iter.array_chunks() {
println!("{}--{}", prev, next);
}
}
This outputs
1--2
3--4
5--6
Most times the array size can be inferred but you can also specific it explicitly. Additionally, any reasonable size N
can be used, there is no limit like in the Itertools
case.
use itermore::IterArrayWindows; // 0.7
fn main() {
let mut iter = vec![1, 2, 3, 4, 5, 6].into_iter().array_windows::<5>();
println!("{:?}", iter.next());
println!("{:?}", iter.next());
println!("{:?}", iter.next());
}
This outputs
Some([1, 2, 3, 4, 5])
Some([2, 3, 4, 5, 6])
None
Note: array_windows()
uses clone to yield elements multiple times so its best used for references and cheap to copy types.
The chunks version is now available on nightly under the name array_chunks
#![feature(iter_array_chunks)]
for [a, b, c] in some_iter.array_chunks() {
...
}
And it handles remainders nicely:
#![feature(iter_array_chunks)]
for [a, b, c] in some_iter.by_ref().array_chunks() {
...
}
let rem = some_iter.into_remainder();
The iterator versions are particularly useful if you are not iterating over some contiguous collection. However, depending on your use-case you might find that collecting into a Vec
first and using the slice methods might be faster even including the time allocate the iterator into a Vec
. This is particularly true in the case of array_windows
where the elements need to be cloned.
Upvotes: 20
Reputation: 300209
TL;DR: The best way to have chunks
and windows
on an arbitrary iterator/collection is to first collect
it into a Vec
and iterate over that.
The exact syntax requested is impossible in Rust.
The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).
This is why chunks
and windows
return a sub-slice by the way; the number of elements in a &[T]
is not part of the type and therefore can be decided at run-time.
Let's pretend you asked for: for slice in some_iter.windows(2)
instead then.
Where would the storage backing this slice live?
It cannot live:
LinkedList
doesn't have a contiguous storage Iterator::Item
, there is no lifetime availableSo, unfortunately, slices can only be used when the backing storage is a slice.
If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item>
as the Item
of the chunking iterator.
struct Chunks<I: Iterator> {
elements: Vec<<I as Iterator>::Item>,
underlying: I,
}
impl<I: Iterator> Chunks<I> {
fn new(iterator: I, size: usize) -> Chunks<I> {
assert!(size > 0);
let mut result = Chunks {
underlying: iterator, elements: Vec::with_capacity(size)
};
result.refill(size);
result
}
fn refill(&mut self, size: usize) {
assert!(self.elements.is_empty());
for _ in 0..size {
match self.underlying.next() {
Some(item) => self.elements.push(item),
None => break,
}
}
}
}
impl<I: Iterator> Iterator for Chunks<I> {
type Item = Vec<<I as Iterator>::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.elements.is_empty() {
return None;
}
let new_elements = Vec::with_capacity(self.elements.len());
let result = std::mem::replace(&mut self.elements, new_elements);
self.refill(result.len());
Some(result)
}
}
fn main() {
let v = vec!(1, 2, 3, 4, 5);
for slice in Chunks::new(v.iter(), 2) {
println!("{:?}", slice);
}
}
Will return:
[1, 2] [3, 4] [5]
The canny reader will realize that I surreptitiously switched from windows
to chunks
.
windows
is more difficult, because it returns the same element multiple times which require that the element be Clone
. Also, since it needs returning a full Vec
each time, it will need internally to keep a Vec<Vec<Iterator::Item>>
.
This is left as an exercise to the reader.
Finally, a note on performance: all those allocations are gonna hurt (especially in the windows
case).
The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).
It's called collect::<Vec<_>>()
in Rust.
And since the Vec
has a chunks
and windows
methods (by virtue of implementing Deref<Target=[T]>
), you can then use that instead:
for slice in v.iter().collect::<Vec<_>>().chunks(2) {
println!("{:?}", slice);
}
for slice in v.iter().collect::<Vec<_>>().windows(2) {
println!("{:?}", slice);
}
Sometimes the best solutions are the simplest.
Upvotes: 36