Andrey Bienkowski

Reputation: 1715

How to iterate over prefixes and suffixes of str or String in Rust?

I have a string: "abcd" and I want to:

Upvotes: 2

Views: 1262

Answers (1)

Andrey Bienkowski

Reputation: 1715

Strings are more complicated than one might expect

  • To match human intuition you usually want to treat a string as a sequence of 0 or more grapheme clusters.
  • A grapheme cluster is a sequence of 1 or more Unicode code points.
  • In the UTF-8 encoding a code point is represented as a sequence of 1, 2, 3 or 4 bytes.
  • Both String and str in Rust use UTF-8 to represent strings, and their indexes are byte offsets.
  • Slicing through the middle of a code point makes no sense, because the result would not be valid UTF-8. Rust panics instead:
#[cfg(test)]
mod tests {
    #[test]
    #[should_panic(expected = "byte index 2 is not a char boundary; it is inside '\\u{306}' (bytes 1..3) of `y̆`")]
    fn bad_index() {
        let y = "y̆";
        let _ = &y[2..]; // byte 2 is inside the 2-byte code point U+0306
    }
}
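If panicking is not acceptable, the standard library also offers non-panicking checks. The following is a minimal sketch, not part of the original answer; the helper names `checked_suffix` and `char_boundaries` are made up for illustration and only wrap `str::get` and `str::is_char_boundary`.

```rust
/// Hypothetical helper: returns the suffix starting at byte offset `start`,
/// or `None` when `start` is past the end of `s` or falls inside a
/// multi-byte code point. `str::get` performs both checks.
pub fn checked_suffix(s: &str, start: usize) -> Option<&str> {
    s.get(start..)
}

/// Hypothetical helper: lists every byte offset at which `s` can be
/// sliced without panicking (offset 0, the start of each code point,
/// and `s.len()`).
pub fn char_boundaries(s: &str) -> Vec<usize> {
    (0..=s.len()).filter(|&i| s.is_char_boundary(i)).collect()
}
```

For "y\u{306}" (the letter y followed by a combining breve) the valid offsets are 0, 1 and 3, so `checked_suffix("y\u{306}", 2)` returns `None` where `&y[2..]` would panic.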

A solution

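Warning: this code works at the code point level and is unaware of grapheme clusters, so a cluster made of several code points (such as "y̆") can be split between a prefix and the rest of the string.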

From shortest to longest:

use core::iter;

pub fn prefixes(s: &str) -> impl Iterator<Item = &str> + DoubleEndedIterator {
    s.char_indices()
        .map(move |(pos, _)| &s[..pos])
        .chain(iter::once(s))
}

pub fn suffixes(s: &str) -> impl Iterator<Item = &str> + DoubleEndedIterator {
    s.char_indices()
        .map(move |(pos, _)| &s[pos..])
        .chain(iter::once(""))
        .rev()
}

In reverse:

prefixes(s).rev()
suffixes(s).rev()
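To make the ordering concrete, here is a small sketch that collects the results into vectors. The two iterator functions are repeated from the answer only so that the snippet compiles on its own; `collect_both` is a hypothetical helper added for illustration.

```rust
use core::iter;

// Same definitions as in the answer, repeated so this snippet is self-contained.
pub fn prefixes(s: &str) -> impl Iterator<Item = &str> + DoubleEndedIterator {
    s.char_indices()
        .map(move |(pos, _)| &s[..pos])
        .chain(iter::once(s))
}

pub fn suffixes(s: &str) -> impl Iterator<Item = &str> + DoubleEndedIterator {
    s.char_indices()
        .map(move |(pos, _)| &s[pos..])
        .chain(iter::once(""))
        .rev()
}

/// Hypothetical helper: collects all prefixes and all suffixes of `s`,
/// each ordered from shortest to longest.
pub fn collect_both(s: &str) -> (Vec<&str>, Vec<&str>) {
    (prefixes(s).collect(), suffixes(s).collect())
}
```

For "abcd" this yields the prefixes "", "a", "ab", "abc", "abcd" and the suffixes "", "d", "cd", "bcd", "abcd". Both lists include the empty string and the full string. For "y\u{306}" the prefixes are "", "y" and "y\u{306}", which shows the grapheme cluster being split at the code point boundary.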

test

See also: How to iterate prefixes or suffixes of vec or slice in rust?

Upvotes: 5
