stackoverflowuser

Reputation: 15

Slicing a string with Nordic letters in Rust

I am trying to slice a string that contains Nordic letters, but it panics with this error:

'byte index 1 is not a char boundary; it is inside 'å' (bytes 0..2) of å'

fn main() {
    let str = "äåö".to_string();
    println!("{}", &str[1..]);
}

Upvotes: 1

Views: 352

Answers (1)

Finomnis

Reputation: 22476

fn main() {
    let str = "äåö".to_string();
    let slice_position = str.char_indices().nth(1).unwrap().0;
    println!("{}", &str[slice_position..]);
}
åö

The problem here is that str is indexed by bytes, but its contents are UTF-8 encoded, and ä takes more than one byte in UTF-8. Slicing at 1 would therefore cut a character in half, which makes Rust panic at runtime.
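To see the byte layout for yourself, here is a small sketch. It assumes the letters are stored in their precomposed form, which is two bytes each in UTF-8 (consistent with the "bytes 0..2" in the error message). is_char_boundary and get are standard str methods; get returns None instead of panicking when the range does not fall on character boundaries.

```rust
fn main() {
    let s = "äåö";

    // Three characters, but six bytes in UTF-8.
    println!("chars: {}, bytes: {}", s.chars().count(), s.len());

    // Byte 1 lies in the middle of 'ä'; byte 2 is where 'å' starts.
    println!("boundary at 1: {}", s.is_char_boundary(1));
    println!("boundary at 2: {}", s.is_char_boundary(2));

    // `get` is the non-panicking counterpart of `&s[..]`.
    println!("{:?}", s.get(1..));
    println!("{:?}", s.get(2..));
}
```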

str behaves this way because you can't determine the position of the n-th character without iterating over the string up to that character. UTF-8 is a variable-length encoding, meaning the byte position of a character depends on the characters before it.
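If you need this in more than one place, the lookup can be wrapped in a small helper. This is only a sketch: slice_from_char is a made-up name, not something from the standard library, and it counts Unicode scalar values (chars), not user-perceived characters. Unlike the unwrap() version above, it returns None when the string is too short.

```rust
/// Hypothetical helper: returns the part of `s` that starts at the n-th
/// char, or None if `s` has fewer than `n` chars.
fn slice_from_char(s: &str, n: usize) -> Option<&str> {
    // Walk the byte offsets at which each char starts. `s.len()` is
    // appended so that n == number of chars yields an empty slice.
    let start = s
        .char_indices()
        .map(|(byte_pos, _)| byte_pos)
        .chain(std::iter::once(s.len()))
        .nth(n)?;
    Some(&s[start..])
}

fn main() {
    let s = "äåö";
    println!("{:?}", slice_from_char(s, 1));
    println!("{:?}", slice_from_char(s, 3));
    println!("{:?}", slice_from_char(s, 4));
}
```

The cost is still linear in n, since the characters before the cut have to be walked one by one.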

Upvotes: 1
