kangcz

Reputation: 195

Rust escaped unicode chars to string

I'm querying an API over HTTP and getting back JSON data containing the following:

    "... Dv\\u016fr Kr\\u00e1lov\\u00e9 nad Labem a okol\\u00ed 5\\u00a0km ..."

This is what I see when I open the same request in Firefox and show raw data and also when I try to println! the output in Rust.

I would like Rust to interpret these escapes as the proper characters. I tried the following function, which I found online; it works partially, but it fails for some characters:

    pub fn normalize(json: &str) -> core::result::Result<String, Box<dyn Error>> {
        let replaced : Cow<'_, str> = regex_replace_all!(r#"\\u(.{4})"#, json, |_, num: &str| {
            let num: u32 = u32::from_str_radix(num, 16).unwrap();
            let c: char = std::char::from_u32(num).unwrap();
            c.to_string()
        });
        Ok(replaced.to_string())
    }

Output:

    Dvůr Králové nad Labem a okolí 5\u{a0}km

What's the proper way to handle such JSON data?

Upvotes: 0

Views: 2586

Answers (1)

Ultrasaurus

Reputation: 3169

It appears you have a JSON-encoded string. The same data written as a Rust string literal would look like this:

    let s = "Dv\u{016f}r Kr\u{00e1}lov\u{00e9} nad Labem a okol\u{00ed} 5\u{00a0}km";

To convert a JSON-encoded string, you can use serde (via the serde_json crate), like this:

    fn main() {
        let json_encoded = "Dv\\u016fr Kr\\u00e1lov\\u00e9 nad Labem a okol\\u00ed 5\\u00a0km";

        // Wrap the text in double quotes so it parses as a JSON string value.
        let result: Result<String, serde_json::Error> =
            serde_json::from_str(&format!("\"{}\"", json_encoded));

        match result {
            Err(e) => println!("oops: {}", e),
            Ok(s) => println!("{}", s),
        }
    }

Output:

Dvůr Králové nad Labem a okolí 5 km

see playground
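
One detail worth checking: the `\u{a0}` left over in the question's output is what Rust's `Debug` formatting (`{:?}`) prints for a non-breaking space, so the character may already have been decoded correctly and only displayed in escaped form. A small std-only sketch of the difference follows; the helper name `show_both` is made up for illustration and is not part of any library.

```rust
/// Hypothetical helper, for illustration only: returns the Display (`{}`)
/// and Debug (`{:?}`) renderings of the same string.
fn show_both(s: &str) -> (String, String) {
    // Display writes the characters as they are, including U+00A0.
    let display = format!("{}", s);
    // Debug adds surrounding quotes and escapes characters it treats as
    // non-printable, which includes the non-breaking space.
    let debug = format!("{:?}", s);
    (display, debug)
}
```

Calling `show_both("5\u{a0}km")` returns a Display string that contains the real U+00A0 character, while the Debug string spells it out as `\u{a0}` inside quotes. If the question's output came from `{:?}` or `dbg!`, switching to `{}` should be enough to see the space.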

Also, this related question might be useful: How to correctly parse JSON with Unicode escape sequences?
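
If pulling in a JSON parser is not an option, the escapes can also be decoded by hand. The following is a minimal std-only sketch, not a full JSON string decoder: it assumes the input contains only `\uXXXX` escapes (other escapes such as `\n` or `\"`, and an escaped backslash in front of `u`, are not handled). Unlike the regex version in the question, it rejects non-hex digits instead of panicking and combines surrogate pairs, which `char::from_u32` alone cannot represent. The names `decode_unicode_escapes` and `take_hex4` are hypothetical.

```rust
/// Decodes JSON-style `\uXXXX` escapes in `input`, copying all other text as is.
/// Returns `None` if an escape is malformed or a surrogate is unpaired.
/// Hypothetical helper; other JSON escapes are NOT handled.
fn decode_unicode_escapes(input: &str) -> Option<String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find("\\u") {
        // Copy the literal text in front of the escape.
        out.push_str(&rest[..pos]);
        let (unit, after) = take_hex4(&rest[pos + 2..])?;
        rest = after;
        let code = if (0xD800..0xDC00).contains(&unit) {
            // High surrogate: a `\uXXXX` low surrogate must follow directly.
            let tail = rest.strip_prefix("\\u")?;
            let (low, after) = take_hex4(tail)?;
            if !(0xDC00..0xE000).contains(&low) {
                return None;
            }
            rest = after;
            0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
        } else {
            unit
        };
        // `from_u32` yields `None` for a lone low surrogate.
        out.push(char::from_u32(code)?);
    }
    out.push_str(rest);
    Some(out)
}

/// Splits exactly four hex digits off the front of `s`.
fn take_hex4(s: &str) -> Option<(u32, &str)> {
    // `get` returns `None` if `s` is too short or index 4 splits a character.
    let hex = s.get(..4)?;
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let value = u32::from_str_radix(hex, 16).ok()?;
    Some((value, &s[4..]))
}
```

Returning `Option` rather than calling `unwrap()` lets the caller decide what to do with bad input. For real JSON responses, parsing the whole document with a JSON library remains the more robust route, since it covers every escape the format allows.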

Upvotes: 2
