Kvazios

Reputation: 126

Replace double backslash in Dart

I have this escaped string:

\u0414\u043B\u044F \u043F\u0440\u043E\u0434\u0430\u0436\u0438 \u043D\u0435\u0434\u0432\u0438\u0436\u0438\u043C\u043E\u0441\u0442\u0438

If I do:

print('\u0414\u043B\u044F \u043F\u0440\u043E\u0434\u0430\u0436\u0438 \u043D\u0435\u0434\u0432\u0438\u0436\u0438\u043C\u043E\u0441\u0442\u0438');

The console shows:

Для продажи недвижимости

But if I get a doubly escaped string from the server:

\\u0414\\u043B\\u044F \\u043F\\u0440\\u043E\\u0434\\u0430\\u0436\\u0438 \\u043D\\u0435\\u0434\\u0432\\u0438\\u0436\\u0438\\u043C\\u043E\\u0441\\u0442\\u0438

and then try to replace the double backslashes:

var result = string.replaceAll(new RegExp(r'\\'), r'\');

the characters are not decoded, and the string is printed still escaped:

print(result);

Console:

\u0414\u043B\u044F \u043F\u0440\u043E\u0434\u0430\u0436\u0438 \u043D\u0435\u0434\u0432\u0438\u0436\u0438\u043C\u043E\u0441\u0442\u0438

How can I remove those redundant backslashes?

Upvotes: 3

Views: 3088

Answers (1)

cbracken

Reputation: 3800

In string literals in Dart source files, \u0414 is an escape sequence that the compiler turns into a single Unicode code point. The data returned from the server is different: it is just a string containing backslashes, the letter u, and hex digits, which happens to look like a series of Unicode escape sequences. Nothing decodes it at runtime.
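
The difference can be seen by comparing a normal literal with a raw one. This is only a small illustration; the raw literal stands in for text that arrives at runtime, and the names `literal` and `raw` are made up for the example.

```dart
// A normal literal: the compiler decodes \u0414 into one code point.
const String literal = '\u0414';

// A raw literal: the six characters \, u, 0, 4, 1, 4 are kept as they are,
// which is what a string received from a server looks like.
const String raw = r'\u0414';

void main() {
  print(literal.length); // 1
  print(raw.length); // 6
  print(literal == raw); // false
}
```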

The ideal fix is to have your server return the UTF-8 string you'd like to display rather than a string that uses Dart's string literal syntax that you need to parse. Writing a proper parser for such strings is fairly involved. You can take a look at unescapeCodeUnits in the Dart SDK for an example.

A very inefficient (not to mention entirely hacky and unsafe for real-world use) means of decoding this particular string would be to extract the string representations of the Unicode code points with a RegExp, parse the hex to an int, then use String.fromCharCode().

Note: the following code is absolutely not safe for production use and doesn't match other valid Dart code point literals such as \u{1f601}, or reject entirely invalid literals such as \uffffffffff.

// Match \u0123 substrings (note this will match invalid codepoints such as \u123456789).
final RegExp r = RegExp(r'\\\\u([0-9a-fA-F]+)');

// Sample string to parse.
final String source = r'\\u0414\\u043B\\u044F \\u043F\\u0440\\u043E\\u0434\\u0430\\u0436\\u0438 \\u043D\\u0435\\u0434\\u0432\\u0438\\u0436\\u0438\\u043C\\u043E\\u0441\\u0442\\u0438';

// Replace each \u0123 with the decoded codepoint.
final String decoded = source.replaceAllMapped(r, (Match m) {
  // Extract the parenthesised hex string. '\\u0123' -> '0123'.
  // group(1) is nullable under null safety; it is always set when the pattern matches.
  final String hexString = m.group(1)!;

  // Parse the hex string to an int.
  final int codepoint = int.parse(hexString, radix: 16);

  // Convert codepoint to string.
  return String.fromCharCode(codepoint);
});
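
A slightly stricter variant of the same idea is sketched below, assuming the input only ever contains four-digit \uXXXX escapes preceded by one or two literal backslashes. The helper name `decodeUnicodeEscapes` is hypothetical and not part of the Dart SDK. It still does not handle \u{...} escapes or any other escape sequence, so the caveats above apply.

```dart
// Hypothetical helper, not an SDK function.
// Matches one or two literal backslashes, 'u', then exactly four hex digits.
final RegExp _escape = RegExp(r'\\{1,2}u([0-9a-fA-F]{4})');

String decodeUnicodeEscapes(String input) {
  return input.replaceAllMapped(_escape, (Match m) {
    // Exactly four hex digits always fit in one UTF-16 code unit.
    final int codeUnit = int.parse(m.group(1)!, radix: 16);
    return String.fromCharCode(codeUnit);
  });
}

void main() {
  const String source = r'\\u0414\\u043B\\u044F';
  print(decodeUnicodeEscapes(source)); // Для
}
```

Fixing the width at four digits means a sequence such as \u123456789 is decoded as \u1234 followed by the plain text 56789, rather than being parsed as one oversized number.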

Upvotes: 6
