bash - replacing a number with a Unicode character using sed

Question

So I have this output generated from printf

Now I want to pipe it through sed to replace the 0s and 1s with Unicode characters, so that Unicode characters are printed instead of the binary digits (011010).

I can do this by copy-pasting the characters themselves, but I would rather use values like the ones found in a Unicode table:

    Position: 0x2701
    Decimal: 9985
    Symbol: ✁

How do I use the above values with sed to generate the character?

rici · Accepted Answer

With bash (since v4.2) or zsh, the simple solution is to use the $'...' syntax, which understands C escapes including \u escapes:

    $ echo 011010 | sed $'s/1/\u2701/g'
    0✁✁0✁0
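Since the question asks about both digits, the same quoting extends to two substitutions. The following is only a sketch: U+2702 (✂) for 0 is an arbitrary choice, and `zero`, `one`, and `binary_to_symbols` are hypothetical names. It uses the `\x` byte escapes that `$'...'` also understands rather than `\u`, on the assumption that you want the result not to depend on the shell's locale.

```shell
#!/usr/bin/env bash
# $'...' turns the \x escapes into raw bytes before sed runs,
# so sed itself does not need to understand any escape syntax.
zero=$'\xE2\x9C\x82'   # UTF-8 bytes of U+2702 (arbitrary choice for 0)
one=$'\xE2\x9C\x81'    # UTF-8 bytes of U+2701

# Hypothetical helper: replace every 0 and 1 read from stdin.
binary_to_symbols() {
    sed -e "s/0/$zero/g" -e "s/1/$one/g"
}

echo 011010 | binary_to_symbols   # prints ✂✁✁✂✁✂
```

The order of the two substitutions does not matter here, because neither replacement contains an ASCII 0 or 1.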

If you have GNU sed, you can use escape sequences in the s// command. GNU sed, unfortunately, does not understand \u Unicode escapes, but it does understand \x hex escapes. However, to get it to decode them, you need to make sure that sed, not the shell, sees the backslashes. Then you can do the translation in UTF-8, assuming you know the UTF-8 byte sequence corresponding to the Unicode codepoint:

    $ # Quote the argument
    $ echo 011010 | sed 's/1/\xE2\x9C\x81/g'
    0✁✁0✁0
    $ # Or escape the backslashes
    $ echo 011010 | sed s/1/\\xE2\\x9C\\x81/g
    0✁✁0✁0
    $ # This doesn't work because the \ is removed by bash before sed sees it
    $ echo 011010 | sed s/1/\xE2\x9C\x81/g
    0xE2x9Cx81xE2x9Cx810xE2x9Cx810
    $ # So that was the same as: sed s/1/xE2x9Cx81/g
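If you only have the code point as a number, as in the question (0x2701 or 9985), the UTF-8 bytes can be worked out with shell arithmetic. This is a minimal sketch, assuming bash; `utf8_escapes` is a hypothetical helper name, not something bash or sed provides. It covers U+0000 to U+10FFFF and makes no attempt to reject invalid code points such as surrogates.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the UTF-8 encoding of a code point
# as a string of \xHH escapes, suitable for GNU sed's s// command.
utf8_escapes() {
    local cp=$(( $1 ))   # accepts decimal (9985) or hex (0x2701)
    if (( cp < 0x80 )); then
        printf '\\x%02X' "$cp"
    elif (( cp < 0x800 )); then
        printf '\\x%02X' $(( 0xC0 | (cp >> 6) )) $(( 0x80 | (cp & 0x3F) ))
    elif (( cp < 0x10000 )); then
        printf '\\x%02X' $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) $(( 0x80 | (cp & 0x3F) ))
    else
        printf '\\x%02X' $(( 0xF0 | (cp >> 18) )) \
            $(( 0x80 | ((cp >> 12) & 0x3F) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) $(( 0x80 | (cp & 0x3F) ))
    fi
}

utf8_escapes 0x2701; echo   # prints \xE2\x9C\x81

# With GNU sed, the escapes can be dropped straight into the replacement:
echo 011010 | sed "s/1/$(utf8_escapes 9985)/g"
```

The double quotes around the sed script keep the backslashes intact, so sed sees them just as in the quoted form above.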
