Otaku Kyon

Reputation: 445

BASH: Convert Unicode Hex to String

I have a text file saved on my server that contains Unicode characters written as hexadecimal escapes, like \u3010. I want to convert these escapes into human-readable characters without losing the ordinary readable text, such as "Blessed Messiah and the Tower".

\u3010Vocaloid 10\u3011Blessed Messiah and the Tower of AI\u3010Originl MV\u3011
\u3010Otomachi Una\u3011 Hate It! Hate It! Huge Ego!
\u3010Otomachi Una\u3011Melt \u3010Cover\u3011
\u3010GUMI\u3011 \u604b\u611b\u30c7\u30b3\u30ec\u30fc\u30c8 \u3010\u30aa\u30ea\u30b8\u30ca\u30ebMV\u3011

I already tried cat FILE | hexdump -v and cat FILE | iconv -f utf16, without any success. I also tried cat FILE | ascii2uni -a U -q, which mostly worked, but a few characters came out wrong, e.g.

【Otomachi Una】Melt 𰄌over】

How can I decode these escapes correctly? I would prefer commands that come built in on most Unix systems.

Upvotes: 1

Views: 4002

Answers (2)

Gilles Quénot

Reputation: 185861

One solution:

printf '%b\n' "$(<file)"

where file is the name of the file containing your Unicode text.
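
For larger files, a line-by-line variant built on printf's %b format may be more convenient than loading the whole file into one argument. The following is only a sketch: the function name decode_unicode_escapes is hypothetical, and it assumes bash 4.2 or later running in a UTF-8 locale. Note that %b also expands any other backslash escapes (such as \n or \t) that happen to appear in the input.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the answer above): decode \uXXXX escapes
# from standard input, one line at a time.
# Assumes bash 4.2+ and a UTF-8 locale.
decode_unicode_escapes() {
    local line
    # IFS= and -r keep leading whitespace and backslashes intact;
    # the || clause handles a final line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        # %b expands backslash escapes, including \uXXXX, in the argument.
        printf '%b\n' "$line"
    done
}
```

It could be called as decode_unicode_escapes < file.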

Upvotes: 2

chepner

Reputation: 532478

These are the same escape sequences that bash's built-in echo -e (bash 4.2 and later) recognizes as representing Unicode characters.

$ echo -e "$(<FILE)"
【Vocaloid 10】Blessed Messiah and the Tower of AI【Originl MV】
【Otomachi Una】 Hate It! Hate It! Huge Ego!
【Otomachi Una】Melt 【Cover】
【GUMI】 恋愛デコレート 【オリジナルMV】
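
The stray 𰄌 in the question's ascii2uni output is U+3010C, which suggests that tool read the C of "Cover" as a fifth hex digit. bash's \u escape takes at most four hex digits, so it does not run into this, while \U (up to eight digits) does. A small check, assuming bash 4.2 or later and a UTF-8 locale; the variable names are only illustrative:

```bash
#!/usr/bin/env bash
# \u consumes at most four hex digits, so the "C" of "Cover" stays a letter.
four_digit=$(printf '%b' '\u3010Cover\u3011')

# \U accepts up to eight hex digits, so the "C" is absorbed into the code point.
greedy=$(printf '%b' '\U3010Cover\u3011')

printf '%s\n' "$four_digit" "$greedy"
```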

Upvotes: 2
