Reputation: 11
I have a problem converting a Unicode character to a binary.
Code:
Text = "\u0000partner\u0000"
Bin = term_to_binary(Text, [compressed, {minor_version,1}]),
Result:
<<131,107,0,17,117,48,48,48,48,112,97,114,116,110,101,114,117,48,48,48,48>>
But when I receive data from an external service, I see that the payload is:
<<0,112,97,114,116,110,101,114,0>>
It means that \u0000 is sometimes converted to <<0>>, and sometimes to <<131,107,0,17,117,48,48,48,48>> as the first character of the string and 117,48,48,48,48 at the end of the string.
The question is: how can I convert <<0,112,97,114,116,110,101,114,0>> to "\u0000partner\u0000", or convert this string to <<0,112,97,114,116,110,101,114,0>>?
Upvotes: 1
Views: 878
Reputation: 41527
As described in the Escape Sequences section of the Erlang reference manual, Erlang doesn't support the \uXXXX escape format, only \xXX (exactly two digits) and \x{XXXX} (variable number of digits).
As for your question:
It means that one time \u0000 is converted to <<0>> one time to <<131,107,0,17,117,48,48,48,48>> as a first character in sentence and 117,48,48,48,48 and the end of the sentence.
What's happening here is that term_to_binary creates a binary in the External Term Format. The external term format always starts with a 131 byte, followed by a type byte. 107 is the type byte for a string, whose representation starts with a two-byte big-endian length, so the 0,17 here means that the length of the string is 17 bytes. 117,48,48,48,48 stands for u0000. \u is an unknown escape sequence, so it just becomes u, and the backslash is ignored.
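To make that byte layout concrete, here is a minimal sketch that picks the binary apart with a bit-syntax pattern. The module name etf_demo and the function decode_string/1 are made up for illustration; it only handles the string case described above and is not a general decoder.

```erlang
-module(etf_demo).
-export([decode_string/1]).

%% Match the layout described above: version byte 131, type byte 107,
%% a 16-bit big-endian length, then exactly that many raw bytes.
decode_string(<<131, 107, Len:16/big, Chars:Len/binary>>) ->
    {ok, Len, binary_to_list(Chars)};
%% Anything else (another type byte, trailing data, ...) is rejected.
decode_string(_Other) ->
    {error, not_a_string}.
```

Calling etf_demo:decode_string/1 on the binary from the question returns {ok,17,"u0000partneru0000"}, which shows that the literal characters u0000 ended up in the string rather than a zero byte.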
So if you want to get exactly <<0,112,97,114,116,110,101,114,0>>, you probably want list_to_binary, or perhaps unicode:characters_to_binary if you might have Unicode characters in your string:
> Text = "\x{0000}partner\x{0000}".
[0,112,97,114,116,110,101,114,0]
> list_to_binary(Text).
<<0,112,97,114,116,110,101,114,0>>
> unicode:characters_to_binary(Text).
<<0,112,97,114,116,110,101,114,0>>
Alternatively, skip the string and create the binary straight away:
> Bin = <<"\x{0000}partner\x{0000}">>.
<<0,112,97,114,116,110,101,114,0>>
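The question also asks about the opposite direction, from the received payload back to a string. A small sketch of both directions follows, assuming the payload is UTF-8 (or plain ASCII, as here); the module name payload_demo and its function names are hypothetical.

```erlang
-module(payload_demo).
-export([to_binary/1, to_string/1]).

%% String (list of code points) -> raw UTF-8 binary, with no
%% External Term Format header in front of the bytes.
to_binary(Text) ->
    unicode:characters_to_binary(Text).

%% Raw binary -> string. Assumes the binary holds valid UTF-8;
%% for arbitrary bytes, binary_to_list/1 would be the alternative.
to_string(Bin) ->
    unicode:characters_to_list(Bin).
```

With this, payload_demo:to_string(<<0,112,97,114,116,110,101,114,0>>) gives [0,112,97,114,116,110,101,114,0], which is the same list as "\x00partner\x00", and payload_demo:to_binary/1 turns it back into the original payload.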
Upvotes: 2
Reputation: 241758
Erlang doesn't support the \u escape. Use \x00 instead.
Text = "\x00partner\x00".
[0,112,97,114,116,110,101,114,0]
Bin = term_to_binary(Text, [compressed, {minor_version,1}]).
<<131,107,0,9,0,112,97,114,116,110,101,114,0>>
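Note that this result still carries the 131,107,0,9 header, so it is not byte-for-byte the payload from the external service; it is meant to be read back with binary_to_term. A minimal sketch of that round trip, with a made-up module name term_roundtrip_demo:

```erlang
-module(term_roundtrip_demo).
-export([roundtrip/1]).

%% Encode a term in the External Term Format, then decode it again.
%% Returns both the encoded binary and the decoded term.
roundtrip(Term) ->
    Bin = term_to_binary(Term),
    {Bin, binary_to_term(Bin)}.
```

term_roundtrip_demo:roundtrip("\x00partner\x00") returns the encoded binary shown above together with the original list [0,112,97,114,116,110,101,114,0].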
Upvotes: 1