Arrow

Reputation: 11

Erlang - how to convert \u0000 character to binary?

I have a problem converting a Unicode character to a binary.

Code:

Text = "\u0000partner\u0000",
Bin = term_to_binary(Text, [compressed, {minor_version,1}]),

Result:

<<131,107,0,17,117,48,48,48,48,112,97,114,116,110,101,114,117,48,48,48,48>>

But when I receive data from an external service, the payload contains:

<<0,112,97,114,116,110,101,114,0>>

It means that \u0000 is sometimes converted to <<0>>, and sometimes to <<131,107,0,17,117,48,48,48,48>> as the first character of the string and to 117,48,48,48,48 at the end of the string.

The question is: how can I convert <<0,112,97,114,116,110,101,114,0>> to "\u0000partner\u0000", or convert this string to <<0,112,97,114,116,110,101,114,0>>?

Upvotes: 1

Views: 878

Answers (2)

legoscia

Reputation: 41527

As described in the Escape Sequences section of the Erlang reference manual, Erlang doesn't support the \uXXXX escape format, only \xXX (exactly two hexadecimal digits) and \x{XXXX} (a variable number of hexadecimal digits).

As for your question:

It means that \u0000 is sometimes converted to <<0>>, and sometimes to <<131,107,0,17,117,48,48,48,48>> as the first character of the string and to 117,48,48,48,48 at the end of the string.

What's happening here is that term_to_binary creates a binary in the External Term Format. The external term format always starts with a 131 byte, followed by a type byte. 107 is the type byte for a string, whose representation starts with a two-byte big-endian length - so the 0,17 here means that the length of the string is 17 bytes. 117,48,48,48,48 stands for u0000. \u is an unknown escape sequence, so it just becomes u, and the backslash is ignored.
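
To make that byte layout concrete, here is a minimal sketch that takes such a binary apart with a bit-syntax pattern. The module name etf_peek and the function decode_string_ext are made up for illustration, and the sketch only handles the string case (type byte 107) described above:

```erlang
-module(etf_peek).
-export([decode_string_ext/1]).

%% Split an External Term Format binary that holds a string (type byte 107)
%% into its declared length and its characters.
%% 131 is the version byte, 107 the type byte, Len the two-byte
%% big-endian length, and Chars the Len bytes that follow.
decode_string_ext(<<131, 107, Len:16/big, Chars:Len/binary>>) ->
    {Len, binary_to_list(Chars)};
%% Anything else (for example the raw payload from the external service)
%% is not in this format.
decode_string_ext(_Other) ->
    error.
```

Called on the binary from the question, this should return {17, "u0000partneru0000"}, which shows that the backslashes never made it into the string. Called on <<0,112,97,114,116,110,101,114,0>> it returns error, because that payload has no 131,107 header.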

So if you want to get exactly <<0,112,97,114,116,110,101,114,0>>, you probably want list_to_binary, or perhaps unicode:characters_to_binary if you might have Unicode characters in your string:

> Text = "\x{0000}partner\x{0000}".
[0,112,97,114,116,110,101,114,0]
> list_to_binary(Text).
<<0,112,97,114,116,110,101,114,0>>
> unicode:characters_to_binary(Text).
<<0,112,97,114,116,110,101,114,0>>

Alternatively, skip the string and create the binary straight away:

> Bin = <<"\x{0000}partner\x{0000}">>.
<<0,112,97,114,116,110,101,114,0>>
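
For the other direction asked about in the question (binary to string), a sketch along these lines may help. It assumes the payload from the external service is UTF-8; the module name payload_conv and its function names are hypothetical. If the payload is just raw bytes, binary_to_list/1 would be the simpler choice.

```erlang
-module(payload_conv).
-export([to_string/1, to_binary/1]).

%% Binary -> list of code points, assuming the binary is UTF-8 encoded.
%% Note: unicode:characters_to_list/1 returns an {error, ...} or
%% {incomplete, ...} tuple instead of a list if the input is not valid UTF-8.
to_string(Bin) when is_binary(Bin) ->
    unicode:characters_to_list(Bin).

%% List of code points -> UTF-8 encoded binary.
to_binary(Str) when is_list(Str) ->
    unicode:characters_to_binary(Str).
```

With this, payload_conv:to_string(<<0,112,97,114,116,110,101,114,0>>) gives [0,112,97,114,116,110,101,114,0], which is the same list as "\x{0000}partner\x{0000}".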

Upvotes: 2

choroba

Reputation: 241758

Erlang doesn't support the \u escape. Use \x00 instead.

Text = "\x00partner\x00".
[0,112,97,114,116,110,101,114,0]
Bin = term_to_binary(Text, [compressed, {minor_version,1}]).
<<131,107,0,9,0,112,97,114,116,110,101,114,0>>
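
Note that this result still carries the four header bytes 131,107,0,9 in front of the payload. A small sketch of how the two representations relate, with a hypothetical module name etf_roundtrip and the options list left out for brevity:

```erlang
-module(etf_roundtrip).
-export([demo/0]).

demo() ->
    Text = "\x00partner\x00",
    Bin = term_to_binary(Text),
    %% Header: 131 (version), 107 (string type), 16-bit length 9.
    %% Raw is whatever follows the header.
    <<131, 107, 9:16, Raw/binary>> = Bin,
    %% Raw is the same byte sequence list_to_binary/1 produces,
    %% i.e. what the external service sends.
    Raw = list_to_binary(Text),
    %% binary_to_term/1 reverses term_to_binary/1.
    Text = binary_to_term(Bin),
    Raw.
```

etf_roundtrip:demo() returns <<0,112,97,114,116,110,101,114,0>>. So term_to_binary is the right tool only when both sides speak the External Term Format; for a plain byte payload, list_to_binary is probably what is wanted.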

Upvotes: 1
