Nucc

Reputation: 1031

String encoding issue in Ruby

In Ruby 1.9.3-p484 I have to construct an SMPP packet, but when I pass the constructed packet's content as a string to the method that delivers it, a strange \xC2 byte appears in the content. Having investigated the issue, I found the following interesting gotcha:

"\u008E".force_encoding("BINARY")
 => "\xC2\x8E"

Why does \u008E become \xC2\x8E when I want to use binary encoding? Why not \x00\x8E?

Upvotes: 3

Views: 101

Answers (2)

Малъ Скрылевъ

Reputation: 16507

Because force_encoding only changes the encoding label on the string; it does not convert the bytes. What you see is the string exactly as it is stored in memory. The literal "\u008E" is stored as UTF-8, a multi-byte encoding, and in UTF-8 every character above \x7F takes at least two bytes. So you can see:

"\u008E".force_encoding("BINARY")
# => "\xC2\x8E"
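
To make that concrete, here is a minimal sketch (the variable names are only illustrative) showing that force_encoding leaves the bytes untouched, plus one way to get the single byte 0x8E when assembling a packet. It uses Array#pack rather than String#b, since I believe String#b only arrived in Ruby 2.0:

```ruby
# "\u008E" is always a UTF-8 string, so U+008E is stored as two bytes.
utf8 = "\u008E"
utf8_bytes = utf8.bytes.to_a                  # [0xC2, 0x8E]

# force_encoding only swaps the encoding label; the bytes stay the same.
relabeled = utf8.dup.force_encoding("BINARY")
relabeled_bytes = relabeled.bytes.to_a        # still [0xC2, 0x8E]

# To get the raw byte 0x8E, build it from its numeric value instead.
single_byte = [0x8E].pack("C")                # one-byte binary string

puts utf8_bytes.inspect
puts relabeled_bytes.inspect
puts single_byte.bytes.to_a.inspect
```

So for a binary protocol it is probably safer to build the payload with pack (or "\x8E" escapes) from the start than to write \u escapes and relabel them afterwards.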

Upvotes: 1

Guilherme

Reputation: 1146

This is a binary representation. Take a look:

At Tue, 27 Jul 2010 22:21:31 +0900, Heesob Park wrote:

I noticed String#inspect results \x{XXXX} for the encoding other than Unicode.

Is there any possibility that \x{XXXX} is accepted as an escape sequence of string?

irb(main):004:0> a = "\xC7\xD1\xB1\xDB"

This is in binary representation.

irb(main):010:0> a1 => "\x{B1DB}"

https://bugs.ruby-lang.org/issues/3619

This one is a code point representation.
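
As a rough illustration of the two views for the string from the question (a sketch only, with made-up variable names), the code point is a single number while the stored bytes are two:

```ruby
s = "\u008E"

# Code point view: one Unicode code point, U+008E (142 decimal).
code_points = s.unpack("U*")

# Byte view: the two bytes of its UTF-8 encoding.
raw_bytes = s.unpack("C*")
hex = raw_bytes.map { |b| format("%02X", b) }.join(" ")

puts code_points.inspect
puts hex
```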

Upvotes: 1
