Naty Bizz
Naty Bizz

Reputation: 2342

Postgres - decode special characters

I have words like this encoded: "cizaña", the encoded result is 63697A61F161

When I try to convert to 'cizaña' again

select decode('63697A61F161'::text, 'hex')

I get:

"ciza\361a"

What can I do? I tried to set set client_encoding to 'UTF-8'; without luck

Upvotes: 1

Views: 4148

Answers (1)

Craig Ringer
Craig Ringer

Reputation: 325141

the encoded result is 63697A61F161

"encoded" how? I think you care confusing text encoding with representation format of binary data.

63697A61F161 is the iso-8859-1 ("latin-1") encoding of the text "cizaña" with the binary represetned as hex octets.

decode('63697A61F161', 'hex') produces the bytea value '\x63697A61F161' if bytea_encoding is hex or 'ciza\361a' if bytea_encoding is escape. Either way, it's a representation of a binary string, not text.

If you want text, you must decode the text encoding into the current database text encoding with convert_from, e.g.

test=> select convert_from(decode('63697A61F161', 'hex'), 'iso-8859-1');
 convert_from 
--------------
 cizaña
(1 row)

This should help explain:

demo=> select convert_from(BYTEA 'ciza\361a', 'iso-8859-1');
 convert_from 
--------------
 cizaña
(1 row)

See? 'ciza\361a' is an octal-escape representation of the binary data for the iso-8859-1 encoding of the text 'cizaña'. It's the exact same value as the bytea hex-format value '\x63697A61F161':

demo=> select convert_from(BYTEA '\x63697A61F161', 'iso-8859-1');
 convert_from 
--------------
 cizaña
(1 row)

So:

decode and encode transform text-string representations of binary data to and from bytea literals, Postgres binary objects. Which are output in a text form for display.h The encoding/decoding here is one of binary representation e.g. hex or base64.

convert_from and convert_to take binary data and apply text encoding processing to convert it to or from the local native database text encoding, producing text strings. The encoding here is text encoding.

It's ... not easy to follow at first. You might have to learn more about text encodings.

Upvotes: 2

Related Questions