Reputation: 13940
This example, where intuition expects "foo bar", shows the strange behaviour:
SELECT 'foo'|| '#'::char ||'bar' -- char ok, foo#bar
SELECT 'foo'|| '#' ||'bar' -- literal ok
SELECT 'foo'|| '#'::text ||'bar' -- text ok
SELECT 'foo'|| ' '::char ||'bar' -- STRANGE! LOSTING SPACE!
SELECT ('foo'|| ' '::char ||'bar')='foobar' -- yes, it is true... strange
SELECT 'foo'|| ' '::text ||'bar' -- text OK
SELECT 'foo'|| (' '::char)::text ||'bar' -- char-to-text lost!
SELECT 'foo'|| ' ' ||'bar' -- literal OK
Why does PostgreSQL do that? It is not intuitive, and seems error-prone behaviour.
PS: where does the PostgreSQL guide say (it needs a red alert) something about this?
Upvotes: 2
Views: 178
Reputation: 50200
This is one of many reasons it's often recommended to stick with varchar()
or, in Postgres text
types instead.
SQL standard mandates that CHAR() values are padded with space to fill the remaining bytes. For instance:
'A'::CHAR(5)
Will result in "A "
being stored. Now if have another field of different length, but same content:
'A'::CHAR10 = 'A'::CHAR(5)
We would want this to say TRUE
, right? So the spaces added for padding to fill the CHAR()
have to be trimmed.
This is less of a Postgres question then it is a disk storage or sql standard question. Something has to be written to those bytes on the disk and space
is the standard. Some DB's only trim for comparison or conversion, and others, like Postgres, trim for nearly any function.
Since you are casting a space to a CHAR(1)
(1 being the default length when none is specified, although it doesn't matter for this quesstion) your space gets lost as padding. This is just one of the caveats of using CHAR()
. It's a damned if you do, damned if you don't situation.
Instead cast that thing to a VARCHAR()
or TEXT
as they are nearly always superior types to CHAR()
.
Noted in the Postgresql documentation:
Tip: There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs and slower sorting. In most situations text or character varying should be used instead.
Upvotes: 4