Reputation: 33
In a PostgreSQL table I have several information stored as text. It depends on the context described by a type column what type of information is stored. The application is prepared to get by only one command the Id's of the row.
I got into trouble when i tried to compare the information (bigint stored as a string) with an external value (e.g. '9' > '11'). When I tried to cast the column, the datatbase return an error (not all values in the column are castable, e.g. datetime or normal text). Also when I try to cast only the result of a query command, I get a cast error.
I get the table with the castable rows by this command:
SELECT information.id as id, item.information::bigint as item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
The resulting rows are showing up only text that is castable. When I throw it into another command it results in an error.
SELECT x.id FROM (
SELECT information.id as id, item.information::bigint as item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
) AS x
WHERE x.item > '0'::bigint
Accroding to the error, the database tried to cast all rows in the table.
Upvotes: 1
Views: 848
Reputation: 1270463
This is unfortunate. Try using a case
expression:
SELECT inf.id as id,
(CASE WHEN inf.type = 'task' THEN i.information::bigint END) as item
FROM information inf JOIN
item i
ON inf.id = i.informationid
WHERE inf.type = 'task';
There is no guarantee that the WHERE
filter is applied before the SELECT
. However, CASE
does guarantee the order of evaluation, so it is safe.
Upvotes: 1
Reputation: 704
Technically, this happens because the optimizer thinks WHERE x.item > '0'::bigint
is a much more efficient filter than information.type = 'task'. So in the table scan, the WHERE x.item > '0'::bigint
condition is chosen to be the predicate. This thinking is not wrong but will make you fall into this seemingly illogical trouble.
The suggestion by Gordon to use CASE WHEN inf.type = 'task' THEN i.information::bigint END
can avoid this, but however it may sometimes ruin your idea to put that as a sub-query and require the same condition to be written twice.
A funny trick I tried is to use OUTER APPLY:
SELECT x.* FROM (SELECT 1 AS dummy) dummy
OUTER APPLY (
SELECT information.id as id, item.information::bigint AS item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
) x
WHERE x.item > '0'::bigint
Sorry that I only verified the SQL Server version of this. I understand PostgreSQL has no OUTER APPLY, but the equivalent should be:
SELECT x.* FROM (SELECT 1 AS dummy) dummy
LEFT JOIN LATERAL (
SELECT information.id as id, item.information::bigint AS item
FROM information
INNER JOIN item
ON information.id = item.informationid
WHERE information.type = 'task'
) x ON true
WHERE x.item > '0'::bigint
(reference is this question)
Finally, a more tidy but less flexible method is add the optimizer hint to turn off it to force the optimizer to run the query as how it is written.
Upvotes: 2