Reputation: 475
I have a problem where I need to get the last item across various tables in PostgreSQL.
The following code works and returns me the type of the latest update and when it was last updated.
The problem is, this query needs to be used as a subquery, so I want to select both the type and the last updated value from this query and PostgreSQL does not seem to like this... (Subquery must return only one column
)
Any suggestions?
SELECT last.type, last.max FROM (
SELECT MAX(a.updated_at), 'a' AS type FROM table_a a WHERE a.ref = 5 UNION
SELECT MAX(b.updated_at), 'b' AS type FROM table_b b WHERE b.ref = 5
) AS last ORDER BY max LIMIT 1
Query is used like this inside of a CTE;
WITH sql_query as (
SELECT id, name, address, (...other columns),
last.type, last.max FROM (
SELECT MAX(a.updated_at), 'a' AS type FROM table_a a WHERE a.ref = 5 UNION
SELECT MAX(b.updated_at), 'b' AS type FROM table_b b WHERE b.ref = 5
) AS last ORDER BY max LIMIT 1
FROM table_c
WHERE table_c.fk_id = 1
)
Upvotes: 4
Views: 9488
Reputation: 69
I know this is old, but have you tried returning a user-defined type array?
In my case, I'm needing to return a row representing the users who created, updated, and own a particular item (using a full name, a nick-name, an employee id, and an email). My query wound up looking something like this:
SELECT r.role AS access, t.name, d.name AS domain,
(SELECT (u.nick_name, u.full_name, u.email, u.id)::t_user FROM users u WHERE id = t.owner_id) AS owner,
(SELECT (u.nick_name, u.full_name, u.email, u.id)::t_user FROM users u WHERE id = t.created_by) AS creator,
(SELECT (u.nick_name, u.full_name, u.email, u.id)::t_user FROM users u WHERE id = t.updated_by) AS updater
FROM task t
JOIN domain d ON t.domain_id = d.id
JOIN (SELECT * FROM getUserRoles(12345678)) r ON r.domain = d.name
;
I did this as a trial in comparison with multiple table joins and was more than pleased with the EXPLAIN ANALYZE
results, which showed a 1/3 decreased complexity and time requirement.
In your case, though, it might be useful to return an array pertaining to each table you need a result from. In my case it would look like this (although much of the meaning is lost for me and without a gain in performance):
SELECT r.role AS access, t.name, d.name AS domain,
(SELECT ARRAY[(u.nick_name, u.full_name, u.email, u.id)::t_user] FROM users u WHERE id IN (t.owner_id, t.created_by, t.updated_by)) AS users
FROM task t
JOIN domain d ON t.domain_id = d.id
JOIN (SELECT * FROM getUserRoles(12345678)) r ON r.domain = d.name
;
So how does that translate for you? Well, first we need a type to hold your types by update date:
CREATE TYPE t_last_updated_type AS (
updated_dt TIMESTAMP,
type VARCHAR
)
Then, we use comma separator notation rather than UNION
to join the tables and utilize the ARRAY
notation with the newly created user-defined TYPE
for the query. Something like this:
SELECT id, name, address, (...other columns),
last.type, last.max FROM (
SELECT ARRAY[(MAX(table_a.updated_at), 'a')::t_last_updated_type,(MAX(table_b.updated_at), 'b')::t_last_updated_type] FROM table_a,table_b WHERE ref = 5
) AS last ORDER BY max LIMIT 1
FROM table_c
WHERE table_c.fk_id = 1
It's important to note that the comma separator notation assumes columns are the same between table_a and table_b, else it may error out.
Upvotes: 0
Reputation: 35583
The inherent problem is that SQL (all SQL not just Postgres) requires that a subquery used within a select clause can only return a single value. If you think about that restriction for a while it does makes sense. The select clause is returning rows and a certain number of columns, each row.column location is a single position within a grid. You can bend that rule a bit by putting concatenations into a single position (or a single "complex type" like a JSON value) but it remains a single position in that grid regardless.
Here however you do want 2 separate columns AND you need to return both columns from the same row, so instead of LIMIT 1
I suggest using ROW_NUMBER()
instead to facilitate this:
WITH LastVals as (
SELECT type
, max_date
, row_number() over(order by max_date DESC) as rn
FROM (
SELECT MAX(a.updated_at) AS max_date, 'a' AS type FROM table_a a WHERE a.ref = 5
UNION ALL
SELECT MAX(b.updated_at) AS max_date, 'b' AS type FROM table_b b WHERE b.ref = 5
)
)
, sql_query as (
SELECT id
, name, address, (...other columns)
, (select type from lastVals where rn = 1) as last_type
, (select max_date from lastVals where rn = 1) as last_date
FROM table_c
WHERE table_c.fk_id = 1
)
By the way in your subquery you should use UNION ALL
with type being a constant like 'a' or 'b' then even if MAX(a.updated_at)
was identical for 2 or more tables, the rows would still be unique because of the difference in type. UNION
will attempt to remove duplicate rows but here it just isn't going to help, so avoid that wasted effort by using UNION ALL
.
For another way to skin this cat, consider using a LEFT JOIN
instead
SELECT id
, name, address, (...other columns)
, lastVals.type
, LastVals.last_date
FROM table_c
WHERE table_c.fk_id = 1
LEFT JOIN (
SELECT type
, last_date
, row_number() over(order by last_date DESC) as rn
FROM (
SELECT MAX(a.updated_at) AS last_date, 'a' AS type FROM table_a a WHERE a.ref = 5
UNION ALL
SELECT MAX(b.updated_at) AS last_date, 'b' AS type FROM table_b b WHERE b.ref = 5
)
) LastVals ON LastVals.rn = 1
Upvotes: 2