Reputation: 3214
I have a growing table of records, and I'd like to be able to append modified records to it. I'd then like a logical view containing only the "newest" version of each record, that is, the row with the highest modified_date for each unique primary_key. I tried a JOIN against the table with a GROUP BY primary_key, but this requires an ORDER BY modified_date over the entire table, which exceeds resources.
Upvotes: 0
Views: 162
Reputation: 33745
There is now a better way of doing this. Here is an example:
WITH T AS (
  SELECT x, y, MOD(y, 2) AS z
  FROM UNNEST([5, 4, 3, 2]) AS x WITH OFFSET y
)
SELECT
  z,
  ARRAY_AGG(x ORDER BY y LIMIT 1)[OFFSET(0)] AS top_x
FROM T
GROUP BY z;
This returns the top x value as determined by another column (here, the x with the smallest y), grouped by a third column (z). The query in the other answer could be expressed as:
WITH my_table AS (
  SELECT 1 AS primary_key, "foo" AS value, DATE('2016-11-09') AS modified_date UNION ALL
  SELECT 1, "bar", DATE('2016-11-10') UNION ALL
  SELECT 2, "baz", DATE('2016-01-01')
)
SELECT
  row.*
FROM (
  SELECT
    ARRAY_AGG(t ORDER BY modified_date DESC LIMIT 1)[OFFSET(0)] AS row
  FROM my_table AS t
  GROUP BY primary_key
);
This returns the row with the most recent modified_date for each primary_key. In theory, you should be able to use .* directly after [OFFSET(0)] without needing a subselect, but there appears to be a bug with column resolution that I'm looking into.
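To make the semantics concrete, here is a minimal Python sketch of what the ARRAY_AGG(t ORDER BY modified_date DESC LIMIT 1)[OFFSET(0)] pattern computes on the sample rows above. It is only an in-memory illustration under my own assumptions, not a description of how BigQuery executes the query; the names MY_TABLE and latest_per_key are made up for this example.

```python
from datetime import date

# Sample rows mirroring my_table from the query above.
MY_TABLE = [
    {"primary_key": 1, "value": "foo", "modified_date": date(2016, 11, 9)},
    {"primary_key": 1, "value": "bar", "modified_date": date(2016, 11, 10)},
    {"primary_key": 2, "value": "baz", "modified_date": date(2016, 1, 1)},
]


def latest_per_key(rows):
    """Keep, for each primary_key, the row with the greatest modified_date.

    Hypothetical helper: it makes a single pass and retains one row per key,
    so no sort over the whole input is required.
    """
    newest = {}
    for row in rows:
        key = row["primary_key"]
        current = newest.get(key)
        if current is None or row["modified_date"] > current["modified_date"]:
            newest[key] = row
    return list(newest.values())


if __name__ == "__main__":
    for row in latest_per_key(MY_TABLE):
        print(row["primary_key"], row["value"], row["modified_date"])
```

Only one candidate row per key is held at any time, which is the same reason the LIMIT 1 aggregate avoids ordering the entire table.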
Upvotes: 0
Reputation: 3214
You can avoid the resource explosion by specifying PARTITION BY, which lets the sort happen within each primary_key partition instead of across the entire table. This pattern suffices:
SELECT
  *
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER(PARTITION BY primary_key ORDER BY modified_date DESC) seq
  FROM
    my_table)
WHERE
  seq = 1;
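If you want to try the ROW_NUMBER() pattern without a BigQuery project, the following sketch runs an equivalent query against an in-memory SQLite database from Python. This is an assumption-laden stand-in: SQLite is not BigQuery, window functions need SQLite 3.25 or newer, the dates are stored as ISO strings, and the sample rows and function names are invented for illustration.

```python
import sqlite3

# Same shape as the pattern above; the outer SELECT lists columns explicitly
# so the helper column seq is not returned.
LATEST_ROWS_SQL = """
SELECT primary_key, value, modified_date
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY primary_key ORDER BY modified_date DESC) AS seq
  FROM my_table
)
WHERE seq = 1
ORDER BY primary_key;
"""


def make_sample_db():
    """Build an in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (primary_key INTEGER, value TEXT, modified_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?)",
        [
            (1, "foo", "2016-11-09"),
            (1, "bar", "2016-11-10"),
            (2, "baz", "2016-01-01"),
        ],
    )
    return conn


def latest_rows(conn):
    """Return the newest row per primary_key as a list of tuples."""
    return conn.execute(LATEST_ROWS_SQL).fetchall()


if __name__ == "__main__":
    for row in latest_rows(make_sample_db()):
        print(row)
```

The ISO date strings sort lexicographically in date order, which is what makes ORDER BY modified_date DESC behave correctly in this stand-in.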
Upvotes: 1