Reputation: 3115
Let's assume that we have a table in BigQuery with the following data:
ColA | ColB | ColC | ColD
x | y | 1 | 5
x | y | 0 | 6
k | z | 1 | 4
ColA, ColB are String, ColC is Integer and ColD timestamp.
I need to get the latest row (LAST(ColD)
) for each unique value of ColA only if ColC is equal to 1. That is, in the above table, the result of the query should be:
ColA | ColB | ColC | ColD
k | z | 1 | 4
Rows 1 and 2 should not be included in the results, since for ColA=x
the last entry in the table (Row 2) had ColC=0
.
I have tried various queries using aggregate functions (e.g. MAX, LAST) in combination with subqueries but was not able to reach to the desired result.
Upvotes: 2
Views: 4451
Reputation: 208022
The proper name for the analytic function in SQL Standard is FIRST_VALUE
. FIRST
is aggregation function in BigQuery
So first you obtain the last ColC for each ColA ordered by the timestamp
SELECT
ColA,
ColB,
ColC,
time,
FIRST_VALUE(ColC) OVER (PARTITION BY ColA ORDER BY time DESC) AS last_Col_C
FROM (
SELECT
'x' AS ColA,
'y' AS ColB,
1 AS ColC,
5 AS time),
(
SELECT
'x' AS ColA,
'y' AS ColB,
0 AS ColC,
6 AS time),
(
SELECT
'k' AS ColA,
'z' AS ColB,
1 AS ColC,
4 AS time)
Then you further select what interests you, final query looks like:
SELECT
ColA,
ColB,
ColC,
time
FROM (
--the above query here--)
where last_Col_C=1
Upvotes: 5