user293895

Reputation: 1527

Select the latest event for each distinct pair of two columns in BigQuery

I have a BigQuery table with a schema like so:

{
  {"name": "timeCreated", "type": "datetime"},
  {"name": "userid", "type": "string"},
  {"name": "textid", "type": "string"},
  {"name": "textvalue", "type": "float"}
}

I am trying to write a query that returns the row with the latest timeCreated for each combination of userid and textid. I have tried GROUP BY and similar approaches, but I cannot work out how to order by timeCreated and then discard every row that is not the top one for each userid and textid pair.

Upvotes: 2

Views: 702

Answers (3)

Francis Lan

Reputation: 3

To add to the accepted answer: if you only need the latest timestamp for each pair, and not the rest of the row, you can use MAX instead of ARRAY_AGG.

SELECT 
  userid,
  textid,
  MAX(timeCreated) AS latest
FROM test_table
GROUP BY userid, textid
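
If the textvalue recorded at that latest timestamp is needed as well, one option is ANY_VALUE with a HAVING MAX clause. The following is a minimal sketch, not a definitive solution; the inline test_table rows are made-up sample data included only so the query runs on its own.

```sql
-- Hypothetical sample rows, defined inline for illustration only.
WITH test_table AS (
  SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.1 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 03:00:00', 'user1', 'text1', 1.2 UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00', 'user1', 'text2', 1.4 UNION ALL
  SELECT DATETIME '2020-11-01 00:00:00', 'user2', 'text1', 1.6
)
SELECT
  userid,
  textid,
  MAX(timeCreated) AS latest,
  -- Pick textvalue from a row where timeCreated is the maximum in the group.
  ANY_VALUE(textvalue HAVING MAX timeCreated) AS latest_textvalue
FROM test_table
GROUP BY userid, textid
```

If two rows in a group share the same maximum timeCreated, ANY_VALUE may return the textvalue of either one.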

Upvotes: 0

Sergey Geron

Reputation: 10162

To get the latest (last) or earliest (first) element of a group in Google BigQuery, you can use ARRAY_AGG with [OFFSET(0)] and the appropriate ORDER BY direction (DESC for the latest, ASC for the earliest):

WITH test_table AS (
  SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.1 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 03:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.2 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.3 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00' AS timeCreated, 'user1' AS userid, 'text2' AS textid, 1.4 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text2' AS textid, 1.5 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 00:00:00' AS timeCreated, 'user2' AS userid, 'text1' AS textid, 1.6 AS textvalue
)
SELECT 
  userid,
  textid,
  ARRAY_AGG(timeCreated ORDER BY timeCreated DESC LIMIT 1)[OFFSET(0)] AS latest
FROM test_table
GROUP BY userid, textid


Upvotes: 4

Mikhail Berlyant

Reputation: 172994

The query below is for BigQuery Standard SQL. It returns the entire latest row for each userid and textid pair:

#standardSQL
select as value array_agg(t order by timeCreated desc limit 1)[offset(0)]
from `project.dataset.table` t
group by userid, textid
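
For comparison, the same "latest row per pair" result can also be reached with the ROW_NUMBER window function instead of ARRAY_AGG. This is a sketch of a different technique from the one above, under the assumption that any one of several rows tied on timeCreated is acceptable; the inline test_table rows and the rn and ranked names are hypothetical and exist only to make the query self-contained.

```sql
-- Hypothetical sample rows, defined inline for illustration only.
WITH test_table AS (
  SELECT DATETIME '2020-11-01 01:00:00' AS timeCreated, 'user1' AS userid, 'text1' AS textid, 1.1 AS textvalue UNION ALL
  SELECT DATETIME '2020-11-01 03:00:00', 'user1', 'text1', 1.2 UNION ALL
  SELECT DATETIME '2020-11-01 02:00:00', 'user1', 'text2', 1.4 UNION ALL
  SELECT DATETIME '2020-11-01 00:00:00', 'user2', 'text1', 1.6
),
ranked AS (
  SELECT
    *,
    -- Number the rows in each (userid, textid) group, newest first.
    ROW_NUMBER() OVER (PARTITION BY userid, textid ORDER BY timeCreated DESC) AS rn
  FROM test_table
)
-- Keep only the newest row per group and drop the helper column.
SELECT * EXCEPT (rn)
FROM ranked
WHERE rn = 1
```

Replacing test_table with the real table reference and switching DESC to ASC would give the earliest row per pair instead.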

Upvotes: 1
