jadi
jadi

Reputation: 15

Get latest balance by day per user via SQL (Bigquery)

I have a sample table below:

+---------------------+---------+--------+
| timestamp           | id      | balance|
|---------------------+---------+--------|
| 2022-08-01 00:00:00 | 1       | 0.01   |
| 2022-08-01 00:00:00 | 2       | 0      | 
| 2022-08-02 00:00:00 | 2       | 0.005  |
| 2022-08-02 07:00:00 | 2       | 0.5    |
| 2022-08-02 07:00:00 | 1       | 0.15   |
| 2022-08-03 00:00:00 | 1       | 0.02   |
| 2021-08-03 01:00:00 | 1       | 0.03   |
+---------------------+---------+--------+

And I want the following output:

+---------------------+---------+--------+
| timestamp           | id      | balance|
|---------------------+---------+--------|
| 2022-08-01 00:00:00 | 1       | 0.01   |
| 2022-08-01 00:00:00 | 2       | 0      | 
| 2022-08-02 07:00:00 | 2       | 0.5    |
| 2022-08-02 07:00:00 | 1       | 0.15   |
| 2021-08-03 01:00:00 | 1       | 0.03   |
+---------------------+---------+--------+

I want to get the latest balance for each id, per day. Ideally the sql would use a window function as the source data table is very large (~900GB) and I need something efficient that will scale.

I've spent a lot of time trying to figure this out and keep getting stuck, any help would be greatly appreciated! Thanks in advance

Upvotes: 1

Views: 42

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Consider below simple approach

select * from your_table
qualify 1 = row_number() over id_day
window id_day as (partition by id, date(timestamp) order by timestamp desc)

Upvotes: 1

Related Questions