Reputation: 867
I'm brand new to TimescaleDB. I can do this fine in plain SQL, but I'm not sure whether there is a better way to do it in TimescaleDB.
I have the following table:
| DATE | MERCHANT_ID | COUNTRY_CODE | PRODUCT_ID | INVENTORY |
|---------------------|-------------|--------------|------------|-----------|
| 2024-03-20 08:00:00 | 1 | US | 1 | 150 |
| 2024-03-20 08:00:00 | 1 | US | 2 | 220 |
| 2024-03-20 08:00:00 | 1 | US | 3 | 380 |
| 2024-03-20 08:00:00 | 1 | US | 4 | 400 |
| 2024-03-20 09:00:00 | 1 | US | 1 | 180 |
| 2024-03-20 09:00:00 | 1 | US | 2 | 200 |
| 2024-03-20 09:00:00 | 1 | US | 3 | 310 |
| 2024-03-20 09:00:00 | 1 | US | 4 | 440 |
| 2024-03-20 10:00:00 | 1 | US | 1 | 120 |
| 2024-03-20 10:00:00 | 1 | US | 2 | 240 |
| 2024-03-20 10:00:00 | 1 | US | 3 | 360 |
| 2024-03-20 10:00:00 | 1 | US | 4 | 410 |
| 2024-03-20 11:00:00 | 1 | US | 1 | 130 |
| 2024-03-20 11:00:00 | 1 | US | 2 | 260 |
| 2024-03-20 11:00:00 | 1 | US | 3 | 390 |
| 2024-03-20 11:00:00 | 1 | US | 4 | 520 |
| 2024-03-20 08:00:00 | 2 | UK | 1 | 150 |
| 2024-03-20 08:00:00 | 2 | UK | 2 | 250 |
| 2024-03-20 08:00:00 | 2 | UK | 3 | 350 |
| 2024-03-20 08:00:00 | 2 | UK | 4 | 450 |
| 2024-03-20 09:00:00 | 2 | UK | 1 | 160 |
| 2024-03-20 09:00:00 | 2 | UK | 2 | 270 |
| 2024-03-20 09:00:00 | 2 | UK | 3 | 380 |
| 2024-03-20 09:00:00 | 2 | UK | 4 | 490 |
| 2024-03-20 10:00:00 | 2 | UK | 1 | 170 |
| 2024-03-20 10:00:00 | 2 | UK | 2 | 280 |
| 2024-03-20 10:00:00 | 2 | UK | 3 | 390 |
| 2024-03-20 10:00:00 | 2 | UK | 4 | 500 |
| 2024-03-20 11:00:00 | 2 | UK | 1 | 180 |
| 2024-03-20 11:00:00 | 2 | UK | 2 | 290 |
| 2024-03-20 11:00:00 | 2 | UK | 3 | 400 |
| 2024-03-20 11:00:00 | 2 | UK | 4 | 510 |
What I'm trying to do is select the most recent inventory number for each key (MERCHANT_ID, COUNTRY_CODE, PRODUCT_ID), where PRODUCT_ID comes from some list. A key may go without an inventory record for an undetermined amount of time, and this table will likely grow to many millions of rows.
So, for example, from the table above, if I am looking at `MERCHANT_ID=1, COUNTRY_CODE=US, PRODUCT_ID in (1, 2, 3)`, I would want:
| MERCHANT_ID | COUNTRY_CODE | PRODUCT_ID | INVENTORY |
|-------------|--------------|------------|-----------|
| 1 | US | 1 | 130 |
| 1 | US | 2 | 260 |
| 1 | US | 3 | 390 |
What would be the most performant way to get this data? I'm working with a blank slate, so currently I only have an index on the date column. I can add indexes or create continuous aggregate materialized views if that is better. I will eventually be building a time series from this table, but for now I only want the most recent value.
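For reference, the plain-SQL baseline I have in mind looks roughly like the sketch below. It uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; the table name `inventory` and the lowercase column names are assumptions, and only a few of the rows above are loaded. On PostgreSQL/TimescaleDB the same result could also be written with `DISTINCT ON (...) ORDER BY ..., date DESC`.

```python
import sqlite3

# In-memory stand-in database; "inventory" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    " date TEXT, merchant_id INTEGER, country_code TEXT,"
    " product_id INTEGER, inventory INTEGER)"
)

# A subset of the sample rows from the table above.
rows = [
    ("2024-03-20 10:00:00", 1, "US", 1, 120),
    ("2024-03-20 10:00:00", 1, "US", 2, 240),
    ("2024-03-20 10:00:00", 1, "US", 3, 360),
    ("2024-03-20 11:00:00", 1, "US", 1, 130),
    ("2024-03-20 11:00:00", 1, "US", 2, 260),
    ("2024-03-20 11:00:00", 1, "US", 3, 390),
    ("2024-03-20 11:00:00", 2, "UK", 1, 180),
]
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?, ?, ?)", rows)

# Rank rows per key from newest to oldest, then keep only the newest one.
LATEST_SQL = """
SELECT merchant_id, country_code, product_id, inventory
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY merchant_id, country_code, product_id
        ORDER BY date DESC
    ) AS rn
    FROM inventory
    WHERE merchant_id = ? AND country_code = ? AND product_id IN (?, ?, ?)
) AS ranked
WHERE rn = 1
ORDER BY product_id
"""

latest = conn.execute(LATEST_SQL, (1, "US", 1, 2, 3)).fetchall()
print(latest)
```

This works, but it ranks every matching row for each key, which is the part I expect to get slow as the table grows.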
Upvotes: 0
Views: 80
Reputation: 1412
First and simplest attempt:
Add a composite index on time, MERCHANT_ID, COUNTRY_CODE, PRODUCT_ID.
If that's not enough, you can build a very lean hypertable with just three columns: `time`, `id`, `value`. In your case, `value` is the inventory and `id` is a composition of the other columns.
I covered several models in this video if you want to learn more: https://www.youtube.com/watch?v=sUgXZwK_sz0
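A minimal sketch of the lean `time`, `id`, `value` layout, using Python's built-in `sqlite3` as a stand-in so it can be run without a database server. The table name `inventory_lean`, the `make_id` helper, the `:` separator, and the `(id, time DESC)` index order are all assumptions made for illustration. On TimescaleDB the table would additionally be converted to a hypertable, for example with `create_hypertable`.

```python
import sqlite3


def make_id(merchant_id, country_code, product_id):
    # Hypothetical composite identifier; the format is an assumption.
    return f"{merchant_id}:{country_code}:{product_id}"


conn = sqlite3.connect(":memory:")
# Lean layout: one timestamp, one identifier, one value.
conn.execute("CREATE TABLE inventory_lean (time TEXT, id TEXT, value INTEGER)")
# Assumed index order: identifier first, newest timestamp first within it.
conn.execute(
    "CREATE INDEX inventory_lean_id_time ON inventory_lean (id, time DESC)"
)

# A few rows from the question's table: (time, merchant, country, product, inventory).
samples = [
    ("2024-03-20 10:00:00", 2, "UK", 1, 170),
    ("2024-03-20 11:00:00", 2, "UK", 1, 180),
    ("2024-03-20 10:00:00", 2, "UK", 2, 280),
    ("2024-03-20 11:00:00", 2, "UK", 2, 290),
]
conn.executemany(
    "INSERT INTO inventory_lean VALUES (?, ?, ?)",
    [(t, make_id(m, c, p), v) for t, m, c, p, v in samples],
)


def latest_value(conn, series_id):
    # Newest row for one identifier, or None if the identifier has no rows.
    row = conn.execute(
        "SELECT value FROM inventory_lean WHERE id = ? ORDER BY time DESC LIMIT 1",
        (series_id,),
    ).fetchone()
    return row[0] if row else None


latest_by_id = {
    make_id(2, "UK", p): latest_value(conn, make_id(2, "UK", p)) for p in (1, 2, 3)
}
print(latest_by_id)
```

Each lookup touches a single identifier and stops at the first row in time-descending order, so a product with no records (product 3 here) simply comes back as `None`.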
Upvotes: 0