Ashok KS
Ashok KS

Reputation: 691

Create current version of BQ table from history

I have a BigQuery table which gets updated with rows by a batch job every one hour. The job appends rows that are modified in an external system and would be in the format below.

Supplier_ID Supplier_Name Supplier_Contact Last_Modified
123 ABC 03 483 394 2023-05-01 12:34:56
124 ABD 02 848 939 2023-05-01 12:34:56
123 ABC 03 483 345 2023-05-02 10:45:05
124 ABD 02 848 837 2023-05-02 10:45:05
123 ABC 03 478 102 2023-05-08 11:09:15

I want to create a current version of the table like below which shows the latest record of all the Suppliers with only one record per Supplier_ID.

Supplier_ID Supplier_Name Supplier_Contact Last_Modified
123 ABC 03 478 102 2023-05-08 11:09:15
124 ABD 02 848 837 2023-05-02 10:45:05

The size if the table would be in millions. What would be the best way to achieve this? I don't want to create a view since the size of the table would keep growing and the systems which would consume this new table needs a faster response time.

Option 1: Run another batch job and create the table having the max of the timestamp for each Supplier_ID. But the drawback would be creating the entire table even if only one record got updated.

Option 2: Creating materialized views. I haven't worked on this till now. But if this is better then I would explore this.

Looking for your suggestions on the best way.

Upvotes: 0

Views: 81

Answers (0)

Related Questions