Create current version of BQ table from history

Question

I have a BigQuery table which gets updated with rows by a batch job every one hour. The job appends rows that are modified in an external system and would be in the format below.

Supplier_ID	Supplier_Name	Supplier_Contact	Last_Modified
123	ABC	03 483 394	2023-05-01 12:34:56
124	ABD	02 848 939	2023-05-01 12:34:56
123	ABC	03 483 345	2023-05-02 10:45:05
124	ABD	02 848 837	2023-05-02 10:45:05
123	ABC	03 478 102	2023-05-08 11:09:15

I want to create a current version of the table like below which shows the latest record of all the Suppliers with only one record per Supplier_ID.

Supplier_ID	Supplier_Name	Supplier_Contact	Last_Modified
123	ABC	03 478 102	2023-05-08 11:09:15
124	ABD	02 848 837	2023-05-02 10:45:05

The size if the table would be in millions. What would be the best way to achieve this? I don't want to create a view since the size of the table would keep growing and the systems which would consume this new table needs a faster response time.

Option 1: Run another batch job and create the table having the max of the timestamp for each Supplier_ID. But the drawback would be creating the entire table even if only one record got updated.

Option 2: Creating materialized views. I haven't worked on this till now. But if this is better then I would explore this.

Looking for your suggestions on the best way.

Create current version of BQ table from history

Answers (0)

Related Questions