machazthegamer
machazthegamer

Reputation: 613

Bigquery snapshots when base table gets overwritten

Context

I have an ETL process that keeps overwritng all rows of a table in bq by deleting all first then inserting new ones. I'm looking for a data back up design that can be triggered regularly on that table.

Issue

I'm concerned about the cost implications of using snapshots for this kind of table.

What exactly am I worried about?

On each drop and recreation of the base table, the new data has many rows that are identical to previous row, some new rows and some updated rows. However, the data gets inserted in a different sort order each time.

So when bq is creating a snapshot, by looking for rows that have changed, will it know that that some previous rows are still in the base table and have only changed position in order to avoid increased storage costs on the snapshot?

Upvotes: 0

Views: 536

Answers (1)

Blakey
Blakey

Reputation: 843

Have you thought about using merge statements?

These can deal with inserts, updates and even deletes in one query.

An example here https://querystash.com/query/62cf51097d57d7579954c0d418afc063

Upvotes: 1

Related Questions