Reputation: 65
My situation is:
I want to schedule regular data updates from Cloud SQL into BigQuery.
The data of the table on Cloud SQL is updated regularly and can edit the old data in that table. There are about 20 columns in this table.
When it comes to the update schedule I want to sync between Tables in Cloud SQL and BigQuery.
How can I add new data, update recently edited data and delete data that no longer exist in CLoud SQL in BigQuery?
Currently I use the way to override this table every time I go to the update schedule.
I still do not have a really good solution to save data when querying.
Upvotes: 0
Views: 980
Reputation: 2612
For this what you can do is to set Cloud SQL as an external Data Source on Big Query.
This way the data will keep updated automatically in Big Query as it resides in cloud SQL. This will actually lower your billing amounts as data won't be duplicated however the queries done to an external data source are a bit slower than when data resides in BigQuery.
To do this you can follow the guide here and I'm coping the general steps on this post.
project-id:location-id:instance-id
This will link the data in your cloud SQL instance to Big Query so you can use the BigQuery motor to analyze data in your cloud SQL instance without needing to copy data to BigQuery and making sure it is always updayted.
Upvotes: 1