Reputation: 584
Is there a way to update rows in Google BigQuery when publishing from Cloud Dataprep? I can't find anything in the documentation.
I have a dataset I'm preprocessing with Dataprep that contains new rows and updated rows on every (daily) run. I would like to incrementally write it to BigQuery adding new rows and updating existing changed rows.
Upvotes: 2
Views: 267
Reputation: 126
So Trifacta (CDP) can either append, replace or create new file each time the Job is run. You're refering to an Upsert\Merge action. So this can be done by 2 steps.
Either - A. do a Union transformation with the output file (already in bigquery), then deduplicate\aggregate and choose how you'd like to merge between the duped rows. this then can be done each time a new file is made\created\updated. B. create a new file, then through parameterization import all files within the folder - unioning them and do the grouping mentioned in A.
Obviously - option A seems more simplistic.
Does that make sense?
Upvotes: 2