shantanuo

Reputation: 32306

Atomic inserts in BigQuery

When I load more than one CSV file, how does BigQuery handle errors?

bq load --max_bad_records=30 dbname.finalsep20xyz gs://sep20new/abc.csv.gz,gs://sep20new/xyzcsv.gz

A few of the files in the batch job may fail to load because their column count will not match the expected schema, but I still want to load the rest of the files. If abc.csv fails, will xyz.csv still be loaded, or will the entire job fail with no records inserted?

I tried with dummy records but could not conclusively determine how errors across multiple files are handled.
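(One way to see how each file was handled is to inspect the finished job, where per-file errors are listed; <job_id> below is a placeholder for the ID that bq load prints.)

bq ls -j
bq show --format=prettyjson -j <job_id>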

Upvotes: 2

Views: 1633

Answers (1)

Jordan Tigani

Reputation: 26617

Loads are atomic -- either all files commit or no files do. You can break the loads up into multiple jobs if you want them to complete independently. An alternative would be to set max_bad_records to something much higher.
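As a rough sketch of both options, reusing the file and table names from the question (the 1000 in option 2 is an arbitrary illustrative value):

# Option 1: one job per file, so each file commits or fails independently
bq load --max_bad_records=30 dbname.finalsep20xyz gs://sep20new/abc.csv.gz
bq load --max_bad_records=30 dbname.finalsep20xyz gs://sep20new/xyzcsv.gz

# Option 2: a single atomic job that tolerates many more bad rows per file
bq load --max_bad_records=1000 dbname.finalsep20xyz gs://sep20new/abc.csv.gz,gs://sep20new/xyzcsv.gz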

We would still prefer that you launch fewer jobs with more files, since we have more flexibility in how we handle the imports. That said, recent changes to load quotas mean that you can submit more simultaneous load jobs, and still higher quotas are planned soon.
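For the "fewer jobs with more files" pattern, a single load job can also be pointed at a wildcard URI rather than a comma-separated list, assuming every matching file shares the table's schema:

bq load --max_bad_records=30 dbname.finalsep20xyz gs://sep20new/*.csv.gz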

Also please note that all BigQuery actions that modify BQ state (load, copy, query with a destination table) are atomic; the only job type that isn't atomic is extract, since there is a chance that it might fail after having written out some of the exported data.
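For illustration, the other atomic job types look roughly like this on the command line (the destination table names and the SELECT are placeholders):

# Table copy: the destination is either fully written or left untouched
bq cp dbname.finalsep20xyz dbname.finalsep20xyz_backup

# Query with a destination table: results commit atomically
bq query --destination_table=dbname.finalsep20xyz_summary 'SELECT ...'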

Upvotes: 4
