not_rogue

Reputation: 33

Dataflow insert into BigQuery fails with large number of files for asia-northeast1 location

I am using the Cloud Storage Text to BigQuery template on Cloud Composer.

The template is launched from the Python Google API client.
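A minimal sketch of how such a launch call might look, to make the setup concrete. This is not the exact program from the question: the helper names `build_launch_body` and `launch_template`, the template path, and all bucket, project, and table values are assumptions for illustration. The parameter names are those documented for the Cloud Storage Text to BigQuery template, and the region is passed through the `location` argument of `projects.locations.templates.launch`.

```python
# Assumed public template path; a regional copy of the template may be preferable.
TEMPLATE_PATH = "gs://dataflow-templates/latest/GCS_Text_to_BigQuery"


def build_launch_body(job_name, input_pattern, output_table, temp_dir,
                      schema_path, udf_path, udf_name):
    """Build the request body for templates.launch (hypothetical helper)."""
    return {
        "jobName": job_name,
        "parameters": {
            "inputFilePattern": input_pattern,
            "JSONPath": schema_path,
            "outputTable": output_table,  # e.g. "pj:datasetname.tablename"
            "bigQueryLoadingTemporaryDirectory": temp_dir,
            "javascriptTextTransformGcsPath": udf_path,
            "javascriptTextTransformFunctionName": udf_name,
        },
        "environment": {"tempLocation": temp_dir},
    }


def launch_template(project, region, body, template_path=TEMPLATE_PATH):
    """Launch the template in the given region (hypothetical helper).

    Needs google-api-python-client and application default credentials.
    """
    from googleapiclient.discovery import build  # imported lazily

    dataflow = build("dataflow", "v1b3", cache_discovery=False)
    request = dataflow.projects().locations().templates().launch(
        projectId=project,
        location=region,  # e.g. "asia-northeast1"
        gcsPath=template_path,
        body=body,
    )
    return request.execute()


# Placeholder values only; none of these come from the original program.
body = build_launch_body(
    job_name="text-to-bq-example",
    input_pattern="gs://example-bucket/input/*.json",
    output_table="pj:datasetname.tablename",
    temp_dir="gs://example-bucket/tmp",
    schema_path="gs://example-bucket/schema.json",
    udf_path="gs://example-bucket/transform.js",
    udf_name="transform",
)
# launch_template("pj", "asia-northeast1", body)  # uncomment to submit the job
```

Keeping the body construction separate from the API call makes it easy to print and compare the exact parameters sent for each region.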

The same program

Does anybody have an idea about this? I want to run the job in the asia-northeast1 location for business reasons.


More details about the failure:

The program worked up to the "ReifyRenameInput" step, and then failed.

The Dataflow job failed with the error message below:

java.io.IOException: Unable to insert job: beam_load_textiotobigquerydataflow0releaser0806214711ca282fc3_8fca2422ccd74649b984a625f246295c_2a18c21953c26c4d4da2f8f0850da0d2_00000-0, aborting after 9 .
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:231)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:202)
    at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startCopyJob(BigQueryServicesImpl.java:196)
    at org.apache.beam.sdk.io.gcp.bigquery.WriteRename.copy(WriteRename.java:144)
    at org.apache.beam.sdk.io.gcp.bigquery.WriteRename.writeRename(WriteRename.java:107)
    at org.apache.beam.sdk.io.gcp.bigquery.WriteRename.processElement(WriteRename.java:80)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{ "code" : 404, "errors" : [ { "domain" : "global", "message" : "Not found: Dataset pj:datasetname", "reason" : "notFound" } ], "message" : "Not found: Dataset pj:datasetname" }

("pj" and "datasetname" are not the real names; they stand for the project name and dataset name used in the outputTable parameter.)

Although the error message says the dataset was not found, the dataset definitely exists.

Moreover, some new tables, which seem to be temporary tables, were created in the dataset after the program ran.

Upvotes: 0

Views: 331

Answers (1)

F10

Reputation: 2893

According to this public issue tracker, this is a known issue related to your Beam SDK version. Beam SDK version 2.5.0 does not have this issue.

Upvotes: 1
