Rashida

Reputation: 491

MongoDB to BigQuery Dataflow job failure

When I run the Dataflow job, the read-documents step takes a very long time and then starts logging timeout errors. I got four timeout errors and then finally a long error message, and this part caught my attention:

'If the logs only contain generic timeout errors related to accessing external resources, such as MongoDB, verify that the worker service account has permission to access the resource's subnetwork'

This is running with the default service account, which has the Editor role. So what other role is necessary? I am pretty new to Dataflow. Please help.

Upvotes: 0

Views: 312

Answers (1)

Dhiraj Singh

Reputation: 536

This particular error is explained in the documentation: the Dataflow batch runner retries failed work items, and if the same work item fails four times, the pipeline fails.

Dataflow work items can expire when the worker is unable to send a progress update to the Dataflow service for more than two minutes. This can happen for various reasons, for example:

  • Insufficient memory in the Dataflow worker for the progress reporting mechanism to operate properly.

  • Worker running out of memory or swap space.

To resolve this, you can try increasing the memory and disk space of the workers by switching the machine type, e.g. --worker_machine_type=m1-ultramem-40 --disk_size_gb=5, or by using workers with more memory (for example, n1-highmem-2).
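As a rough illustration, here is a minimal sketch of how those worker options can be passed to a Dataflow Python pipeline through PipelineOptions. The project, region, bucket, machine type, and disk size below are placeholder values, not settings taken from your job:

    # Minimal sketch: raising worker memory and disk for a Dataflow Python job.
    # Project, region, bucket, machine type and disk size are placeholders.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-project",                 # placeholder project ID
        region="us-central1",                 # placeholder region
        temp_location="gs://my-bucket/temp",  # placeholder staging bucket
        machine_type="n1-highmem-2",          # worker type with more memory
        disk_size_gb=100,                     # placeholder per-worker disk size
    )

    with beam.Pipeline(options=options) as pipeline:
        # Replace this with your MongoDB read and BigQuery write transforms.
        pipeline | beam.Create(["placeholder"]) | beam.Map(print)

The same settings can also be passed as command-line flags to the pipeline script instead of being hard-coded in PipelineOptions.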

Please note that increasing memory and disk space will affect billed cost. Also, the Dataflow Python SDK starts one worker process per VM core, so the available memory is split between multiple worker processes (for example, roughly 6 GB per process on n1-highmem-2).

Upvotes: 1
