Reputation: 223
I have a Cloud Dataflow job that's stuck in the initiation phase, before running any application logic. I tested this by adding a log output statement inside the processElement step, but it's not appearing in the logs, so it seems that step is never being reached.
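For reference, the check is just a log call at the top of processElement, roughly like this (the class and logger names are illustrative; written against the Dataflow Java SDK 1.x):

import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical DoFn used only to check whether any elements ever arrive.
public class LoggingDoFn extends DoFn<TableRow, TableRow> {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingDoFn.class);

  @Override
  public void processElement(ProcessContext c) {
    // This message never shows up in the worker logs.
    LOG.info("processElement reached: {}", c.element());
    c.output(c.element());
  }
}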
All I can see in the logs is the following message, which appears every minute:
logger: Starting supervisor: /etc/supervisor/supervisord_watcher.sh: line 36: /proc//oom_score_adj: Permission denied
And these messages, which loop every few seconds:
VM is healthy? true.
http: TLS handshake error from 172.17.0.1:38335: EOF
Job is in state JOB_STATE_RUNNING, will check again in 30 seconds.
The job ID is 2015-09-14_06_30_22-15275884222662398973, though I have two additional jobs (2015-09-14_05_59_30-11021392791304643671 and 2015-09-14_06_08_41-3621035073455045662) that I started this morning and which have the same problem.
Any ideas on what might be causing this?
Upvotes: 2
Views: 1473
Reputation: 6130
It sounds like your pipeline has a BigQuery source followed by a DoFn. Before running your DoFn (and therefore reaching your print statement), the pipeline runs a BigQuery export job to create a snapshot of the data in GCS. This ensures that the pipeline gets a consistent view of the data contained in the BigQuery tables.
It seems that the BigQuery export job for your table took a long time. Unfortunately, there is no progress indicator for the export process. If you run the pipeline again and let it run longer, the export should complete and your DoFn will start running.
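For context, a pipeline of that shape would look roughly like the sketch below (the table spec and class names are placeholders, written against the Dataflow Java SDK 1.x that was current at the time). The read step first exports the table to GCS, and only after that export finishes do the ParDo workers start receiving elements:

import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;

public class BigQueryReadPipeline {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    // Reading from BigQuery triggers an export of the table to GCS first;
    // the DoFn below only starts once that export job has completed.
    p.apply(BigQueryIO.Read.from("my-project:my_dataset.my_table"))
     .apply(ParDo.of(new DoFn<TableRow, TableRow>() {
       @Override
       public void processElement(ProcessContext c) {
         // Application logic (and any log statements) runs from here on.
         c.output(c.element());
       }
     }));

    p.run();
  }
}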
We are looking into improving the user experience of the export job as well as figuring out why it took longer than we expected.
Upvotes: 2