sam

Reputation: 21

No such container (using worker_harness_container_image)

I'm trying to run an Apache Beam Job on Google Cloud Dataflow (Job-ID: 2020-06-08_23_39_43-14062032727466654144) using the flags

--experiment=beam_fn_api 
--worker_harness_container_image=gcr.io/PROJECT_NAME/apachebeamp3.7_imageconversion:latest

Unfortunately, the job is stuck in the starting state. A job with the exact same configuration ran at the beginning of this year (February?), and I'm wondering what has changed since then and what changes are needed on my side to get it running again.

If I run the job locally with

--runner=PortableRunner \
--job_endpoint=embed \
--environment_config=PROJECT_NAME/apachebeamp3.7_imageconversion:latest

it runs perfectly.
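For context, the full local invocation looks roughly like this (pipeline.py is a placeholder for the actual pipeline script; requires apache-beam and Docker locally):

```shell
# Placeholder script name — substitute the real pipeline entry point
python pipeline.py \
  --runner=PortableRunner \
  --job_endpoint=embed \
  --environment_type=DOCKER \
  --environment_config=PROJECT_NAME/apachebeamp3.7_imageconversion:latest
```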

In the Dataflow logs, I see the following error messages:

getPodContainerStatuses for pod "dataflow-beamapp-sam-0609063936-65-06082339-h464-harness-zzpb_default(a65b24a783afd25920bf29ff27d7baf8)" failed: rpc error: code = Unknown desc = Error: No such container: 586554fec1cf2942c7d2f45589db02b217c90c2ea96982041fc3f12b4b6595ff

and

ContainerStatus "1647b951d266b4b1d318317b1836002eb4731a510dffa38ba6b58b45a7710784" from runtime service failed: rpc error: code = Unknown desc = Error: No such container: 1647b951d266b4b1d318317b1836002eb4731a510dffa38ba6b58b45a7710784

I'm a bit puzzled by the container IDs, since gcr.io/PROJECT_NAME/apachebeamp3.7_imageconversion:latest currently has the digest 8bdf43f9cdcd20d4c258a7810c81cb5214ecc984e534117ef8ba1a4cab2a3dae.
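In case it helps to double-check: the digest that the :latest tag currently resolves to can be read from the registry (assuming the gcloud CLI is installed):

```shell
# Show the digest the :latest tag currently points to in Container Registry
gcloud container images describe \
  gcr.io/PROJECT_NAME/apachebeamp3.7_imageconversion:latest
```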

Questions:

Edit: Additional information based on the question below:

Thanks for the pointers. I have looked at the dataflow.googleapis.com/kubelet logs. The only errors I see there are

Strangely, I do not see a worker-startup category in the log viewer. What do I need to do to see those log entries so I can take the next step on this debugging journey? :-)

Upvotes: 1

Views: 1905

Answers (3)

Dyjah

Reputation: 101

For me, the problem was fixed by removing the --experiments=use_runner_v2 option when running the pipeline.

Upvotes: 0

Rohith Uppala

Reputation: 46

I was having a similar issue, getting ContainerStatus xxxxx from runtime service failed and Error syncing pod.

I was trying to read data from a file and process it as a streaming application. Once I removed options.setStreaming(true), it worked properly.

Streaming is for unbounded data, such as reading from Pub/Sub or Kafka; batch is for bounded data, such as reading from a database or a file.
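In the Python SDK (which the question uses), the equivalent switch is the streaming flag on StandardOptions; a minimal sketch, assuming apache-beam is installed:

```python
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

options = PipelineOptions()
# Leave streaming off for bounded inputs such as files or database reads;
# enable it only for unbounded sources like Pub/Sub or Kafka.
options.view_as(StandardOptions).streaming = False
```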

Upvotes: 0

sam

Reputation: 21

Turns out I made multiple mistakes:

  • In my Dockerfile, I needed to change FROM apachebeam/python3.7_sdk:latest to FROM apache/beam_python3.7_sdk:latest. According to https://hub.docker.com/r/apachebeam/python3.7_sdk, the image name changed from version 2.20.0 onwards.
  • My Dockerfile didn't install the matching version of the apache-beam Python package.
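Putting both fixes together, a minimal Dockerfile sketch (2.22.0 is an assumed version here; the point is that the base image tag and the pip-installed package must agree):

```dockerfile
# Assumed version pin — base image and Python package versions must match
FROM apache/beam_python3.7_sdk:2.22.0
RUN pip install --no-cache-dir apache-beam[gcp]==2.22.0
```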

Upvotes: 1
