Gaurav Thakur

Reputation: 49

Apache Beam on Cloud Dataflow - Failed to query Cadvisor

I have a Cloud Dataflow job that reads from Pub/Sub and writes to BigQuery (BQ). Recently the job has been reporting the error below and is no longer writing any data to BQ.

{
 insertId:  "3878608796276796502:822931:0:1075"  
 jsonPayload: {
  line:  "work_service_client.cc:490"   
  message:  "gcpnoelevationcall-01211413-b90e-harness-n1wd Failed to query CAdvisor at URL=<IPAddress>:<PORT>/api/v2.0/stats?count=1, error: INTERNAL: Couldn't connect to server"   
  thread:  "231"   
 }
 labels: {
  compute.googleapis.com/resource_id:  "3878608796276796502"   
  compute.googleapis.com/resource_name:  "gcpnoelevationcall-01211413-b90e-harness-n1wd"   
  compute.googleapis.com/resource_type:  "instance"   
  dataflow.googleapis.com/job_id:  "2018-01-21_14_13_45"   
  dataflow.googleapis.com/job_name:  "gcpnoelevationcall"   
  dataflow.googleapis.com/region:  "global"   
 }
 logName:  "projects/poc/logs/dataflow.googleapis.com%2Fshuffler"  
 receiveTimestamp:  "2018-01-21T22:41:40.053806623Z"  
 resource: {
  labels: {
   job_id:  "2018-01-21_14_13_45"    
   job_name:  "gcpnoelevationcall"    
   project_id:  "poc"    
   region:  "global"    
   step_id:  ""    
  }
  type:  "dataflow_step"   
 }
 severity:  "ERROR"  
 timestamp:  "2018-01-21T22:41:39.524005Z"  
}

Any ideas on how I could fix this? Has anyone faced a similar issue before?

Upvotes: 1

Views: 142

Answers (1)

Guillem Xercavins

Reputation: 7058

If this happened only once, it can be attributed to a transient issue. The process running on the worker node can't reach cAdvisor: either the cAdvisor container is not running, or there is a temporary problem on the worker that prevents it from contacting cAdvisor, and the job gets stuck.
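To judge whether the error is a one-off or recurring, one option is to count how often it appears per worker in the exported log entries. The following is only a rough sketch: it assumes the entries have already been loaded as Python dicts shaped like the log entry in the question, and the function name `count_cadvisor_errors` and the sample data are hypothetical, not part of any Dataflow or Cloud Logging API.

```python
from collections import Counter

# Substring taken from the error message shown in the question.
CADVISOR_MARKER = "Failed to query CAdvisor"


def count_cadvisor_errors(entries):
    """Count cAdvisor query failures per worker.

    `entries` is assumed to be a list of dicts with the same layout as the
    log entry in the question (jsonPayload.message and labels).
    """
    counts = Counter()
    for entry in entries:
        message = entry.get("jsonPayload", {}).get("message", "")
        if CADVISOR_MARKER in message:
            worker = entry.get("labels", {}).get(
                "compute.googleapis.com/resource_name", "unknown"
            )
            counts[worker] += 1
    return counts


# Hypothetical sample: one matching entry and one unrelated entry.
sample_entries = [
    {
        "jsonPayload": {
            "message": "gcpnoelevationcall-01211413-b90e-harness-n1wd "
                       "Failed to query CAdvisor at URL=<IPAddress>:<PORT>"
                       "/api/v2.0/stats?count=1, error: INTERNAL: "
                       "Couldn't connect to server",
        },
        "labels": {
            "compute.googleapis.com/resource_name":
                "gcpnoelevationcall-01211413-b90e-harness-n1wd",
        },
        "severity": "ERROR",
    },
    {"jsonPayload": {"message": "some other log line"}, "labels": {}},
]

for worker, n in count_cadvisor_errors(sample_entries).items():
    print(f"{worker}: {n} cAdvisor failure(s)")
```

A single hit on one worker would be consistent with a transient issue; repeated hits on the same worker over a long period suggest that the cAdvisor container on that worker is not running.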

Upvotes: 2
