Joseph N

Reputation: 560

Error while using Dataflow Kafka to BigQuery template

I am using the Dataflow Kafka to BigQuery template. After launching the Dataflow job, it stays queued for some time and then fails with the error below:

Error occurred in the launcher container: Template launch failed. See console logs.

When looking at the logs, I see the following stack trace:

at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:192) 
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) 
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303) 
at com.google.cloud.teleport.v2.templates.KafkaToBigQuery.run(KafkaToBigQuery.java:343) 
at com.google.cloud.teleport.v2.templates.KafkaToBigQuery.main(KafkaToBigQuery.java:222) 
Caused by: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata

While launching the job, I provided the parameters below (a sketch of the equivalent launch command follows the list):

  1. Kafka topic name
  2. Bootstrap server name
  3. BigQuery output table name
  4. Service account email
  5. Zone
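
For reference, a launch with those parameters can be reproduced from the CLI. This is a minimal sketch, not the exact command used: the template path, region, project, dataset, broker address, and service account are placeholders, and the parameter names (bootstrapServers, inputTopics, outputTableSpec) are the ones documented for the Google-provided Kafka to BigQuery flex template:

    gcloud dataflow flex-template run "kafka-to-bq-test" \
      --template-file-gcs-location gs://dataflow-templates-us-central1/latest/flex/Kafka_to_BigQuery \
      --region us-central1 \
      --service-account-email my-sa@my-project.iam.gserviceaccount.com \
      --worker-zone us-central1-a \
      --parameters bootstrapServers=10.128.0.5:9092 \
      --parameters inputTopics=my-topic \
      --parameters outputTableSpec=my-project:my_dataset.my_table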

My Kafka topic contains only one message: hello

Kafka is installed on a GCP instance that is in the same zone and subnet as the Dataflow workers.

Upvotes: 2

Views: 1212

Answers (2)

Joseph N

Reputation: 560

Hey, thanks for the help. I was trying to access Kafka with the internal IP; it worked when I changed it to the public IP. Since I am running both the Kafka machines and the workers in the same subnet, it should also work with the internal IP... I am checking that now.

Upvotes: 0

Sameer Abhyankar

Reputation: 261

Adding this here as an answer for posterity:

"Timeout expired while fetching topic metadata" indicates that the the Kafka client is unable to connect to the broker(s) to fetch the metadata. This could be due to various reasons such as the worker VMs unable to talk to the broker (are you talking over public or private ips? Check incoming firewall settings if using public ips). It could also be due to an incorrect port or due to the broker requiring SSL connections. One way to confirm is to install the Kafka client on a GCE VM in the same subnet as Dataflow workers and then verify that the kafka client can connect to the Kafka brokers.

Refer to [1] for configuring the SSL settings of the Kafka client (which you can test using the CLI on a GCE instance; a minimal client config is sketched after the link). The team that manages the broker(s) can tell you whether they require SSL connections.

[1] https://docs.confluent.io/platform/current/kafka/authentication_ssl.html#clients
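
If SSL does turn out to be required, a minimal client configuration along the lines of [1] might look like the sketch below; every path and password is a placeholder, and port 9093 is only the conventional SSL listener port, not necessarily what your brokers use:

    # client-ssl.properties (all values are placeholders)
    cat > client-ssl.properties <<'EOF'
    security.protocol=SSL
    ssl.truststore.location=/var/private/ssl/client.truststore.jks
    ssl.truststore.password=changeit
    EOF

    # Pass the config to the CLI tools when testing an SSL listener:
    bin/kafka-topics.sh --bootstrap-server 10.128.0.5:9093 --list \
      --command-config client-ssl.properties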

Upvotes: 5
