Siva Samraj
Siva Samraj

Reputation: 37

Spark Structured Streaming from Kafka to Elastic Search

I want to write a Spark Streaming Job from Kafka to Elasticsearch. Here I want to detect the schema dynamically while reading it from Kafka.

Can you help me to do that.?

I know, this can be done in Spark Batch Processing via below line.

val schema = spark.read.json(dfKafkaPayload.select("value").as[String]).schema

But while executing the same via Spark Streaming Job, we cannot do the above since streaming can have only on Action.

Please let me know.

Upvotes: 0

Views: 255

Answers (1)

Enes Uğuroğlu
Enes Uğuroğlu

Reputation: 407

If you are listening from kafka topic you can not rely on spark to automaticly infer json schema since it will take a lot of time. So somehow you need to provide your schema to your application.

If you are listening from file source you can do that though.

'spark.sql.streaming.schemaInference', 'true'

Upvotes: 1

Related Questions