lucy

Reputation: 4506

Limit data consumed from Kafka in Spark Streaming

I am working on a Spark Streaming project in which Spark consumes data from Kafka. I want to limit the number of records Spark Streaming consumes, because there is a very large volume of data on Kafka. I am using the spark.streaming.kafka.maxRatePerPartition=1 property to limit the rate, but I am still getting 13,400 messages in a 5-minute batch. My Spark program cannot handle more than 1,000 messages per 5 minutes. The Kafka topic has 3 partitions; since maxRatePerPartition is records per second per partition, I expected at most 1 × 3 × 300 = 900 records per batch. My Spark driver has 5 GB of memory and there are 3 executors with 3 GB each. How can I limit the number of messages Spark Streaming consumes from Kafka?

Upvotes: 0

Views: 645

Answers (1)

Liju John

Reputation: 1876

Did you try setting the properties below?

spark.streaming.backpressure.enabled
spark.streaming.backpressure.initialRate
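
For example, a minimal sketch using the Kafka 0.10 direct stream API; the broker address, topic name, and consumer group id are placeholders you would replace with your own:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object RateLimitedStream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("RateLimitedStream")
          // Let Spark adapt the ingestion rate to how fast batches actually finish.
          .set("spark.streaming.backpressure.enabled", "true")
          // Rate (records/sec) used for the first batches, before backpressure
          // has any processing history to estimate from.
          .set("spark.streaming.backpressure.initialRate", "3")
          // Hard upper bound in records/sec per Kafka partition. With 3
          // partitions and a 300 s batch, "1" caps a batch at 3 * 300 = 900.
          .set("spark.streaming.kafka.maxRatePerPartition", "1")

        val ssc = new StreamingContext(conf, Seconds(300)) // 5-minute batches

        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "localhost:9092",     // placeholder broker
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "rate-limited-group", // placeholder group id
          "auto.offset.reset"  -> "latest"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          PreferConsistent,
          Subscribe[String, String](Seq("my-topic"), kafkaParams)) // placeholder topic

        // Count each batch so the effective rate limit is visible in the logs.
        stream.map(_.value).foreachRDD { rdd =>
          println(s"Batch size: ${rdd.count()} records")
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

The key point is that these properties must be on the SparkConf before the StreamingContext is created (or passed with --conf to spark-submit). Also, with 3 partitions and a 5-minute batch, maxRatePerPartition=1 alone should already cap a batch at 900 records, so if you still see 13,400 the property is probably not reaching your application at all.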

Upvotes: 1
