Reputation: 4506
I am working on a Spark Streaming project in which Spark gets its data from Kafka. I want to limit the number of records consumed by Spark Streaming, because there is a very large amount of data in the Kafka topic. I am using the spark.streaming.kafka.maxRatePerPartition=1
property to limit the records, but I am still getting 13400 messages per 5-minute batch. My Spark program cannot handle more than 1000 messages per 5 minutes. The Kafka topic has 3 partitions, my Spark driver has 5GB of memory, and I have 3 executors with 3GB each. How can I limit the messages consumed from Kafka in Spark Streaming?
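For reference, below is a minimal sketch of how I understand the setup should look, assuming the kafka-0-10 direct stream API (broker address, topic name, and group id are placeholders). Since maxRatePerPartition is records per second per partition, I would expect a cap of 1 × 3 partitions × 300 s = 900 records per 5-minute batch. Note that the property must be set on the SparkConf before the StreamingContext is created, otherwise it has no effect:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}
import org.apache.spark.streaming.kafka010._
import org.apache.kafka.common.serialization.StringDeserializer

object RateLimitedStream {
  def main(args: Array[String]): Unit = {
    // The rate limit must be on the SparkConf *before* the
    // StreamingContext is created; setting it later has no effect.
    val conf = new SparkConf()
      .setAppName("rate-limited-kafka-stream")
      .set("spark.streaming.kafka.maxRatePerPartition", "1") // records/sec per partition

    // 5-minute batches: expected cap = 1 rec/s * 3 partitions * 300 s = 900 records
    val ssc = new StreamingContext(conf, Minutes(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker:9092",           // placeholder
      "key.deserializer"  -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"          -> "my-consumer-group",     // placeholder
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("my-topic"), kafkaParams)
    )

    // Log the actual batch size to verify the cap is applied.
    stream.foreachRDD { rdd =>
      println(s"Batch size: ${rdd.count()} records")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```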
Upvotes: 0
Views: 645
Reputation: 1876
Did you try setting the properties below?
spark.streaming.backpressure.enabled
spark.streaming.backpressure.initialRate
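A minimal sketch of how these could be set (the rate values here are illustrative, not tuned for your workload):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}

// Backpressure lets Spark throttle ingestion based on how fast previous
// batches were actually processed. initialRate only bounds the first
// batch, before any processing statistics exist.
val conf = new SparkConf()
  .setAppName("backpressure-example")
  .set("spark.streaming.backpressure.enabled", "true")
  .set("spark.streaming.backpressure.initialRate", "100") // records/sec, illustrative
  // maxRatePerPartition still acts as a hard upper bound when
  // backpressure is enabled, so keep it if you need a strict cap.
  .set("spark.streaming.kafka.maxRatePerPartition", "1")

val ssc = new StreamingContext(conf, Minutes(5))
```

With backpressure enabled, maxRatePerPartition remains the upper bound on the rate, so the two settings can be combined.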
Upvotes: 1