Igor Abdrakhimov

Reputation: 118

Multiple consumers on a single Kafka topic are slow

I have a Kafka cluster on 3 servers. There is a topic with one partition and 3 replicas. The average message size is about 200 bytes.

I want multiple consumers (i.e., each with a different group ID) to read from the topic, so that each consumer receives all the data.

The problem is that each new consumer is slower than the previous one, so after adding about 20 consumers, new consumers are very slow.

The following table shows the problem:

topic        consumer    current offset
topic-0      group-1     4191232
topic-0      group-4     3860979
topic-0      group-2     3799224
topic-0      group-12    2112518
topic-0      group-7     1984491
topic-0      group-3     1842349
topic-0      group-6     1695504
topic-0      group-11    1388133
topic-0      group-5     1383794
topic-0      group-19    1242424
topic-0      group-16    941960 
topic-0      group-14    876551 
topic-0      group-22    837359 
topic-0      group-21    828698 
topic-0      group-13    811273 
topic-0      group-26    716414 
topic-0      group-9     699175 
topic-0      group-18    621772 
topic-0      group-15    617520 
topic-0      group-17    613233 
topic-0      group-10    388891 
topic-0      group-8     328258 
topic-0      group-24    233805 
topic-0      group-29    131299 
topic-0      group-23    84658  
topic-0      group-20    80492  
topic-0      group-27    63527  
topic-0      group-25    50720  
topic-0      group-28    46474  
topic-0      group-30    37958  

These consumers were started at almost the same time, and this state was captured after about 20 seconds. group-1 read 4.19 million records, while group-30 read only 37958 records.

The distribution of consumers differs from run to run, but there are always slow consumers.

I've tried running the consumers on dedicated servers and locally on the Kafka cluster; the situation didn't change.

Log messages on slow consumers show that the round-trip time is high, sometimes more than a second:
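To put the gap in perspective, the offsets can be turned into a rough per-consumer throughput. This is only a back-of-the-envelope sketch: it assumes every consumer started from offset 0, uses the 200-byte average message size and the approximate 20-second window mentioned above, and ignores protocol overhead. The helper name throughput_mb_per_s is made up for illustration.

```python
# Rough per-consumer payload throughput from the offsets in the table.
AVG_MESSAGE_BYTES = 200   # average message size stated in the question
ELAPSED_SECONDS = 20      # approximate capture time stated in the question

# A few rows copied from the table (fastest, a middle one, slowest).
offsets = {
    "group-1": 4191232,
    "group-12": 2112518,
    "group-30": 37958,
}

def throughput_mb_per_s(records, avg_bytes=AVG_MESSAGE_BYTES, seconds=ELAPSED_SECONDS):
    """Approximate payload throughput in MB/s (1 MB = 10**6 bytes)."""
    return records * avg_bytes / seconds / 1e6

for group, offset in offsets.items():
    print(f"{group}: {throughput_mb_per_s(offset):.2f} MB/s")
```

Under these assumptions the fastest consumer pulls on the order of 40 MB/s while the slowest gets well under 1 MB/s, which is a difference of about two orders of magnitude.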

kafka3:9092/3: Sent FetchRequest (v4, 93 bytes @ 0, CorrId 36322)
kafka3:9092/3: Received FetchResponse (v4, 1048636 bytes, CorrId 36322, rtt 747.24ms)

This problem is reproducible with both the Kafka console consumer and librdkafka, so I think something is wrong with the brokers.

I've set the num.io.threads and num.network.threads parameters in the broker configs to 32, but it didn't help. Other parameters are at their defaults.

Any help will be appreciated.

UPDATE 1

A log message for a slow consumer on the broker shows that the problem is definitely on the broker side:

[2018-03-07 12:58:42,787] DEBUG Completed request:RequestHeader(apiKey=OFFSET_COMMIT, apiVersion=1, clientId=rdkafka, correlationId=376) -- {group_id=group-12,generation_id=13,member_id=rdkafka-5c08ffd4,topics=[{topic=test-topic,partitions=[{partition=0,offset=651909,timestamp=-1,metadata=}]}]},response:{responses=[{topic=test-topic,partition_responses=[{partition=0,error_code=0}]}]} from connection kafka3:9092-client12:37884-10;totalTime:1547.433,requestQueueTime:0.104,localTime:0.631,remoteTime:1546.48,throttleTime:0.019,responseQueueTime:0.046,sendTime:0.15,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT (kafka.request.logger)

remoteTime is about 1.5 seconds (1546.48 ms out of a totalTime of 1547.433 ms).

So the question is: where should I look on the broker side to resolve the problem?
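For anyone who wants to check this across many request log lines, here is a minimal sketch that pulls the timing fields out of a line and reports which one dominates. It only parses the text format shown in the log line above; the function name parse_timings is hypothetical, and the regular expression is an assumption based on that single sample. As far as I understand, remoteTime is time the request spends waiting on something outside the local request handler (for example, replication to followers), but treat that reading as an assumption too.

```python
import re

# Timing portion of the request log line quoted above.
log_tail = ("totalTime:1547.433,requestQueueTime:0.104,localTime:0.631,"
            "remoteTime:1546.48,throttleTime:0.019,responseQueueTime:0.046,"
            "sendTime:0.15")

def parse_timings(text):
    """Return {field: milliseconds} for every '<name>Time:<number>' pair."""
    return {name: float(value)
            for name, value in re.findall(r"(\w+Time):([\d.]+)", text)}

timings = parse_timings(log_tail)
total = timings.pop("totalTime")          # keep only the component timings
dominant = max(timings, key=timings.get)  # component with the largest share
print(f"{dominant} accounts for {timings[dominant] / total:.1%} of totalTime")
```

Running the same breakdown over FETCH requests as well as OFFSET_COMMIT requests should show whether the time is spent in the same phase for both.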

Upvotes: 2

Views: 2134

Answers (1)

Igor Abdrakhimov

Reputation: 118

The problem is that the consumers saturate all available network bandwidth on the broker server.

Kafka probably sends responses to consumers in some fixed order (by connection time, as far as I can see). So we get a few very fast consumers and a bunch of consumers with reasonable speed. The other consumers are slow, and only disconnecting the "fast" consumers seems to help them.
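A quick sanity check of the bandwidth explanation is to add up what all 30 consumers read in the 20-second window. With a single partition, presumably all of them fetch from the same leader broker, so the sum approximates that broker's outbound traffic. This is a rough sketch: it assumes all consumers started at offset 0 and uses the 200-byte average from the question. The link speed ASSUMED_LINK_GBIT is a placeholder, since the actual NIC speed is not stated anywhere above, and the function names are made up for illustration.

```python
AVG_MESSAGE_BYTES = 200   # average message size from the question
ELAPSED_SECONDS = 20      # approximate capture time from the question
ASSUMED_LINK_GBIT = 10    # hypothetical NIC speed, NOT stated in the question

# Current offsets of all 30 consumer groups, copied from the table in the question.
offsets = [
    4191232, 3860979, 3799224, 2112518, 1984491, 1842349, 1695504, 1388133,
    1383794, 1242424, 941960, 876551, 837359, 828698, 811273, 716414,
    699175, 621772, 617520, 613233, 388891, 328258, 233805, 131299,
    84658, 80492, 63527, 50720, 46474, 37958,
]

def aggregate_egress_mb_per_s(records_per_group, avg_bytes=AVG_MESSAGE_BYTES,
                              seconds=ELAPSED_SECONDS):
    """Approximate total payload sent to all consumers, in MB/s."""
    return sum(records_per_group) * avg_bytes / seconds / 1e6

def link_utilization(egress_mb_per_s, link_gbit):
    """Fraction of the link used (1 Gbit/s = 125 MB/s)."""
    return egress_mb_per_s / (link_gbit * 1000 / 8)

egress = aggregate_egress_mb_per_s(offsets)
print(f"aggregate egress: {egress:.0f} MB/s")
print(f"utilization of an assumed {ASSUMED_LINK_GBIT} Gbit/s link: "
      f"{link_utilization(egress, ASSUMED_LINK_GBIT):.0%}")
```

Swapping in the real link speed and comparing the result with the interface counters on the broker would confirm or refute the saturation theory.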

Upvotes: 1
