GionJh

Reputation: 2894

Difference between kafka batch and kafka request

I was not able to find a satisfactory answer anywhere; sorry if this question looks trivial:

In Kafka, on the producer side, can a request contain multiple batches for different partitions?
I see the words "batch" and "request" used as synonyms in the documentation, and I was hoping to find some clarity on this.

If yes, how does this affect the ack policy?
Are acks on a per-batch or a per-request basis?

Upvotes: 2

Views: 1230

Answers (2)

Mickael Maison

Reputation: 26885

A Kafka request (and response) is a message sent over the network between a Kafka client and broker. The Kafka protocol uses many types of requests; you can find them all in the Kafka protocol documentation.

The Produce and Fetch requests are used to exchange records. They both contain Kafka batches, in the RECORDS field of the protocol description. A Kafka batch groups several records together and saves some bytes by sharing metadata across all of its records. You can find the exact format of a batch in the documentation.

TLDR: Requests/responses are the full messages exchanged between Kafka clients and brokers. Some requests contain Kafka batches that are groups of records.
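To make the distinction concrete, here is a toy Python model of a Produce request. It is a sketch, not real client code: the class and function names are invented for illustration, and it assumes the layout described in the protocol documentation, where acks is set once on the request, each partition carries its own batch, and the response reports a result per partition.

```python
from dataclasses import dataclass, field


@dataclass
class RecordBatch:
    """Toy stand-in for a Kafka record batch: records bound for ONE partition."""
    records: list


@dataclass
class ProduceRequest:
    """Toy stand-in for a Produce request: one network message to one broker."""
    acks: int  # set once for the whole request: -1 (all), 0 or 1
    # (topic, partition) -> the batch for that partition
    batches: dict = field(default_factory=dict)


def build_request(pending, acks):
    """Pack every pending per-partition batch into a single request."""
    request = ProduceRequest(acks=acks)
    for topic_partition, records in pending.items():
        request.batches[topic_partition] = RecordBatch(records=list(records))
    return request


def build_response(request):
    """Toy broker reply: one result entry per partition in the request."""
    return {tp: "NONE" for tp in request.batches}  # "NONE" means no error


# Two partitions of a hypothetical "orders" topic travel in one request.
pending = {("orders", 0): ["a", "b"], ("orders", 1): ["c"]}
request = build_request(pending, acks=-1)
response = build_response(request)
```

In this model the acks level is chosen per request, while success or failure comes back per partition, so one batch in a request can fail while another succeeds.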

Upvotes: 1

Fares

Reputation: 650

I'm not sure whether you are asking about the producer or the consumer side. Here is some information that might answer your question.

On the producer side:

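By default, the Kafka producer accumulates records into batches of up to 16 KB (batch.size), with each batch holding records for a single partition. Also by default, the producer keeps up to 5 requests in flight per broker connection (max.in.flight.requests.per.connection), and a single request can carry batches for several partitions led by that broker. Meanwhile, the producer starts accumulating data for the next batches.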
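The accumulation step can be sketched roughly as follows. This is a simplified Python model, not the real producer: the accumulate function is hypothetical, and it ignores per-record overhead, compression and linger.ms. It only shows records being grouped per partition, with a new batch started once the size limit would be exceeded.

```python
BATCH_SIZE = 16 * 1024  # the batch.size default quoted above, in bytes


def accumulate(records, batch_size=BATCH_SIZE):
    """Group (partition, payload) pairs into per-partition batches.

    A new batch is started for a partition when adding the next payload
    would push the current batch past batch_size.
    """
    open_batches = {}  # partition -> (payloads, total bytes so far)
    closed = []        # finished (partition, payloads) batches
    for partition, payload in records:
        batch, size = open_batches.get(partition, ([], 0))
        if batch and size + len(payload) > batch_size:
            closed.append((partition, batch))
            batch, size = [], 0
        batch.append(payload)
        open_batches[partition] = (batch, size + len(payload))
    # Flush whatever is still open at the end.
    closed.extend((p, b) for p, (b, _) in open_batches.items())
    return closed


records = [(0, b"x" * 10_000), (1, b"y" * 100), (0, b"z" * 10_000)]
batches = accumulate(records)
```

Here partition 0 ends up with two batches, because the two 10,000-byte payloads do not fit into 16 KB together, while partition 1 gets a single small batch.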

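The acks config controls how many replica acknowledgments the partition leader requires before a request is considered successful: none (acks=0), only the leader's own write (acks=1), or all in-sync replicas (acks=all).

On the consumer side: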

By default, each call to the consumer's poll() returns at most 500 records (max.poll.records).

Also by default, the Kafka consumer auto-commits offsets every 5 seconds (auto.commit.interval.ms).

This means that, roughly every 5 seconds, the consumer commits the offsets of all the records returned by its previous calls to poll().
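A rough way to picture the consumer behavior is the toy Python model below. ToyConsumer is a hypothetical class, not the Kafka client API, and the clock is passed in by hand. It assumes the auto-commit check runs inside poll() and commits the position reached by earlier polls.

```python
MAX_POLL_RECORDS = 500          # max.poll.records default
AUTO_COMMIT_INTERVAL_MS = 5000  # auto.commit.interval.ms default


class ToyConsumer:
    """Simplified single-partition consumer with time-based auto-commit."""

    def __init__(self, log):
        self.log = log            # the records available in the partition
        self.position = 0         # next offset to hand out
        self.committed = 0        # last committed offset
        self.last_commit_ms = 0

    def poll(self, now_ms):
        # Commit what earlier polls returned once the interval has elapsed.
        if now_ms - self.last_commit_ms >= AUTO_COMMIT_INTERVAL_MS:
            self.committed = self.position
            self.last_commit_ms = now_ms
        batch = self.log[self.position:self.position + MAX_POLL_RECORDS]
        self.position += len(batch)
        return batch


consumer = ToyConsumer(log=list(range(1200)))
first = consumer.poll(now_ms=0)      # 500 records, nothing committed yet
second = consumer.poll(now_ms=1000)  # 500 more, still inside the interval
third = consumer.poll(now_ms=5000)   # interval elapsed: offset 1000 committed
```

In this model the records handed out by the third poll are only committed by a later poll, once another 5 seconds have passed.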

Hope this helps!

Upvotes: 0
