denis

Some of the Kafka partitions have a lot of lag

Good afternoon, everyone. I'm setting up ClickHouse to read data from a Kafka topic that consists of 24 partitions. ClickHouse uses a table with the Kafka engine and the following settings:

CREATE TABLE logs.log_kafka_v2 (
    `_partition` UInt64,
    `_offset` UInt64,
    `user_id` String,
    `content_id` String,
    `scenario` String,
    `node_context_key` String,
    `is_result_node` String,
    `content` String,
    `score` String,
    `position` Int8,
    `request_id` String,
    `response_ts` String
) ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'kafka-navigator-prod:443',
    kafka_topic_list = 'log',
    kafka_group_name = 'log-click-consumer-test',
    kafka_format = 'JSONEachRow',
    kafka_thread_per_consumer = 1,
    kafka_num_consumers = 24,
    kafka_poll_max_batch_size = 800000,
    kafka_commit_every_batch = 1,
    kafka_poll_timeout_ms = 2200;

At the moment, this configuration reads data from 18 partitions at a good speed, but several partitions (6-8) have a very large lag, as you can see from the screenshot.

[Grafana screenshot]

My topic receives about 400,000 - 500,000 messages per second. Has anyone encountered this situation? What settings should be changed so that no partition lags? Thanks in advance!

I've tried different settings for the database and for ClickHouse itself. The database is hosted in the cloud as SaaS, and Kafka runs on Kubernetes.
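
To help narrow this down, a query along the following lines might show how the 24 partitions are distributed among the table's consumers and whether some consumers have stopped polling. This is only a sketch: it assumes the system.kafka_consumers table is available (it exists in recent ClickHouse releases, but I have not confirmed it for this SaaS deployment), and the database and table names are taken from the CREATE TABLE statement above.

```sql
-- Assumption: system.kafka_consumers exists on this server (recent ClickHouse releases).
-- One row per consumer of the Kafka engine table.
SELECT
    consumer_id,
    assignments.partition_id   AS assigned_partitions,  -- partitions owned by this consumer
    assignments.current_offset AS current_offsets,      -- current offset for each of those partitions
    last_poll_time,                                     -- a stale value suggests an idle consumer
    num_messages_read,
    last_rebalance_time
FROM system.kafka_consumers
WHERE database = 'logs'
  AND table = 'log_kafka_v2'
ORDER BY consumer_id;
```

If a few consumers turn out to own several partitions while others own none, or the lagging partitions all belong to consumers with an old last_poll_time, that would point at partition assignment rather than raw throughput.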

