Max0u

Reputation: 720

High latency PyKafka

We are building a real-time product that receives images and sends back the associated information. For scalability, we decided to use Kafka to balance the workload among Kubernetes nodes.

Frontend -> MainWorker -> Kafka (1) -> Worker -> Kafka (2) -> MainWorker -> Frontend

For some reason, we see an unexpected 80-100 ms of latency between producer and consumer for Kafka (1), and the same for Kafka (2).

The latency is the same whether the system is deployed in the cloud or locally.

Producer 1

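import json
from pykafka import KafkaClient

client = KafkaClient(hosts=kafka_url)
topic = client.topics['frames']
# min_queued_messages=1: flush as soon as a single message is queued.
producer = topic.get_producer(min_queued_messages=1)
# Serialize the frame and its metadata as compact JSON.
data = {"frame": message, "uid": self.user, "size": len(message), "time": str(start)}
data = json.dumps(data, separators=(',', ':'))
data = data.encode()
producer.produce(data)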

Producer/Consumer 2

import datetime
import json
import time

consumer = topic.get_balanced_consumer(
    consumer_group='detection',
    zookeeper_connect=zookeeper_url,
    auto_commit_enable=True,
    reset_offset_on_start=True,
)
while True:
    message = consumer.consume()
    if message is None:
        continue
    start = time.time()
    message = json.loads(message.value.decode('utf-8'))
    # Time elapsed since the producer stamped the message.
    before = datetime.datetime.strptime(message["time"], '%Y-%m-%d %H:%M:%S.%f')
    after = datetime.datetime.now()
    print(f"Bouncing 1: {int((after - before).total_seconds() * 1000)}ms")
    ...
    producer.produce(response_json.encode('utf-8'))
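
The measurement itself relies on a string timestamp that is formatted on one side and parsed on the other. To rule out the measurement as a source of error, the same check can be sketched with epoch floats. This is only a sketch: `build_payload`, `latency_ms`, and the `sent_at` field are hypothetical names that are not part of our code, the frame is assumed to be a JSON-serializable string, and the result is only meaningful if the producer and consumer clocks agree.

```python
import json
import time


def build_payload(frame: str, uid: str) -> bytes:
    """Serialize a frame with an epoch send time (hypothetical helper)."""
    data = {"frame": frame, "uid": uid, "size": len(frame), "sent_at": time.time()}
    return json.dumps(data, separators=(",", ":")).encode("utf-8")


def latency_ms(raw: bytes, now=None) -> float:
    """Return the milliseconds elapsed since the payload's 'sent_at' stamp."""
    message = json.loads(raw.decode("utf-8"))
    # 'now' can be injected for testing; otherwise use the current wall clock.
    received_at = time.time() if now is None else now
    return (received_at - message["sent_at"]) * 1000.0


if __name__ == "__main__":
    payload = build_payload("abc", "user-1")
    print(f"Bouncing 1: {latency_ms(payload):.1f}ms")
```

In the consumer loop, `latency_ms(message.value)` would replace the `strptime` arithmetic, which also avoids any dependence on the exact timestamp format.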

We have tried a few things to reduce the latency, but none of them seems to have any effect. Both the producer and the consumer need the lowest possible latency; we don't care about throughput.
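
For reference, the settings that look most relevant to a latency-over-throughput setup can be sketched as below. This is not a confirmed fix. It assumes the installed PyKafka version accepts `linger_ms` on the producer and `fetch_min_bytes` / `fetch_wait_max_ms` on the balanced consumer, and the value `10` is purely illustrative. `make_low_latency_clients` is a hypothetical helper name.

```python
# Candidate low-latency keyword arguments (assumptions; verify each against
# the documentation of the PyKafka version in use).
PRODUCER_KWARGS = {
    "min_queued_messages": 1,  # flush as soon as one message is queued
    "linger_ms": 0,            # do not linger waiting for a batch to fill
}
CONSUMER_KWARGS = {
    "fetch_min_bytes": 1,      # broker may answer once any data is available
    "fetch_wait_max_ms": 10,   # illustrative upper bound on the fetch wait
}


def make_low_latency_clients(topic, consumer_group, zookeeper_url):
    """Build a producer and a balanced consumer from the kwargs above."""
    producer = topic.get_producer(**PRODUCER_KWARGS)
    consumer = topic.get_balanced_consumer(
        consumer_group=consumer_group,
        zookeeper_connect=zookeeper_url,
        auto_commit_enable=True,
        reset_offset_on_start=True,
        **CONSUMER_KWARGS,
    )
    return producer, consumer
```

Keeping the keyword arguments in plain dictionaries makes it easy to change one setting at a time and re-measure the "Bouncing" latency after each change.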

Upvotes: 4

Views: 97

Answers (0)
