MatanRubin

Reputation: 1085

Deserializing Java org.apache.kafka.common.serialization serialized objects with Python

I have a Java Kafka producer that uses org.apache.kafka.common.serialization.LongSerializer as the key serializer and I'm trying to consume messages from the topic using a Python Kafka consumer.

I thought that since LongSerializer is part of org.apache.kafka, an equivalent serializer and deserializer would be available in all official Kafka clients for other languages, to promote interoperability. However, I couldn't find it.

So, are people supposed to use org.apache.kafka.common.serialization only for projects which are pure JVM, or is there some other way to deserialize these objects using Python?

I feel like I'm missing something because I find it hard to believe Kafka provides serializers and deserializers out of the box which do not promote communication between processes written in different languages...

Upvotes: 2

Views: 775

Answers (1)

itmightgetloud

Reputation: 31

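If anyone still needs an answer a year after it was asked, you can just re-implement Java's LongDeserializer in Python. LongSerializer writes the value as exactly 8 bytes: a signed 64-bit integer in big-endian byte order (two's complement).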

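    def long_deserializer(data: bytes):
      # A null key or value arrives as None; Java's LongDeserializer returns null for it
      if data is None:
        return None
      if len(data) != 8:
        raise ValueError(f"Size of data received by long_deserializer is not 8. Received {len(data)}")

      # Accumulate the bytes most significant first (big-endian)
      value = 0
      for b in data:
        value = (value << 8) | b

      # Java longs are signed two's complement: a set top bit means a negative value
      if value >= 1 << 63:
        value -= 1 << 64

      return value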

The example usage would be as follows:

  • Input:

b'\x00\x00\x00\x00\x00\x00\x04\xd2'

  • Significant bytes (the leading six are zero):

\x04\xd2

  • Each hex digit in binary:

0=0000, 4=0100, d=1101, 2=0010

  • Result (leading zeros dropped):

10011010010

  • Translate to int (the return value):

10011010010 = 1234
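The decoding walked through above can be checked directly in Python. This is only a small sketch: `decode_long` is a hypothetical helper name, and it uses the standard `struct` module with the `>q` format (big-endian signed 64-bit), which is the layout LongSerializer produces.

```python
import struct


def decode_long(data: bytes) -> int:
    """Decode 8 big-endian bytes as a signed 64-bit integer."""
    (value,) = struct.unpack(">q", data)
    return value


payload = b"\x00\x00\x00\x00\x00\x00\x04\xd2"

# The two significant bytes, shown bit by bit
print(format(payload[-2], "08b"), format(payload[-1], "08b"))  # 00000100 11010010

print(decode_long(payload))                    # 1234
print(bin(decode_long(payload)))               # 0b10011010010
print(decode_long(b"\xff" * 8))                # -1
print(decode_long(struct.pack(">q", -1234)))   # -1234
```

The last two lines show why the sign matters: any key with the top bit set is a negative long on the Java side.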

Also, if you want a more "pythonic" way, you can use the built-in int.from_bytes, which handles the sign for you and works for zero as well:

int.from_bytes(data, byteorder='big', signed=True)
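To plug this into a consumer, one option is to pass the function as the key deserializer. The sketch below assumes the kafka-python package, whose `KafkaConsumer` accepts a `key_deserializer` callable that receives the raw key bytes; the question does not say which client is in use, so treat that as an assumption. The topic name, broker address, and the helper names `long_key_deserializer` and `build_consumer` are made up for illustration.

```python
from typing import Optional


def long_key_deserializer(data: Optional[bytes]) -> Optional[int]:
    """Decode a key written by Java's LongSerializer (8 bytes, big-endian, signed)."""
    if data is None:  # null key
        return None
    if len(data) != 8:
        raise ValueError(f"Expected 8 bytes, received {len(data)}")
    return int.from_bytes(data, byteorder="big", signed=True)


def build_consumer(topic: str = "my-topic", servers: str = "localhost:9092"):
    """Create a consumer that decodes long keys (topic and servers are placeholders)."""
    # Imported lazily so the deserializer can be used and tested without kafka-python
    from kafka import KafkaConsumer  # assumption: kafka-python is installed

    return KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        key_deserializer=long_key_deserializer,
    )
```

With a consumer built this way, each record's `key` attribute would already be a Python `int` (or `None` for a null key), so no decoding is needed in the polling loop.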

Upvotes: 3
