Reputation: 2570
I'm trying to exchange serialized messages through a Kafka broker using Python 2.7 and Apache Avro (Python client). I would like to know whether there is a way to exchange messages without creating a schema beforehand.
This is the code (it uses a schema, sensor.avsc, which is what I want to avoid):
from kafka import SimpleProducer, KafkaClient
import avro.schema
import io, random
from avro.io import DatumWriter
# To send messages synchronously
kafka = KafkaClient('localhost:9092')
producer = SimpleProducer(kafka, async = False)
# Kafka topic
topic = "sensor_network_01"
# Path to the sensor.avsc Avro schema that I don't want
schema_path="sensor.avsc"
schema = avro.schema.parse(open(schema_path).read())
for i in xrange(100):
    writer = avro.io.DatumWriter(schema)
    bytes_writer = io.BytesIO()
    encoder = avro.io.BinaryEncoder(bytes_writer)
    # creation of random data
    writer.write({"sensor_network_name": "Sensor_1", "value": random.randint(0,10), "threshold_value":10 }, encoder)
    raw_bytes = bytes_writer.getvalue()
    producer.send_messages(topic, raw_bytes)
This is the sensor.avsc file:
{
    "namespace": "sensors.avro",
    "type": "record",
    "name": "Sensor",
    "fields": [
        {"name": "sensor_network_name", "type": "string"},
        {"name": "value", "type": ["int", "null"]},
        {"name": "threshold_value", "type": ["int", "null"]}
    ]
}
Upvotes: 1
Views: 2722
Reputation: 4795
This code:
import avro.schema
import io, random
from avro.io import DatumWriter, DatumReader
import avro.io
# Path to user.avsc avro schema
schema_path="user.avsc"
# The Python 2 avro package exposes parse(); avro-python3 named it Parse()
schema = avro.schema.parse(open(schema_path).read())
for i in xrange(1):
    writer = avro.io.DatumWriter(schema)
    bytes_writer = io.BytesIO()
    encoder = avro.io.BinaryEncoder(bytes_writer)
    writer.write({"name": "123", "favorite_color": "111", "favorite_number": random.randint(0,10)}, encoder)
    raw_bytes = bytes_writer.getvalue()
    print(raw_bytes)
    bytes_reader = io.BytesIO(raw_bytes)
    decoder = avro.io.BinaryDecoder(bytes_reader)
    reader = avro.io.DatumReader(schema)
    user1 = reader.read(decoder)
    print(" USER = {}".format(user1))
for dealing with this schema (user.avsc)
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
is what you need.
Credit goes to this gist
Upvotes: 4
Reputation: 29
I haven't seen anyone do this, but I have wanted it myself. You might have to write it yourself, but it shouldn't be too bad, assuming the object to serialize is simple: loop through its fields and use a map from Python types to Avro types. Nested fields will require recursion to dig into each object.
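A rough sketch of that idea, in case it helps. The function name infer_avro_schema, the choice of "long" and "double" for Python numbers, and the naming of nested records are all my own assumptions, not anything the avro package provides. It also cannot guess unions such as ["int", "null"] from a single sample, and it takes the item type of a list from its first element only.

```python
import json

# Python 2 has separate unicode/long types; Python 3 does not.
try:
    string_types = (str, unicode)
    integer_types = (int, long)
except NameError:
    string_types = (str,)
    integer_types = (int,)


def infer_avro_schema(value, name="Record"):
    """Guess an Avro schema (as plain dicts/strings) from a sample value.

    Hypothetical helper: the type mapping below is an assumption.
    """
    if value is None:
        return "null"
    # bool must be tested before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, integer_types):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, string_types):
        return "string"
    if isinstance(value, (bytes, bytearray)):
        return "bytes"
    if isinstance(value, (list, tuple)):
        if not value:
            raise ValueError("cannot infer the item type of an empty list")
        # Assumes every item has the same type as the first one
        return {"type": "array",
                "items": infer_avro_schema(value[0], name + "_item")}
    if isinstance(value, dict):
        # Recurse into each field; nested records get a derived name
        fields = [{"name": key,
                   "type": infer_avro_schema(value[key], name + "_" + key)}
                  for key in sorted(value)]
        return {"type": "record", "name": name, "fields": fields}
    raise TypeError("no Avro mapping for %r" % type(value))


sample = {"sensor_network_name": "Sensor_1", "value": 7, "threshold_value": 10}
schema_json = json.dumps(infer_avro_schema(sample, "Sensor"))
print(schema_json)
```

The resulting schema_json string could then be passed to avro.schema.parse(...) in place of the contents of sensor.avsc. Note that the consumer still needs the same schema to decode the bytes, so it has to be derived or shared on that side as well.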
Upvotes: 0