Reputation: 197
I am new to Avro and have an Avro file to deserialize. Some of the schemas use the fixed type to store MAC addresses. The schema below is one of them, and it is used as a type inside other schemas.
The schema for MAC addresses looks like this:
{
  "type": "fixed",
  "name": "MacAddress",
  "size": 6
}
I wrote the first record of the data to a text file using:
from avro.datafile import DataFileReader
from avro.io import DatumReader

reader = DataFileReader(open("data.avro", "rb"), DatumReader())
count = 0
for record in reader:
    if count == 0:
        with open('first_record.txt', 'w') as first_record:
            first_record.write(str(record))
    elif count > 0:
        break
    count = count + 1
reader.close()
The MAC addresses mentioned above appear in the deserialized data like this:
"MacAddress":"b""\\x36\\xe9\\xad\\x64\\x2d\\x3d",
I know that \x means the following is a hexadecimal value. So this is supposed to be "36:e9:ad:64:2d:3d", right? Are "b""" style values the expected output for fixed types?
Also, some values look like this:
"Addr":"b""j\\x26\\xb7\\xda\\x1d\\xf6"
"Addr":"b""\\x28\\xcb\\xc5v\\x14%"
How can these be MAC addresses? What do the j and % characters mean?
Upvotes: 1
Views: 760
Reputation: 2074
Are "b""" style values the expected output for fixed types?
Yes. Fixed types represent bytes, and in Python a bytes value is displayed with a b prefix before the string. It looks like you have a lot of extra quotes in there, and I'm guessing that's because you are doing things like str(record), which is probably causing the extra backslashes and quote characters. For example:
>>> str(b"\xae")
"b'\\xae'"
How can these be MAC addresses? What do the j and % characters mean?
Are you sure these are the same record type? The key is Addr instead of MacAddress, so it seems like it might be a different record type and schema.
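Whichever schema Addr belongs to, the j, v and % are not corruption. Python's repr of a bytes object shows any byte in the printable ASCII range as the character itself and only falls back to \xNN for the rest, so j is simply the byte 0x6a. If you want a readable address rather than the bytes repr, you can format the raw bytes yourself instead of calling str() on them. A minimal sketch, assuming the fixed field arrives as a 6-byte Python bytes object (format_mac is a hypothetical helper name, not part of the avro library):

```python
def format_mac(raw: bytes) -> str:
    """Render raw bytes as colon-separated lowercase hex pairs."""
    return ":".join(f"{byte:02x}" for byte in raw)


# Values copied from the question.
addr1 = b"j\x26\xb7\xda\x1d\xf6"
addr2 = b"\x28\xcb\xc5v\x14%"

# 'j' is just the byte 0x6a; 'v' is 0x76 and '%' is 0x25.
assert addr1[0] == ord("j") == 0x6A
assert addr2[3] == ord("v") == 0x76
assert addr2[5] == ord("%") == 0x25

mac1 = format_mac(addr1)
mac2 = format_mac(addr2)
print(mac1)  # 6a:26:b7:da:1d:f6
print(mac2)  # 28:cb:c5:76:14:25
```

Both values are still exactly 6 bytes long, so they fit the fixed size of 6 regardless of how they are displayed.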
Upvotes: 2