D P

Reputation: 153

Parsing fails in Python for nested Avro structure

I have an Avro schema that expects an array of records. I have created the schema, but writing my data against it fails.

Avro schema

{
  "namespace": "com",
  "type": "record",
  "name": "customers",
  "fields": [
    {
      "name": "customer",
      "type": {
        "type": "array",
        "items": {
          "name": "cust",
          "type": "record",
          "fields": [
            {
              "name": "age",
              "type": ["long","null"]
            },
            {
              "name": "amount",
              "type": [ "long","null"]
            }
          ]
        }
      }
    }
  ]
}

Python Code

rows = [[34, 2000], [53, 8000]]

for l in rows:
    writer.append({"customer": {"age": long(l[0]), "amount": long(l[1])}})

Is my parsing wrong? Should I add a datum object to the array?

Upvotes: 0

Views: 725

Answers (2)

D P

Reputation: 153

Figured it out. My Avro schema was fine. The only thing I had to change was the way I add the objects to the writer: the customer field has to be a list of cust dicts.

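rows = [[34, 2000], [53, 8000]]
customer = []  # the schema's "customer" field is an array, so use a list
for l in rows:
    # build a fresh dict per row so earlier entries are not overwritten
    customer.append({"age": l[0], "amount": l[1]})
# append one "customers" record holding the whole array
writer.append({"customer": customer})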

Upvotes: 0

Scott

Reputation: 2074

Your schema defines the customers record as having an array of cust records. Therefore, your data should be structured like so:

{"customer": [cust1, cust2, ...]}

and to expand further:

{"customer": [{"age": X1, "amount": Y1}, {"age": X2, "amount": Y2}, ...]}
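As a rough illustration of that shape, here is a minimal sketch in plain Python. The helper names build_datum and looks_like_customers are hypothetical, and the check is a hand-written approximation of the schema from the question, not Avro's own validation.

```python
rows = [[34, 2000], [53, 8000]]


def build_datum(rows):
    # One "customers" record: the "customer" field is a list of "cust" dicts.
    return {"customer": [{"age": age, "amount": amount} for age, amount in rows]}


def looks_like_customers(datum):
    # Hypothetical shape check mirroring the schema; not Avro's validator.
    customer = datum.get("customer")
    if not isinstance(customer, list):
        return False
    return all(
        isinstance(c, dict)
        and set(c) == {"age", "amount"}
        and all(v is None or isinstance(v, int) for v in c.values())
        for c in customer
    )


datum = build_datum(rows)
wrong = {"customer": {"age": 34, "amount": 2000}}  # shape used in the question

print(looks_like_customers(datum))  # True
print(looks_like_customers(wrong))  # False
```

The datum built here is what would be passed to writer.append, assuming writer is the file writer from the question opened with the original schema.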

So you can keep your schema as is, but you will need to change the data you are inserting to match the format above. Alternatively, you could keep your data as is, but you would need to change the schema to the following:

{
  "namespace": "com",
  "type": "record",
  "name": "customers",
  "fields": [
    {
      "name": "customer",
      "type": {
        "name": "cust",
        "type": "record",
        "fields": [
          {
            "name": "age",
            "type": ["long","null"]
          },
          {
            "name": "amount",
            "type": [ "long","null"]
          }
        ]
      }
    }
  ]
}
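With this alternative schema, each append writes a separate customers record holding a single cust, so the two input rows would produce two records instead of one. A minimal sketch of the matching data, assuming writer is the file writer from the question opened with the schema above:

```python
rows = [[34, 2000], [53, 8000]]

# One datum per row: "customer" is a single record (dict), not a list.
data = [{"customer": {"age": age, "amount": amount}} for age, amount in rows]

for datum in data:
    print(datum)
    # writer.append(datum)  # assumption: writer as in the question
```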

Upvotes: 1
