hchong

Reputation: 45

How to convert nested JSON data to CSV using python?

I have a file consisting of an array containing over 5000 objects. However, I am having trouble converting one particular part of my JSON file into the appropriate columns in CSV format.

Below is an example version of my data file:

{
  "Result": {
    "Example 1": {
      "Type1": [
        {
          "Owner": "Name1 Example",
          "Description": "Description1 Example",
          "Email": "[email protected]",
          "Phone": "(123) 456-7890"
        }
      ]
    },
    "Example 2": {
      "Type1": [
        {
          "Owner": "Name2 Example",
          "Description": "Description2 Example",
          "Email": "[email protected]",
          "Phone": "(111) 222-3333"
        }
      ]
    }
  }
}

Here is my current code:

import csv
import json

json_file='example.json'
with open(json_file, 'r') as json_data:
    x = json.load(json_data)

f = csv.writer(open("example.csv", "w"))

f.writerow(["Address","Type","Owner","Description","Email","Phone"])

for key in x["Result"]:
    type = "Type1"
    f.writerow([key,
                type,
                x["Result"][key]["Type1"]["Owner"],
                x["Result"][key]["Type1"]["Description"],
                x["Result"][key]["Type1"]["Email"],
                x["Result"][key]["Type1"]["Phone"]])

My problem is that I'm encountering this issue:

Traceback (most recent call last):
  File "./convert.py", line 18, in <module>
    x["Result"][key]["Type1"]["Owner"],
TypeError: list indices must be integers or slices, not str

When I replace the last key, such as "Owner", with an integer index, I get this error instead: IndexError: list index out of range.

When I change only the f.writerow call to

f.writerow([key,
                type,
                x["Result"][key]["Type1"]])

I get output, but the whole list is merged into a single column, which makes sense. Picture of the output: https://i.sstatic.net/p3qcH.jpg

I would like the results to be separated based on the label into individual columns instead of being merged into one. Could anyone assist?

Thank you!

Upvotes: 1

Views: 260

Answers (3)

Will

Reputation: 1541

Type1 in your data structure is a list, not a dict. So you need to iterate over it instead of referencing by key.

for key in x["Result"]:
    # key is now "Example 1" etc.
    type1 = x["Result"][key]["Type1"]
    # type1 is a list, not a dict, so read the fields from each item in it
    for item in type1:
        f.writerow([key,
                    "Type1",
                    item["Owner"],
                    item["Description"],
                    item["Email"],
                    item["Phone"]])

The inner for loop means you no longer depend on the assumption that "Type1" only ever has one item in the list.
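For reference, here is a minimal self-contained sketch of the same idea: loop over each address, then over every object in its "Type1" list. The sample data mirrors the question's structure, but the email values, the FIELDS list, and the build_rows helper are placeholders made up for illustration. It writes to an in-memory buffer so it runs without any files.

```python
import csv
import io

# Placeholder data shaped like the question's file; the email values are made up.
data = {
    "Result": {
        "Example 1": {"Type1": [{
            "Owner": "Name1 Example",
            "Description": "Description1 Example",
            "Email": "name1@example.com",
            "Phone": "(123) 456-7890",
        }]},
        "Example 2": {"Type1": [{
            "Owner": "Name2 Example",
            "Description": "Description2 Example",
            "Email": "name2@example.com",
            "Phone": "(111) 222-3333",
        }]},
    }
}

FIELDS = ["Owner", "Description", "Email", "Phone"]


def build_rows(result):
    """Return one row per object found in each "Type1" list."""
    rows = []
    for key, value in result.items():
        for item in value["Type1"]:  # the list may hold more than one object
            rows.append([key, "Type1"] + [item[field] for field in FIELDS])
    return rows


rows = build_rows(data["Result"])

# In-memory buffer for the demo; with a real file, open it with newline="".
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Address", "Type"] + FIELDS)
writer.writerows(rows)
print(out.getvalue())
```

If a "Type1" list ever contains several objects, each one simply becomes its own row with the same address in the first column.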

Upvotes: 2

hchong

Reputation: 45

Figured it out!

I changed the f.writerow function to the following:

for key in x["Result"]:
    type = "Type1"
    f.writerow([key,
                type,
                x["Result"][key]["Type1"][0]["Owner"],
                x["Result"][key]["Type1"][0]["Email"],
                # ...
                ])

This allowed me to reference the keys within the object. Hopefully this helps someone down the line!
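One caveat: indexing with [0] assumes every "Type1" list has at least one element. Whether the real 5000-object file contains empty lists is not known from the post, so treat the following as a hedged sketch only. The first_entry helper and the "Example 3" record are hypothetical, added here to show how an empty or missing list could be handled without raising IndexError.

```python
def first_entry(result, address, type_name="Type1"):
    """Return the first object in the list, or None if it is missing or empty."""
    entries = result.get(address, {}).get(type_name) or []
    return entries[0] if entries else None


result = {
    "Example 1": {"Type1": [{"Owner": "Name1 Example"}]},
    "Example 3": {"Type1": []},  # hypothetical record with an empty list
}

for address in result:
    entry = first_entry(result, address)
    owner = entry["Owner"] if entry else ""  # fall back to an empty cell
    print(address, owner)
```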

Upvotes: 0

Olvin Roght

Reputation: 7812

It's definitely not the best example, but I'm too sleepy to optimize it.

import csv


def json_to_csv(obj, res):
    for k, v in obj.items():
        if isinstance(v, dict):
            res.append(k)
            json_to_csv(v, res)
        elif isinstance(v, list):
            res.append(k)
            for el in v:
                json_to_csv(el, res)
        else:
            res.append(v)


obj = {
  "Result": {
    "Example 1": {
      "Type1": [
        {
          "Owner": "Name1 Example",
          "Description": "Description1 Example",
          "Email": "[email protected]",
          "Phone": "(123) 456-7890"
        }
      ]
    },
    "Example 2": {
      "Type1": [
        {
          "Owner": "Name2 Example",
          "Description": "Description2 Example",
          "Email": "[email protected]",
          "Phone": "(111) 222-3333"
        }
      ]
    }
  }
}

with open("out.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Address","Type","Owner","Description","Email","Phone"])
    for k, v in obj["Result"].items():
        row = [k]
        json_to_csv(v, row)
        writer.writerow(row)
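Note that a positional flattener like this lines up with the header only when every object has the same keys in the same order. As an alternative (a different technique, not what the answer above does), csv.DictWriter places values by column name. The sketch below assumes some records may be missing fields; the iter_records helper and the trimmed sample data are made up for illustration.

```python
import csv
import io


def iter_records(result):
    """Yield one dict per object, tagged with its address and type."""
    for address, types in result.items():
        for type_name, entries in types.items():
            for entry in entries:
                yield {"Address": address, "Type": type_name, **entry}


fieldnames = ["Address", "Type", "Owner", "Description", "Email", "Phone"]

# Hypothetical sample: keys are missing or reordered on purpose.
sample = {
    "Example 1": {"Type1": [{"Owner": "Name1 Example", "Phone": "(123) 456-7890"}]},
    "Example 2": {"Type1": [{"Phone": "(111) 222-3333",
                             "Owner": "Name2 Example",
                             "Description": "Description2 Example"}]},
}

out = io.StringIO()
# restval fills missing columns; extrasaction="ignore" drops unknown keys.
writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", extrasaction="ignore")
writer.writeheader()
writer.writerows(iter_records(sample))
print(out.getvalue())
```

Missing fields come out as empty cells rather than shifting later values into the wrong columns.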

Upvotes: 1
