Biplov
Biplov

Reputation: 1142

CSV header based on JSON dictionary keys

I have a JSON like this for example:

[
    {
        'name':'test1',
        'random1':'random_1_value'
    },
    {
        'name':'test2',
        'random2':'random_2_value'
        'random4':'random_4_value'
    },
    {
        'name':'test3',
        'random3':'random_3_value'
    },
]

I want to convert this JSON and construct CSV header based on dictionary keys. Then fill each row respectively. Expected output:

name, random1, random2, random4, random3
test1, random_1_value
test2, ,random_2_value, random_4_value, ,
test3, , , , random_3_value

This is my code so far:

data = json.loads(open('output_data.json').read())
csvwriter = csv.writer(open("output.csv", "w"))
count = 0
for emp in data:
    if count == 0:
        header = emp.keys()
        csvwriter.writerow(header)
        count += 1
    csvwriter.writerow(emp.values())

Upvotes: 1

Views: 924

Answers (2)

snakecharmerb
snakecharmerb

Reputation: 55630

You can use a combination of the collections and csv modules in the standard library to do this.

We can use a collections.OrderedDict to get the column names in the order in which they appear.

We use an OrderedDict so that the code will work with versions of Python earlier than 3.7. From Python 3.7 onwards the language guarantees that ordinary dictionaries remember the order in which their keys were inserted. If your code will only be run by Python 3.7+ you can use the dict builtin instead of an OrderedDict.

headers = collections.OrderedDict((key, None) for dict_ in data for key in dict_) 

Now that the column names have been generated, use a csv.DictWriter to write each dictionary in the data to the output file. DictWriter automatically handles missing keys, ensuring that each data item is written to the right column.

with open('output.csv', 'w', newlines='') as f:
    writer = csv.DictWriter(f, fieldnames=headers)
    # Write the column names to file.
    writer.writeheader()
    # Write the data to file.
    writer.writerows(data)

Upvotes: 1

aydow
aydow

Reputation: 3801

You should use the with operator when reading and writing files. See here

# Open the JSON file
>>> with open('csv.json') as f:
...     data = json.load(f)

# Get the column names
>>> col = [k for d in data for k in d]

# Create a matrix of the data
>>> matrix = [[d['name']] + [d.get(c, '') for c in col] for d in data]

>>> csv_data = [['name'] + col] + matrix

>>> with open('json.csv', 'w') as f:
...     writer = csv.writer(f)
...     writer.writerows(csv_data)

Upvotes: 0

Related Questions