metersk

Reputation: 12539

Write a list of nested dictionaries to an Excel file in Python

I have a list of nested dictionaries that looks like this:

[{'posts': {'item_1': 1,
            'item_2': 8,
            'item_3': 105,
            'item_4': 324,
            'item_5': 313}},
 {'edits': {'item_1': 1,
            'item_2': 8,
            'item_3': 61,
            'item_4': 178,
            'item_5': 163}},
 {'views': {'item_1': 2345,
            'item_2': 330649,
            'item_3': 12920402,
            'item_4': 46199102,
            'item_5': 43094955}}]

I would like to write it to an Excel file in this format:

+--------+-------+-------+-----------+
|        | posts | edits |   views   |
+--------+-------+-------+-----------+
| item_1 |     1 |     1 |      2345 |
| item_2 |     8 |     8 |    330649 |
| item_3 |   105 |    61 |  12920402 |
| item_4 |   324 |   178 |  46199102 |
| item_5 |   313 |   163 |  43094955 |
+--------+-------+-------+-----------+

I am using the xlsxwriter library and have tried the following, and variations on it, without success:

for item in data:
    for col_name, data in item.iteritems():
        col += 1
        worksheet.write(row, col, col_name)
        for row_name, row_data in data.iteritems():
            col += 1
            worksheet.write(row, col, row_name)
            worksheet.write(row + 1, col, row_data)

I'm wondering whether it makes sense to rework my nested dictionary object, or whether it is possible to write it to Excel in its current form.

When I say without success, I mean that I can get it to write certain things to the Excel file, like the column names, the row names, or the data, but I am unable to get it to write in the layout pictured above. I'm not getting errors; I suspect I just don't know how to unpack this object properly to loop through it. With the code above, I get a mix of row and column names on row 1 and all of the values on row 2.

My output for the code above is:

+--+-------+--------+--------+--------+--------+--------+-------+--------+--------+--------+--------+--------+-------+----------+----------+--------+----------+--------+
|  | posts | item_4 | item_5 | item_2 | item_3 | item_1 | edits | item_4 | item_5 | item_2 | item_3 | item_1 | views |  item_4  |  item_5  | item_2 |  item_3  | item_1 |
+--+-------+--------+--------+--------+--------+--------+-------+--------+--------+--------+--------+--------+-------+----------+----------+--------+----------+--------+
|  |       |    324 |    313 |      8 |    105 |      1 |       |    178 |    163 |      8 |     61 |      1 |       | 46199102 | 43094955 | 330649 | 12920402 |   2345 |
+--+-------+--------+--------+--------+--------+--------+-------+--------+--------+--------+--------+--------+-------+----------+----------+--------+----------+--------+

Upvotes: 2

Views: 5098

Answers (3)

Chankey Pathak

Reputation: 21676

import pandas as pd

data = [{'posts': {'item_1': 1,
                   'item_2': 8,
                   'item_3': 105,
                   'item_4': 324,
                   'item_5': 313, }
         },
        {'edits': {'item_1': 1,
                   'item_2': 8,
                   'item_3': 61,
                   'item_4': 178,
                   'item_5': 163}
         },
        {'views': {'item_1': 2345,
                   'item_2': 330649,
                   'item_3': 12920402,
                   'item_4': 46199102,
                   'item_5': 43094955}
         }]

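# Each list entry becomes a one-column DataFrame indexed by item name;
# concatenating along axis=1 places those columns side by side.
final_df = pd.concat([pd.DataFrame.from_dict(entry) for entry in data], axis=1)

print(final_df)

# Writing .xlsx requires an Excel engine such as openpyxl or xlsxwriter.
final_df.to_excel('data.xlsx')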

Upvotes: 1

Martin Evans

Reputation: 46779

As an alternative, this could be solved using csv as follows:

import csv
import itertools

nested = [
    {'posts': {'item_1': 1, 'item_2': 8, 'item_3': 105, 'item_4': 324, 'item_5': 313,}},
    {'edits': {'item_1': 1, 'item_2': 8, 'item_3': 61, 'item_4': 178, 'item_5': 163}},
    {'views': {'item_1': 2345, 'item_2': 330649, 'item_3': 12920402, 'item_4': 46199102, 'item_5': 43094955}}]

headings = [d.keys()[0] for d in nested]
entries = [sorted(nested[index][col].items()) for index, col in enumerate(headings)]    

with open('output.csv', 'wb') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['items'] + headings)

    for cols in itertools.izip_longest(*entries, fillvalue=['<n/a>']*len(entries[0])):
        csv_output.writerow([cols[0][0]] + [col[1] for col in cols])

This would give you output.csv as follows:

items,posts,edits,views
item_1,1,1,2345
item_2,8,8,330649
item_3,105,61,12920402
item_4,324,178,46199102
item_5,313,163,43094955
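
The snippet above relies on Python 2 behaviour (`d.keys()[0]`, opening the file in `'wb'` mode, `itertools.izip_longest`). A minimal sketch of the same approach for Python 3 follows; it assumes every entry in the list has exactly one outer key, as in the question, and the `('<n/a>', '<n/a>')` fill pair is only an illustrative placeholder for columns that run short.

```python
import csv
from itertools import zip_longest

nested = [
    {'posts': {'item_1': 1, 'item_2': 8, 'item_3': 105, 'item_4': 324, 'item_5': 313}},
    {'edits': {'item_1': 1, 'item_2': 8, 'item_3': 61, 'item_4': 178, 'item_5': 163}},
    {'views': {'item_1': 2345, 'item_2': 330649, 'item_3': 12920402,
               'item_4': 46199102, 'item_5': 43094955}},
]

# dict.keys() is a view in Python 3, so take the single outer key with next(iter(...)).
headings = [next(iter(d)) for d in nested]
# One sorted list of (item, value) pairs per column.
entries = [sorted(d[col].items()) for d, col in zip(nested, headings)]

# Text mode with newline='' is how the csv module expects files to be opened in Python 3.
with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['items'] + headings)

    # Walk the columns in parallel; short columns are padded with the placeholder pair.
    for cols in zip_longest(*entries, fillvalue=('<n/a>', '<n/a>')):
        csv_output.writerow([cols[0][0]] + [col[1] for col in cols])
```

Like the original, this takes the row label from the first column and pairs values by sorted position, so it assumes all columns share the same item names.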

Upvotes: 2

David Zemens

Reputation: 53663

At present you have a separate dict for each of posts, edits, and views, and each one is keyed by the same "items", which seems redundant.

Alternatively, create a single dictionary keyed by your "items", where the value for each item is a dictionary of posts, edits, and views, like:

items = {'item_1': {'posts': 1, 'edits': 0, 'views': 2345},
         'item_2': {'posts': 2, 'edits': 8, 'views': 330649}}

This way you can simply refer to items['item_2']['edits'] (which should yield 8) or items['item_1']['views'] (which should yield 2345), etc.
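
If the data already arrives as the list from the question, it can be reshaped into this item-keyed form in a few lines. The following is a minimal sketch; `pivot_by_item` is a hypothetical helper name, and the sample uses only the first two items from the question's data.

```python
def pivot_by_item(data):
    """Turn [{'posts': {'item_1': 1, ...}}, ...] into {'item_1': {'posts': 1, ...}, ...}."""
    items = {}
    for entry in data:
        for prop, per_item in entry.items():
            for item_name, value in per_item.items():
                # Create the inner dict the first time an item name is seen.
                items.setdefault(item_name, {})[prop] = value
    return items


data = [
    {'posts': {'item_1': 1, 'item_2': 8}},
    {'edits': {'item_1': 1, 'item_2': 8}},
    {'views': {'item_1': 2345, 'item_2': 330649}},
]

items = pivot_by_item(data)
print(items['item_2']['edits'])
print(items['item_1']['views'])
```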

In your case, then, something like:

# write the headers
headers = ['posts', 'edits', 'views']
for col, prop in enumerate(headers):
    worksheet.write(0, col + 1, prop)

# write the data, one row per item
row = 0
for itm in sorted(items):
    row += 1
    worksheet.write(row, 0, itm)
    # look values up by header name so each lands in the matching column
    for col, prop in enumerate(headers):
        worksheet.write(row, col + 1, items[itm][prop])
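
If reshaping is not an option, the original list can also be written directly. In the question's loop, `col` is incremented inside the inner loop and `row` never changes, which is why everything ends up on two rows. The sketch below advances the column once per outer key and the row once per item instead. It assumes `worksheet` is any object with a `write(row, col, value)` method, as xlsxwriter worksheets have; `write_table` and `RecordingSheet` are hypothetical names, and the stub exists only so the layout can be inspected without creating a file.

```python
def write_table(worksheet, data):
    """Write one column per outer key and one row per inner key."""
    # Collect every row label across all inner dicts, sorted for a stable order.
    row_names = sorted({name for entry in data
                        for inner in entry.values()
                        for name in inner})
    for row, name in enumerate(row_names, start=1):
        worksheet.write(row, 0, name)

    col = 0
    for entry in data:
        for col_name, inner in entry.items():
            col += 1  # one new column per outer key, not per value
            worksheet.write(0, col, col_name)
            for row, name in enumerate(row_names, start=1):
                if name in inner:
                    worksheet.write(row, col, inner[name])


class RecordingSheet:
    """Hypothetical stand-in for a worksheet; records each write() call."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


data = [
    {'posts': {'item_1': 1, 'item_2': 8, 'item_3': 105, 'item_4': 324, 'item_5': 313}},
    {'edits': {'item_1': 1, 'item_2': 8, 'item_3': 61, 'item_4': 178, 'item_5': 163}},
    {'views': {'item_1': 2345, 'item_2': 330649, 'item_3': 12920402,
               'item_4': 46199102, 'item_5': 43094955}},
]

sheet = RecordingSheet()
write_table(sheet, data)

# With xlsxwriter installed, the same function can target a real worksheet:
#   import xlsxwriter
#   workbook = xlsxwriter.Workbook('data.xlsx')
#   write_table(workbook.add_worksheet(), data)
#   workbook.close()
```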

Upvotes: 2
