Yngve

Reputation: 783

Write Python dictionary to CSV where keys = columns, values = rows

I have a list of dictionaries that I want to be able to open in Excel, formatted correctly. This is what I have so far, using csv:

import csv

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
out_file = open(out_path, 'wb')

writer = csv.writer(out_file, dialect='excel')

# Writes one (key, value) pair per row
for items in list_of_dicts:
    for k, v in items.items():
        writer.writerow([k, v])

Obviously, when I open the output in Excel, it's formatted like this:

key  value
key  value

What I want is this:

key   key   key

value value value

I can't figure out how to do this, so help would be appreciated. Also, I want the column names to be the dictionary keys, instead of the default 'A, B, C' etc. Sorry if this is stupid.

Thanks

Upvotes: 5

Views: 18864

Answers (3)

Maciek Woźniak

Reputation: 455

I think the most useful approach is to write the data column by column, so that each key becomes a column (convenient for later data processing, e.g. for ML).

I had some trouble figuring this out yesterday, but I arrived at a solution based on one I saw on another website. From what I can see, it is not possible to write the whole dictionary at once, so we have to split it into smaller dictionaries, one per row (my CSV file ended up with 20k rows: each surveyed person, their data and answers). I did it like this:

    # Writing a dict of lists `d` to CSV.
    # `cleaned` is the output file object, already open for writing.

    # 1. Header: the field names (column names) are the keys of `d`.

    # 2. Create the writer.
    writer = csv.DictWriter(cleaned, list(d.keys()))

    # 3. Write the header row.
    writer.writeheader()

    # 4. Write one small dictionary per row, taking the i-th entry of every column.
    for i in range(len(list(d.values())[0])):
        writer.writerow({key: d[key][i] for key in d.keys()})

My solution has one more for loop, but on the other hand I think it uses less memory (though I am not sure!). Hope it helps somebody ;)
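
For reference, here is a minimal Python 3 sketch of the same row-by-row idea. The sample data in `d` is hypothetical, and `cleaned` is an in-memory `io.StringIO` buffer standing in for the real output file (both are assumptions for illustration). It uses `zip(*d.values())` to pair up the i-th entries of each column instead of indexing, and it assumes all lists have the same length and that the dict keeps insertion order (Python 3.7+).

```python
import csv
import io

# Hypothetical dict of equal-length lists: each key is one column.
d = {'name': ['Ann', 'Bob'], 'age': [31, 27]}

# In-memory stand-in for the output file (assumption for illustration).
cleaned = io.StringIO()

writer = csv.DictWriter(cleaned, fieldnames=list(d.keys()))
writer.writeheader()

# zip(*d.values()) yields one tuple per row: ('Ann', 31), ('Bob', 27).
for values in zip(*d.values()):
    writer.writerow(dict(zip(d.keys(), values)))

csv_text = cleaned.getvalue()
```

Each row is built and written one at a time, so only a single small dictionary exists per iteration, as in the loop above.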

Upvotes: 0

Martijn Pieters

Reputation: 1123520

You need to write two separate rows instead, one with the keys and one with the values:

writer = csv.writer(out_file, dialect='excel')

# Flatten all keys into one row, then all values into the next row
writer.writerow([k for d in list_of_dicts for k in d])
writer.writerow([v for d in list_of_dicts for v in d.values()])

The two list comprehensions extract first all the keys, then all the values, from the dictionaries in your input list, combining these into one list to write to the CSV file.
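
As a self-contained check of this two-row approach, the following Python 3 sketch writes to an in-memory `io.StringIO` buffer (an assumption made here so the example runs without touching the filesystem) and collects the resulting text in `csv_text`:

```python
import csv
import io

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]

# In-memory stand-in for the output file (assumption for illustration).
buffer = io.StringIO()
writer = csv.writer(buffer, dialect='excel')

# First row: every key from every dictionary, in list order.
writer.writerow([k for d in list_of_dicts for k in d])
# Second row: the matching values, in the same order.
writer.writerow([v for d in list_of_dicts for v in d.values()])

csv_text = buffer.getvalue()
```

With the sample input, `csv_text` holds the header line `hello,yes` followed by `goodbye,no`, each terminated by `\r\n` (the `excel` dialect's line terminator).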

Upvotes: 2

Alex Willmer

Reputation: 564

The csv module has a DictWriter class for this, which is covered quite nicely in another SO answer. The critical point is that you need to know all your column headings when you instantiate the DictWriter. You could construct the list of field names from your list_of_dicts; if so, your code becomes:

import csv

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
# 'wb' is for Python 2; on Python 3 use open(out_path, 'w', newline='')
out_file = open(out_path, 'wb')

# Collect every key that appears in any of the dictionaries
fieldnames = sorted(set(k for d in list_of_dicts for k in d))
writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')

writer.writeheader()  # Assumes Python >= 2.7
for row in list_of_dicts:
    writer.writerow(row)
out_file.close()

The way I've constructed fieldnames scans the entire list_of_dicts, so it gets slower as the list grows. Where possible, construct fieldnames directly from the source of your data instead; e.g. if the source of your data is also a csv file, you can use a DictReader and set fieldnames = reader.fieldnames.
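
A minimal Python 3 sketch of the DictReader route, assuming the source is CSV text with a header row. Here `source` and `out_file` are hypothetical in-memory `io.StringIO` buffers standing in for real files:

```python
import csv
import io

# Hypothetical source data standing in for an existing CSV file.
source = io.StringIO('hello,yes\ngoodbye,no\n')
reader = csv.DictReader(source)

# In-memory stand-in for the output file (assumption for illustration).
out_file = io.StringIO()

# reader.fieldnames is taken from the header row of the source,
# so there is no need to scan all the rows to find the column names.
writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames, dialect='excel')
writer.writeheader()
writer.writerows(reader)

csv_text = out_file.getvalue()
```

The field names come from the first line of the source only, so the cost of finding them does not grow with the number of rows.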

You can also replace the for loop with a single call to writer.writerows(list_of_dicts) and use a with block to handle file closure, in which case your code would become:

import csv

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"

# Collect every key that appears in any of the dictionaries
fieldnames = sorted(set(k for d in list_of_dicts for k in d))

# 'wb' is for Python 2; on Python 3 use open(out_path, 'w', newline='')
with open(out_path, 'wb') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
    writer.writeheader()
    writer.writerows(list_of_dicts)
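
On Python 3 the csv module expects a text-mode file opened with `newline=''` rather than `'wb'`. The sketch below is a Python 3 variant of the code above. The temporary directory and the `outfile.csv` name are assumptions chosen only to keep the example self-contained, and the final read-back into `csv_text` is there just to inspect the result.

```python
import csv
import os
import tempfile

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
# Hypothetical output location; a temp directory keeps the sketch self-contained.
out_path = os.path.join(tempfile.mkdtemp(), 'outfile.csv')

# Collect every key that appears in any of the dictionaries.
fieldnames = sorted(set(k for d in list_of_dicts for k in d))

# Python 3: text mode with newline='' so the writer controls line endings.
with open(out_path, 'w', newline='') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
    writer.writeheader()
    writer.writerows(list_of_dicts)

# Read the file back unchanged to inspect what was written.
with open(out_path, newline='') as check:
    csv_text = check.read()
```

Note that DictWriter writes one row per dictionary and fills any missing key with an empty string (its default `restval`), so with this sample input the rows after the header are `goodbye,` and `,no`.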

Upvotes: 6
