Reputation: 783
I have a list of dictionaries that I want to be able to open in Excel, formatted correctly. This is what I have so far, using csv:
import csv
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
out_file = open(out_path, 'wb')
writer = csv.writer(out_file, dialect='excel')
for items in list_of_dicts:
    for k, v in items.items():
        writer.writerow([k, v])
Obviously, when I open the output in Excel, it's formatted like this:
key value
key value
What I want is this:
key key key
value value value
I can't figure out how to do this, so help would be appreciated. Also, I want the column names to be the dictionary keys, instead of the default 'A, B, C' etc. Sorry if this is stupid.
Thanks
Upvotes: 5
Views: 18864
Reputation: 455
I think the most useful approach is to write the data column by column, so that each key becomes a column (convenient for later data processing, e.g. for ML).
I had some trouble figuring this out yesterday, but I ended up with a solution I saw on another website. From what I can see, it is not possible to write the whole dictionary at once, so we have to split it into smaller dictionaries, one per row (my csv file had 20k rows in the end: surveyed people, their data and their answers). I did it like this:
# writing dict to csv
# 'cleaned' is the open output file; 'd' maps each column name to a list of values
# 1 header
# fieldnames are going to be the column names
# 2 create writer
writer = csv.DictWriter(cleaned, d.keys())
# 3 attach header
writer.writeheader()
# write separate dictionaries, one per row
for i in range(len(list(d.values())[0])):
    writer.writerow({key: d[key][i] for key in d.keys()})
My solution has one more for loop, but on the other hand I think it takes less memory (though I am not sure!). Hope it helps somebody ;)
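As a possible alternative to indexing every column by hand, the same column-to-row conversion can be sketched with zip. This is only a sketch for Python 3, assuming every list in the dictionary has the same length; the sample data and the in-memory buffer (standing in for the output file) are made up for illustration.

```python
import csv
import io

# Hypothetical column-oriented data: each key maps to a list of equal length.
d = {"name": ["Ann", "Bob"], "age": [31, 45]}

# In-memory buffer standing in for a file opened with newline="".
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

writer.writerow(d.keys())           # header row: one column per key
writer.writerows(zip(*d.values()))  # zip(*...) turns the columns into rows

csv_text = buffer.getvalue()
print(csv_text)
```

zip yields one tuple per row lazily, so the rows are not all built in memory before writing.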
Upvotes: 0
Reputation: 1123520
You need to write 2 separate rows, one with the keys, one with the values, instead:
writer = csv.writer(out_file, dialect='excel')
writer.writerow([k for d in list_of_dicts for k in d])
# itervalues() is Python 2 only; use d.values() on Python 3
writer.writerow([v for d in list_of_dicts for v in d.itervalues()])
The two list comprehensions extract first all the keys, then all the values, from the dictionaries in your input list, combining these into one list to write to the CSV file.
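For readers on Python 3, a minimal sketch of the same two-row idea follows. It assumes d.values() in place of itervalues(), and it writes to an in-memory buffer (an illustrative stand-in for the output file) so the result can be inspected directly.

```python
import csv
import io

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]

# In-memory buffer standing in for the real output file.
buffer = io.StringIO()
writer = csv.writer(buffer, dialect='excel', lineterminator='\n')

# First row: every key from every dictionary, in order.
writer.writerow([k for d in list_of_dicts for k in d])
# Second row: the matching values, in the same order.
writer.writerow([v for d in list_of_dicts for v in d.values()])

two_rows = buffer.getvalue()
print(two_rows)
```

Because keys and values are read from the same dictionaries in the same order, each value lands under its own key.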
Upvotes: 2
Reputation: 564
The csv module has a DictWriter class for this, which is covered quite nicely in another SO answer. The critical point is that you need to know all your column headings when you instantiate the DictWriter. You could construct the list of field names from your list_of_dicts; if so, your code becomes:
import csv
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
out_file = open(out_path, 'wb')
fieldnames = sorted(set(k for d in list_of_dicts for k in d))
writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
writer.writeheader()  # Assumes Python >= 2.7
for row in list_of_dicts:
    writer.writerow(row)
out_file.close()
The way I've constructed fieldnames scans the entire list_of_dicts, so it will slow down as the size increases. You should instead construct fieldnames directly from the source of your data, e.g. if the source of your data is also a csv file you can use a DictReader and use fieldnames = reader.fieldnames.
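A rough sketch of that DictReader idea, for Python 3, might look like the following. The source data here is a made-up in-memory string standing in for an existing csv file, and the target buffer stands in for the output file.

```python
import csv
import io

# Hypothetical source data; in practice this would be an open csv file.
source = io.StringIO("hello,yes\ngoodbye,no\n")
reader = csv.DictReader(source)

# In-memory buffer standing in for the output file.
target = io.StringIO()

# reader.fieldnames is taken from the header row of the source,
# so there is no need to scan every record to collect the keys.
writer = csv.DictWriter(target, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(reader)  # each row from the reader is already a dict

copied = target.getvalue()
print(copied)
```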
You can also replace the for loop with a single call to writer.writerows(list_of_dicts) and use a with block to handle file closure, in which case your code would become:
import csv
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
fieldnames = sorted(set(k for d in list_of_dicts for k in d))
with open(out_path, 'wb') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
    writer.writeheader()
    writer.writerows(list_of_dicts)
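The code above targets Python 2. On Python 3 the csv module expects a text-mode file opened with newline='' rather than 'wb', so a port could look roughly like this sketch. The temporary output path is an assumption made only to keep the example self-contained.

```python
import csv
import os
import tempfile

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
fieldnames = sorted(set(k for d in list_of_dicts for k in d))

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), 'outfile.csv')

# Python 3: text mode with newline='' instead of 'wb'.
with open(out_path, 'w', newline='') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
    writer.writeheader()
    writer.writerows(list_of_dicts)

# Read the file back to see what Excel would be given.
with open(out_path, newline='') as in_file:
    rows = list(csv.reader(in_file))
print(rows)
```

Note that DictWriter writes one row per dictionary and leaves an empty cell for any missing key, so this sample input produces a header row followed by two partly empty rows. If a single row of values is wanted, the dictionaries would have to be merged into one before writing.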
Upvotes: 6