How to write dict in certain format to file

Question

Assume I have multiple lists of dict's, something along the lines of

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

My goal is to write this to a file in the form

           Action       Comedy       Horror      
list_one  meanScore   meanScore    
           amount       amount       
list_two              meanScore     meanScore
                        amount       amount

I'm not really familiar with dict and what the best way is to store them, but it seems that csv- files are quite popular for that. I tried to use this answer here to solve my problem, but I'm having difficulty in understanding what @MarkLongair does and how you would expand that to my problem. One of the main things that concerns me is that not every genre is part of every list, so I don't know how to check in the existing csv file if the key exists, where it is located and how to write the value into the right column.

Since I couldn't really understand the linked answer I tried something along the lines of

from pandas import DataFrame

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, 
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

DataFrame(list_one).to_csv('test.csv')
DataFrame(list_two).to_csv('test.csv')

which doesn't really work since the data get's overwritten and the things I wanted to be the columns get transformed to rows...

I'm not sure how to go on form here or what exactly the right direction is... Can somebody maybe help a bit?

Bill M. · Accepted Answer

One way to get around this without using Pandas [EDIT: I see you've since edited to mention this] is to make a function that looks at one of your dictionaries, and composes the appropriate line of CSV text.

def generate_row(separator, headers, data_type, data_list, list_name):
    data_by_genre = {k: '' for k in headers}
    for data in data_list:
        data_by_genre[data['genre']] = str(data[data_type])

    output_text = separator.join([data_by_genre[genre] for genre in headers]) + '
'
    # If it's 'amount', then the row starts with the name. Otherwise that space is blank.
    if data_type == 'amount':
        output_text = list_name + output_text

    return output_text


list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]
list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

headers = ['', 'Action', 'Comedy', 'Horror']
separator = ','

f = open('new.csv', 'w')
f.write(separator.join(headers))
f.write('
')
f.write(generate_row(separator, headers, 'amount', list_one, 'list_one'))
f.write(generate_row(separator, headers, 'meanScore', list_one, 'list_one'))
f.write(generate_row(separator, headers, 'amount', list_two, 'list_two'))
f.write(generate_row(separator, headers, 'meanScore', list_two, 'list_two'))
f.close()

I made "separator" a variable in case you want to use e.g. tab-delimited instead of commas.

If you want to use Pandas though, you can write something to reformat your data to look like this, so it writes "correctly".

data1 = [{'Action': 141, 'Comedy': 191, 'Horror': None},
         {'Action': 82, 'Comedy': 82, 'Horror': None},
         {'Action': None, 'Comedy': 191, 'Horror': 11},
         {'Action': None, 'Comedy': 82, 'Horror': 62}]

DataFrame(data1).to_csv('test.csv')

How to write dict in certain format to file

Answers (2)

Related Questions