Marius Jaeger
Marius Jaeger

Reputation: 35

How to write dict in certain format to file

Assume I have multiple lists of dict's, something along the lines of

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

My goal is to write this to a file in the form

           Action       Comedy       Horror      
list_one  meanScore   meanScore    
           amount       amount       
list_two              meanScore     meanScore
                        amount       amount

I'm not really familiar with dict and what the best way is to store them, but it seems that csv- files are quite popular for that. I tried to use this answer here to solve my problem, but I'm having difficulty in understanding what @MarkLongair does and how you would expand that to my problem. One of the main things that concerns me is that not every genre is part of every list, so I don't know how to check in the existing csv file if the key exists, where it is located and how to write the value into the right column.

Since I couldn't really understand the linked answer I tried something along the lines of

from pandas import DataFrame

list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82},
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, 
            {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

DataFrame(list_one).to_csv('test.csv')
DataFrame(list_two).to_csv('test.csv')

which doesn't really work since the data get's overwritten and the things I wanted to be the columns get transformed to rows...

I'm not sure how to go on form here or what exactly the right direction is... Can somebody maybe help a bit?

Upvotes: 1

Views: 102

Answers (2)

Bill M.
Bill M.

Reputation: 1548

One way to get around this without using Pandas [EDIT: I see you've since edited to mention this] is to make a function that looks at one of your dictionaries, and composes the appropriate line of CSV text.

def generate_row(separator, headers, data_type, data_list, list_name):
    data_by_genre = {k: '' for k in headers}
    for data in data_list:
        data_by_genre[data['genre']] = str(data[data_type])

    output_text = separator.join([data_by_genre[genre] for genre in headers]) + '\n'
    # If it's 'amount', then the row starts with the name. Otherwise that space is blank.
    if data_type == 'amount':
        output_text = list_name + output_text

    return output_text


list_one = [{'genre': 'Action', 'amount': 141, 'meanScore': 82}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]
list_two = [{'genre': 'Horror', 'amount': 11, 'meanScore': 62}, {'genre': 'Comedy', 'amount': 191, 'meanScore': 82}]

headers = ['', 'Action', 'Comedy', 'Horror']
separator = ','

f = open('new.csv', 'w')
f.write(separator.join(headers))
f.write('\n')
f.write(generate_row(separator, headers, 'amount', list_one, 'list_one'))
f.write(generate_row(separator, headers, 'meanScore', list_one, 'list_one'))
f.write(generate_row(separator, headers, 'amount', list_two, 'list_two'))
f.write(generate_row(separator, headers, 'meanScore', list_two, 'list_two'))
f.close()

I made "separator" a variable in case you want to use e.g. tab-delimited instead of commas.

If you want to use Pandas though, you can write something to reformat your data to look like this, so it writes "correctly".

data1 = [{'Action': 141, 'Comedy': 191, 'Horror': None},
         {'Action': 82, 'Comedy': 82, 'Horror': None},
         {'Action': None, 'Comedy': 191, 'Horror': 11},
         {'Action': None, 'Comedy': 82, 'Horror': 62}]

DataFrame(data1).to_csv('test.csv')

Upvotes: 1

Falk Schuetzenmeister
Falk Schuetzenmeister

Reputation: 1597

In the first version of your question, you did not mention that you are operating within pandas which is really distinct from the Python standard libraries and important information. Pandas is not really needed to do this but I assume that you are using pandas for other reasons.

DataFrame(list1 + list2).to_csv('test.csv')

See also

How to add pandas data to an existing csv file?

if you want to append while writing rather combining the lists before turning into dataframe.

Other solutions outside pandas would be csv.DictWriter in the csv library or JSON serialization (if CSV is not a requirement).

Upvotes: 0

Related Questions