flowerflower

Reputation: 337

How to convert a CSV file to a list of dictionaries (UTF-8)?

I have a CSV file (in.csv):

col1, col2, col3
Kapitän, Böse, Füller
...

and I want to create a list of dictionaries:

a = [{'col1': 'Kapitän',  'col2': 'Böse', 'col3': 'Füller'},{...}]

With Python 3 this works:

    import codecs
    import csv

    with codecs.open('in.csv', encoding='utf-8') as f:
        a = [{k: v for k, v in row.items()}
            for row in csv.DictReader(f, skipinitialspace=True)]
    print(a)

(I got this code from "convert csv file to list of dictionaries".)

Unfortunately I need this for Python 2, and I can't get it to work there.

I tried to understand https://docs.python.org/2.7/howto/unicode.html, but I couldn't make sense of it, because

import codecs
f = codecs.open('in.csv', encoding='utf-8')
for line in f:
    print repr(line)

gives me

u'col1,col2,col3\n'
u'K\xe4pten,B\xf6se,F\xfcller\n'
u'\n'

Do you have a solution for Python 2?

There is a similar problem solved here: Creating a dictionary from a csv file? But with the accepted solution I get ('K\xc3\xa4pten', 'B\xc3\xb6se', 'F\xc3\xbcller'). Is there an easy way to adapt it so that I get [{u'col1': u'K\xe4pten', u'col2': u'B\xf6se', u'col3': u'F\xfcller'}]?

Upvotes: 0

Views: 824

Answers (2)

OrangeFish

Reputation: 32

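To print the text, use print line instead of print repr(line); repr only shows the escaped form of the string.

For the dictionaries I use the solution from the csv examples in the documentation:

https://docs.python.org/2/library/csv.html#csv-examples

In Python 2 the csv module doesn't directly support reading and writing Unicode, so the input is encoded to UTF-8 for the reader and each cell is decoded afterwards: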

import codecs
import csv


def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')

def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

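with codecs.open('in.csv', encoding='utf-8') as f:
    # skipinitialspace drops the blank after each comma, as in the question's file
    reader = unicode_csv_reader(f, skipinitialspace=True)
    keys = [k.strip() for k in reader.next()]
    result = []
    for row in reader:
        if not row:
            continue  # csv.reader yields [] for blank lines
        d = dict(zip(keys, row))
        result.append(d)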

    for d in result:
        for k, v in d.iteritems():
            print k, v
    print result
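
In case it helps with the escapes from the question: 'K\xc3\xa4pten' and u'K\xe4pten' are the same word, once as UTF-8 bytes and once as unicode text. A small sketch of that relationship (the variable names are only for illustration; it should run under both Python 2 and Python 3):

```python
# The same word as unicode text and as UTF-8 bytes.
text = u'K\xe4pten'             # unicode text, 6 characters
raw = text.encode('utf-8')      # UTF-8 bytes, the umlaut takes two bytes
decoded = raw.decode('utf-8')   # decoding gives the unicode text back

# repr shows the escaped form; printing the text itself shows the umlaut
print(repr(text))
print(repr(raw))
```

So the \x escapes in the output are only how repr displays non-ASCII characters, not a sign that the data is broken.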

Upvotes: 0

Mike Tung

Reputation: 4821

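You can leverage the csv module for the job. In Python 2, csv.DictReader takes no encoding argument, so decode the keys and values yourself:

import csv

li_of_dicts = []
# Python 2: the csv module works on byte strings, so open in binary mode
with open('in.csv', 'rb') as infile:
    reader = csv.DictReader(infile, skipinitialspace=True)
    for row in reader:
        # decode every key and value from UTF-8 bytes to unicode
        li_of_dicts.append({k.decode('utf-8'): v.decode('utf-8')
                            for k, v in row.items()})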
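
If the same script also has to run under Python 3, one option is to branch on the interpreter version. This is only a sketch: read_csv_as_dicts is a made-up helper name, and the Python 2 branch assumes every row has a value for every column.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def read_csv_as_dicts(path, encoding='utf-8'):
    """Return the rows of a CSV file as a list of dicts holding unicode text."""
    if PY2:
        # Python 2: csv wants byte strings, so read bytes and decode each cell.
        # Assumes no row is shorter than the header (values would be None).
        with open(path, 'rb') as f:
            reader = csv.DictReader(f, skipinitialspace=True)
            return [{k.decode(encoding): v.decode(encoding)
                     for k, v in row.items()} for row in reader]
    # Python 3: csv wants text, so let io.open do the decoding.
    with io.open(path, encoding=encoding, newline='') as f:
        return [dict(row) for row in csv.DictReader(f, skipinitialspace=True)]
```

Calling read_csv_as_dicts('in.csv') should then give a list with one dictionary per data row on either version.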

Upvotes: 1
