marquillo

Reputation: 87

Convert CSV into a list of dictionaries in Python

I have a CSV file where the first row holds the headers and the remaining rows hold the data in columns.

I am using Python to parse this data into a list of dictionaries.

Normally I would use this code:

import csv

def csv_to_list_of_dictionaries(file):
    with open(file) as f:
        a = []
        # each row becomes a dictionary keyed by the header row
        for row in csv.DictReader(f, skipinitialspace=True):
            a.append(dict(row))
        return a

But because the data in one column is stored as a dictionary, this code doesn't work: it splits the key:value pairs of that dictionary across columns.

So the data in my CSV file looks like this:

col1,col2,col3,col4
1,{'a':'b', 'c':'d'},'bla',sometimestamp

The dictionary created from this looks like: {col1:1, col2:{'a':'b', col3: 'c':'d'}, col4: 'bla'}
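
The split can be reproduced in a few lines. This is only a minimal sketch: it feeds the sample row through csv.DictReader from an in-memory string (the io.StringIO wrapper and the variable names are assumptions for illustration, not part of my real code). The comma inside the braces is treated as a field separator, so the row has five fields for four headers, and the leftover field ends up under the None key, which is DictReader's default restkey.

```python
import csv
import io

# Sample data from the question, held in memory instead of a file.
sample = "col1,col2,col3,col4\n1,{'a':'b', 'c':'d'},'bla',sometimestamp\n"

rows = list(csv.DictReader(io.StringIO(sample), skipinitialspace=True))
row = rows[0]

# col2 and col3 each hold one half of the dictionary text, col4 holds
# 'bla', and the timestamp is pushed into a list under the None key.
print(row)
```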

The result I want is: {col1:1, col2:{'a':'b', 'c':'d'}, col3: 'bla', col4: sometimestamp}

Upvotes: 1

Views: 185

Answers (1)

wwii

Reputation: 23783

Don't use the csv module; use a regular expression to extract the fields from each row, then build the dictionaries from the extracted fields.

Example file:

col1,col2,col3,col4
1,{'a':'b', 'c':'d'},'bla',sometimestamp
2,{'a':'b', 'c':'d'},'bla',sometimestamp
3,{'a':'b', 'c':'d'},'bla',sometimestamp
4,{'a':'b', 'c':'d'},'bla',sometimestamp
5,{'a':'b', 'c':'d'},'bla',sometimestamp
6,{'a':'b', 'c':'d'},'bla',sometimestamp


import re
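# [^,\n]* keeps each plain field on one line, so a trailing newline cannot leak into col4;
# {.*} captures the whole dictionary, commas included
pattern = r'^([^,\n]*),({.*}),([^,\n]*),([^,\n]*)$'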
regex = re.compile(pattern,flags=re.M)

def csv_to_list_of_dictionaries(file):
    with open(file) as f:
        columns = next(f).strip().split(',')
        stuff = regex.findall(f.read())
    a = [dict(zip(columns,values)) for values in stuff]
    return a

stuff = csv_to_list_of_dictionaries('example.csv')  # hypothetical path to the example file above

In [20]: stuff
Out[20]: 
[{'col1': '1',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '2',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '3',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '4',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '5',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '6',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'}]
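
The values are still strings at this point. If the second column always holds a valid Python literal, as it does in the example file, ast.literal_eval from the standard library can turn it into a real dict. The following is a rough sketch under that assumption rather than a definitive solution: parse_row is a hypothetical helper name, converting col1 with int is a guess about the data, and the sample text is held in memory instead of being read from a file.

```python
import ast
import re

# The braces anchor the second field so its internal commas are kept;
# newlines are excluded from the plain fields so each match stays on one line.
pattern = re.compile(r'^([^,\n]*),(\{.*\}),([^,\n]*),([^,\n]*)$', flags=re.M)

def parse_row(columns, values):
    """Hypothetical helper: build one dict and convert the literal fields."""
    row = dict(zip(columns, values))
    row['col1'] = int(row['col1'])               # assumes col1 is an integer
    row['col2'] = ast.literal_eval(row['col2'])  # dictionary text -> dict
    row['col3'] = ast.literal_eval(row['col3'])  # quoted text -> plain str
    return row

text = "col1,col2,col3,col4\n1,{'a':'b', 'c':'d'},'bla',sometimestamp\n"
header, _, body = text.partition('\n')
columns = header.split(',')
result = [parse_row(columns, values) for values in pattern.findall(body)]
print(result)
```

ast.literal_eval only accepts literals (strings, numbers, dicts, lists and so on), so it is safer than eval for this kind of data, but it raises an exception on anything that is not a valid literal, which is why col4 is left as a string here.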

Upvotes: 2
