Reputation: 87
I have a CSV file where the first row holds the headers and the other rows hold the data in columns.
I am using Python to parse this data into a list of dictionaries.
Normally I would use this code:
import csv

def csv_to_list_of_dictionaries(file):
    with open(file) as f:
        a = []
        # DictReader takes the dictionary keys from the header row
        for row in csv.DictReader(f, skipinitialspace=True):
            a.append({k: v for k, v in row.items()})
    return a
but because the data in one column is stored as a dictionary, this code doesn't work: it splits the key:value pairs of that dictionary at the commas.
The data in my CSV file looks like this:
col1,col2,col3,col4
1,{'a':'b', 'c':'d'},'bla',sometimestamp
The dictionary created from this looks like this: {col1:1, col2:{'a':'b', col3: 'c':'d'}, col4: 'bla'}
What I want as the result is: {col1:1, col2:{'a':'b', 'c':'d'}, col3: 'bla', col4: sometimestamp}
Upvotes: 1
Views: 185
Reputation: 23783
Don't use the csv module; use a regular expression to extract the fields from each row, then make dictionaries from the extracted fields.
Example file:
col1,col2,col3,col4
1,{'a':'b', 'c':'d'},'bla',sometimestamp
2,{'a':'b', 'c':'d'},'bla',sometimestamp
3,{'a':'b', 'c':'d'},'bla',sometimestamp
4,{'a':'b', 'c':'d'},'bla',sometimestamp
5,{'a':'b', 'c':'d'},'bla',sometimestamp
6,{'a':'b', 'c':'d'},'bla',sometimestamp
Code:
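import re

# [^,\n]* keeps a field on its own line; {.*} captures the whole dictionary, inner commas included
pattern = r'^([^,\n]*),({.*}),([^,\n]*),([^,\n]*)$'
regex = re.compile(pattern, flags=re.M)

def csv_to_list_of_dictionaries(file):
    with open(file) as f:
        # the first line holds the column names
        columns = next(f).strip().split(',')
        stuff = regex.findall(f.read())
    a = [dict(zip(columns, values)) for values in stuff]
    return a

stuff = csv_to_list_of_dictionaries('example.csv')  # file name assumed; pass the path to your own file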
In [20]: stuff
Out[20]:
[{'col1': '1',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '2',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '3',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '4',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '5',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'},
 {'col1': '6',
  'col2': "{'a':'b', 'c':'d'}",
  'col3': "'bla'",
  'col4': 'sometimestamp'}]
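Note that in the output above col2 is still a string, while the question asks for a real dictionary. One possible way to get there is to run the captured text through ast.literal_eval, which only evaluates Python literals. The sketch below is an illustration, not a definitive implementation: the helper name parse_rows is hypothetical, the column names col2 and col3 are hard-coded on the assumption that they always hold a dict literal and a quoted string, and col1 is left as a string.

```python
import ast
import re

# Same idea as the pattern above; [^,\n] keeps a field from running past its line.
ROW = re.compile(r"^([^,\n]*),(\{.*\}),([^,\n]*),([^,\n]*)$", flags=re.M)

def parse_rows(text):
    """Hypothetical helper: parse CSV-like text whose second column is a dict literal."""
    header, _, body = text.partition("\n")
    columns = header.strip().split(",")
    rows = []
    for values in ROW.findall(body):
        row = dict(zip(columns, values))
        # Assumption: col2 holds a dict literal and col3 a quoted string.
        # literal_eval only accepts Python literals, so it is safer than eval().
        row["col2"] = ast.literal_eval(row["col2"])
        row["col3"] = ast.literal_eval(row["col3"])
        rows.append(row)
    return rows

sample = "col1,col2,col3,col4\n1,{'a':'b', 'c':'d'},'bla',sometimestamp\n"
print(parse_rows(sample))
# [{'col1': '1', 'col2': {'a': 'b', 'c': 'd'}, 'col3': 'bla', 'col4': 'sometimestamp'}]
```

If a cell in col2 or col3 is not a valid Python literal, ast.literal_eval raises ValueError or SyntaxError, so real data may need a try/except around those two lines.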
Upvotes: 2