Reputation: 828
I am reading in a CSV via csv.DictReader and trying to replace any empty values with None. DictReader yields each row of the CSV as a dictionary (which I am fine with). However, when I try to iterate through it row by row and replace any empty values ("") with None, I come unstuck. I had previously written this as a list comprehension like this:
for row in data:
    row = [None if not x else x for x in row]
But I need to switch to using dictionaries rather than lists. I have no experience with dictionary comprehensions, and when I try to extend this to dictionaries I just can't get it to work. I was thinking something along the lines of:
for row in data:
    row.values() = [None if not x else x for x in row.values()}
but I just get SyntaxError: invalid syntax. I've tried a lot of other things (too many to list here) like:
for row in data:
    row = {k:None for k,v in row if v not v else v}
but this seems to have the same problem.
For reference, my data looks like:
{'colour': 'ab6612', 'line': '1', 'name': 'Baker', 'stripe': ''}
{'colour': 'f7dc00', 'line': '3', 'name': '', 'stripe': 'FFFFFF'}
and would ideally end up as:
{'colour': 'ab6612', 'line': '1', 'name': 'Baker', 'stripe': None}
{'colour': 'f7dc00', 'line': '3', 'name': None, 'stripe': 'FFFFFF'}
Upvotes: 2
Views: 5498
Reputation: 1502
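If you are reading a CSV on Python 2 and the data is large, use iteritems() rather than items(); this avoids the intermediate list that items() builds. (On Python 3, items() already returns a lazy view and iteritems() no longer exists.) Try: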
new_data = []
for row in data:
    new_data.append({k: (v if v else None) for k, v in row.iteritems()})
If you don't understand comprehensions, this simple for loop does the same thing in place:
for row in data:
    for k, v in row.iteritems():
        if not v:
            row[k] = None
The second method is easier to understand and does not create an additional list, which is better for performance.
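For anyone on Python 3, here is a rough sketch of the same in-place loop using items(). It assumes data is already a list of row dictionaries; the two sample rows are copied from the question.

```python
# Sample rows copied from the question; in practice these would come from the CSV.
data = [
    {'colour': 'ab6612', 'line': '1', 'name': 'Baker', 'stripe': ''},
    {'colour': 'f7dc00', 'line': '3', 'name': '', 'stripe': 'FFFFFF'},
]

for row in data:
    # items() is a lazy view on Python 3, so no intermediate list is built.
    for k, v in row.items():
        if not v:
            # Only the value of an existing key changes; no keys are added
            # or removed, so the dictionary's size stays the same.
            row[k] = None
```

Because each row is modified in place, data itself ends up holding the cleaned dictionaries.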
Upvotes: 0
Reputation: 91009
Your issue is that you are rebinding the name row to a new dictionary inside the for loop; this does not change anything inside your original list/DictReader object, data.
If data is a list, you should enumerate over data and change the dictionary inside data (or make that position reference a new dictionary).
Example -
for i,row in enumerate(data):
    data[i] = {k:(v if v else None) for k,v in row.items()}
Example test -
>>> data = [{1:2 , 3:''},{4:'',5:6}]
>>> for i,row in enumerate(data):
...     data[i] = {k:(v if v else None) for k,v in row.items()}
...
>>> data
[{1: 2, 3: None}, {4: None, 5: 6}]
And since you are using the DictReader class, you cannot directly change the DictReader object, so you should create a new list and add each changed row to it (or write the rows out with a DictWriter object, which I would prefer) -
Example -
>>> newdata = []
>>> for row in data:
...     newdata.append({k:(v if v else None) for k,v in row.items()})
Upvotes: 5
Reputation: 4196
Your main error is that you are trying to iterate twice over your dictionary whereas you only need to do it once.
Try:
data = {k:(v if v else None) for k,v in data.items()}
without the for-loop.
Upvotes: 0