Reputation: 425
I am trying to sum the values in each row of a CSV file, using the first column of the row as the key. The results should be stored in a Python dictionary.
I have come up with this code so far. The only problem is that not all values are integers: some are blank and some contain strings. I need to update the code to ignore these.
An obvious improvement would be to determine how many columns the file has instead of assuming it has exactly three columns of data, but I'm not quite sure how to implement this!
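import csv

def sum_rows(filename, header=True):
    d = {}
    with open(filename) as csvfile:
        rdr = csv.reader(csvfile)
        if header:
            next(rdr, None)  # skip the header row
        for row in rdr:
            # raises ValueError on blank or non-numeric cells
            d[row[0]] = int(row[1]) + int(row[2]) + int(row[3])
    return d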
I appreciate any help!
Upvotes: 1
Views: 1604
Reputation: 2984
Take a look at NumPy, it makes life a lot easier:
from numpy import genfromtxt
import numpy as np

my_data = genfromtxt('my_file.csv', delimiter=',', dtype=str)
d = {}
for i in my_data:
    subset = i[1:]  # everything from index 1 to the end of the row
    subset[subset == ''] = '0'  # replace empty cells with zero
    d[i[0]] = np.sum(subset.astype(float))  # raises ValueError on non-numeric text
Upvotes: 2
Reputation: 180401
Use a try/except, casting each element to float:
import csv
from collections import defaultdict

with open(filename) as csvfile:
    next(csvfile)  # skip the header line
    rdr = csv.reader(csvfile)
    d = defaultdict(float)
    for row in rdr:
        for v in row[1:]:
            try:
                d[row[0]] += float(v)
            except ValueError:
                pass  # blank or non-numeric cell, ignore it
print(d)
If the value can be cast to float, the key's value is incremented; if not, we catch the error and move on. Because the loop runs over row[1:], it works for any number of columns.
Input:
a,b,c,d
1,"foo",3,""
2,5,"fuzz",12.12
3,"","bar",33.3
Output:
defaultdict(<class 'float'>, {'1': 3.0, '2': 17.119999999999997, '3': 33.3})
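If you want this as a reusable function that returns a plain dict, something along these lines should work. It is only a sketch: the name sum_rows and the header parameter are my own choices, and it takes any iterable of lines (an open file works) rather than a filename so that it is easy to try out with io.StringIO.

```python
import csv
import io


def sum_rows(lines, header=True):
    """Sum the numeric cells after the first column of each CSV row."""
    rdr = csv.reader(lines)
    if header:
        next(rdr, None)  # skip the header row, if there is one
    totals = {}
    for row in rdr:
        if not row:
            continue  # ignore completely empty lines
        key, values = row[0], row[1:]  # works for any number of columns
        total = totals.get(key, 0.0)
        for v in values:
            try:
                total += float(v)
            except ValueError:
                pass  # blank or non-numeric cell, skip it
        totals[key] = total
    return totals


# Same sample data as the input shown above
sample = io.StringIO('a,b,c,d\n1,"foo",3,""\n2,5,"fuzz",12.12\n3,"","bar",33.3\n')
result = sum_rows(sample)
print(result)
```

Unlike the defaultdict version, a key whose row has no numeric cells at all still ends up in the result with 0.0. Also keep in mind that float() accepts strings such as "nan" and "inf", so filter those out separately if they can occur in your data.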
Upvotes: 2