Reputation: 23677
Given a CSV file:
A,0,0,1,0
B,0,0,1,0
C,0,0,1,0
D,0,0,1,0
E,0,0,1,0
F,0,0,0,1
I'd like to compute the totals for each column. Is there a more pythonic or efficient way to do this than:
import csv
totals = [0]*4
for row in csv.reader(csvfile):
    counts = [int(x) for x in row[-4:]]
    totals = [sum(x) for x in zip(counts, totals)]
print(totals)
Upvotes: 2
Views: 139
Reputation: 1331
Here's a comprehension-based way that does the job without external libraries:
matrix = [[int(i) for i in row[-4:]] for row in csv.reader(csvfile)]
totals = [sum(array[i] for array in matrix) for i in range(4)]
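As a self-contained sketch of the same idea, the snippet below runs on the sample data from the question. The `io.StringIO` object is an assumption standing in for the real `csvfile`, and the column sums are taken by transposing the matrix with `zip(*matrix)` rather than indexing with `range(4)`, so the number of columns does not need to be hard-coded. Like the version above, it keeps every row in memory.

```python
import csv
from io import StringIO

# Sample data from the question; StringIO stands in for an open file (assumption).
csvfile = StringIO(
    "A,0,0,1,0\nB,0,0,1,0\nC,0,0,1,0\nD,0,0,1,0\nE,0,0,1,0\nF,0,0,0,1\n"
)

# One list of ints per row, dropping the leading label column.
matrix = [[int(value) for value in row[1:]] for row in csv.reader(csvfile)]

# zip(*matrix) yields one tuple per column, so sum() totals each column.
totals = [sum(column) for column in zip(*matrix)]
print(totals)  # [0, 0, 5, 1]
```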
Upvotes: 3
Reputation: 140168
Transpose the CSV rows first, skip the label row (which used to be the first column), and sum each remaining row:
cr = zip(*csv.reader(csvfile))
next(cr)
result = [sum(map(int,x)) for x in cr]
print(result)
[0, 0, 5, 1]
Be careful, though: this loads the whole file into memory when the arguments to zip are unpacked.
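If memory is a concern, one possible alternative is to keep only the running totals and consume the reader one row at a time. This is a minimal sketch, not part of the answer above; `column_totals` is a hypothetical helper name, and it assumes the first column holds the label and every other column parses as an integer.

```python
import csv
from io import StringIO


def column_totals(lines):
    """Sum the numeric columns row by row, keeping only running totals in memory.

    Hypothetical helper: assumes column 0 is a label and the rest are integers.
    """
    totals = None
    for row in csv.reader(lines):
        if not row:  # csv.reader yields [] for blank lines; skip them
            continue
        counts = [int(value) for value in row[1:]]
        if totals is None:
            totals = counts
        else:
            totals = [t + c for t, c in zip(totals, counts)]
    return totals or []


# StringIO stands in for an open file (assumption); data is from the question.
sample = StringIO(
    "A,0,0,1,0\nB,0,0,1,0\nC,0,0,1,0\nD,0,0,1,0\nE,0,0,1,0\nF,0,0,0,1\n"
)
print(column_totals(sample))  # [0, 0, 5, 1]
```

Passing an open file object instead of `sample` works the same way, since `csv.reader` accepts any iterable of lines.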
Upvotes: 5
Reputation: 14093
Use pandas
import pandas as pd
df = pd.read_csv('path/to/file.csv', header=None, index_col=0)
df.sum()
Here is a sample using StringIO:
from io import StringIO
import pandas as pd
s = """A,0,0,1,0
B,0,0,1,0
C,0,0,1,0
D,0,0,1,0
E,0,0,1,0
F,0,0,0,1"""
df = pd.read_csv(StringIO(s), header=None, index_col=0)
print(df.sum())
1    0
2    0
3    5
4    1
Upvotes: 1
Reputation: 19414
You can use numpy's genfromtxt to read the file, then slice off the index column and sum the array:
import numpy as np
my_data = np.genfromtxt(csvfile, delimiter=',')
print(my_data[:,1:].sum(axis=0))
Gives:
[0. 0. 5. 1.]
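The totals above come out as floats because genfromtxt parses everything as float by default, which also turns the label column into nan before it is sliced away. A possible variation, sketched here on the question's sample data with `StringIO` standing in for the file (an assumption), is to skip the label column at read time with `usecols` and request integers with `dtype=int`:

```python
from io import StringIO

import numpy as np

# Sample data from the question; StringIO stands in for an open file (assumption).
data = StringIO(
    "A,0,0,1,0\nB,0,0,1,0\nC,0,0,1,0\nD,0,0,1,0\nE,0,0,1,0\nF,0,0,0,1\n"
)

# usecols skips column 0 (the labels); dtype=int keeps the counts as integers.
counts = np.genfromtxt(data, delimiter=",", usecols=(1, 2, 3, 4), dtype=int)

totals = counts.sum(axis=0)
print(totals.tolist())  # [0, 0, 5, 1]
```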
Upvotes: 1