Reputation: 23677

Sum each column in csv

Given a CSV file:

A,0,0,1,0
B,0,0,1,0
C,0,0,1,0
D,0,0,1,0
E,0,0,1,0
F,0,0,0,1

I'd like to compute the totals for each column. Is there a more pythonic or efficient way to do this than:

import csv

totals = [0]*4

for row in csv.reader(csvfile):
    counts = [ int(x) for x in row[-4:] ]
    totals = [ sum(x) for x in zip(counts, totals) ]
print(totals)

Upvotes: 2

Answers (4)

Max Shouman

Reputation: 1331

Here's a comprehension-based way to do the job without external libraries:

matrix = [[int(i) for i in row[-4:]] for row in csv.reader(csvfile)]

totals = [sum(array[i] for array in matrix) for i in range(4)]
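For anyone who wants to run this directly, here is a minimal sketch of the same idea on the sample data from the question. The `sample` variable and `io.StringIO` are assumptions standing in for an open file, and `row[1:]` with `len(matrix[0])` replace the hard-coded 4 so the column count comes from the data (this assumes the file is non-empty and every row has the same width).

```python
import csv
import io

# Stand-in for an open file object; the data is the sample from the question.
sample = io.StringIO(
    "A,0,0,1,0\n"
    "B,0,0,1,0\n"
    "C,0,0,1,0\n"
    "D,0,0,1,0\n"
    "E,0,0,1,0\n"
    "F,0,0,0,1\n"
)

# Drop the label in column 0 and convert the remaining fields to int.
matrix = [[int(value) for value in row[1:]] for row in csv.reader(sample)]

# Sum position i across all rows, for every numeric column.
totals = [sum(row[i] for row in matrix) for i in range(len(matrix[0]))]
print(totals)  # [0, 0, 5, 1]
```

Like the original two-liner, this keeps the whole matrix in memory before summing.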

Upvotes: 3

Jean-François Fabre

Reputation: 140168

Transpose the CSV file first, skip what is now the title row, and compute the sum of each remaining row:

cr = zip(*csv.reader(csvfile))
next(cr)

result = [sum(map(int,x)) for x in cr]
print(result)

[0, 0, 5, 1]

Be careful, though: expanding the arguments for zip loads the whole file into memory.
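If the memory caveat matters, a row-by-row accumulator avoids holding the file in memory. This is only a sketch and is essentially the loop from the question wrapped in a function; the name `column_totals` is hypothetical, and it assumes the first column is a label and every other field parses as an int.

```python
import csv
import io


def column_totals(lines):
    """Sum each numeric column while reading one row at a time."""
    totals = None
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        values = [int(field) for field in row[1:]]  # column 0 is the label
        if totals is None:
            totals = values  # first row seeds the running totals
        else:
            totals = [t + v for t, v in zip(totals, values)]
    return totals or []


# Usage with an in-memory stand-in for the file from the question.
sample = io.StringIO("A,0,0,1,0\nB,0,0,1,0\nF,0,0,0,1\n")
print(column_totals(sample))  # [0, 0, 2, 1]
```

Only the current row and the running totals are alive at any time, so memory use does not grow with the number of rows.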

Upvotes: 5

It_is_Chris

Reputation: 14093

Use pandas:

import pandas as pd

df = pd.read_csv('path/to/file.csv', header=None, index_col=0)
df.sum()

Here is a sample using StringIO:

from io import StringIO
import pandas as pd

s = """A,0,0,1,0
B,0,0,1,0
C,0,0,1,0
D,0,0,1,0
E,0,0,1,0
F,0,0,0,1"""

df = pd.read_csv(StringIO(s), header=None, index_col=0)
print(df.sum())

1    0
2    0
3    5
4    1
dtype: int64

Upvotes: 1

Tomerikoo

Reputation: 19414

You can use numpy's genfromtxt to read the file, then slice the index column and sum the array:

import numpy as np

my_data = np.genfromtxt(csvfile, delimiter=',')
print(my_data[:,1:].sum(axis=0))

Gives:

[0. 0. 5. 1.]

Upvotes: 1
