Reputation: 332

Python: Effective reading from a file using csv module

I recently started learning the csv module. Suppose we have this CSV file:

John,Jeff,Judy,
21,19,32,
178,182,169,
85,74,57,

We want to read this file and create a dictionary with the names as keys and the total of each column as values. In this case we would end up with:

d = {"John" : 284, "Jeff" : 275, "Judy" : 258}

So I wrote this code, which apparently works, but I am not satisfied with it and was wondering if anyone knows of a better or more efficient/elegant way of doing this, because there are just too many lines in there :D (Or maybe a way we could generalize it a bit, i.e. for when we do not know how many fields there are.)

d = {}
import csv
with open("file.csv") as f:
    readObject = csv.reader(f)

    totals0 = 0
    totals1 = 0
    totals2 = 0
    totals3 = 0

    currentRowTotal = 0
    for row in readObject:
        currentRowTotal += 1
        if currentRowTotal == 1:
            continue

        totals0 += int(row[0])
        totals1 += int(row[1])
        totals2 += int(row[2])
        if row[3] == "":
            totals3 += 0

f.close()

with open("file.csv") as f:
    readObject = csv.reader(f)
    currentRow = 0
    for row in readObject:
        while currentRow <= 0:
            d.update({row[0] : totals0})
            d.update({row[1] : totals1})
            d.update({row[2] : totals2})
            d.update({row[3] : totals3})
            currentRow += 1
print(d)
f.close()

Thanks very much for any answer :)

Upvotes: 1

Answers (4)

Shawn Zhang

Reputation: 1884

Based on Michasel's solution, I would try with less code, fewer variables, and no dependency on NumPy:

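import csv
from functools import reduce  # reduce is not a builtin in Python 3

with open("so.csv") as f:
  reader = csv.reader(f)
  titles = next(reader)
  # Assumes rows have no trailing empty field, since int('') raises ValueError
  sum_result = reduce(lambda x, y: [int(a) + int(b) for a, b in zip(x, y)], list(reader))

  print(dict(zip(titles, sum_result)))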

Upvotes: 0

Marcin

Reputation: 238269

Not sure if you can use pandas, but you can get your dict as follows:

import pandas as pd
df = pd.read_csv('data.csv')
print(dict(df.sum()))

Gives:

{'Jeff': 275, 'Judy': 258, 'John': 284}

Upvotes: 3

Michael Laszlo

Reputation: 12239

Use the top row to figure out what the column headings are. Initialize a dictionary of totals based on the headings.

import csv

with open("file.csv") as f:
  reader = csv.reader(f)

  titles = next(reader)
  while titles[-1] == '':
    titles.pop()
  num_titles = len(titles)      
  totals = { title: 0 for title in titles }

  for row in reader:
    for i in range(num_titles):
      totals[titles[i]] += int(row[i])

print(totals)

Let me add that you don't have to close the file after the with block. The whole point of with is that it takes care of closing the file.
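A quick way to check this yourself, as a minimal sketch (the temporary file is only a stand-in for file.csv and is an assumption for illustration, not part of your code):

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained
# (hypothetical stand-in for file.csv).
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

with open(path, "w") as f:
    f.write("John,Jeff,Judy\n")
    inside = f.closed   # still open while we are inside the block

after = f.closed        # the with block has closed the file by now
print(inside, after)

os.remove(path)
```

The `closed` attribute is False inside the block and True once the block has been left, so an extra f.close() afterwards does nothing useful.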

Also, let me mention that the data you posted appears to have four columns:

John,Jeff,Judy,
21,19,32,
178,182,169,
85,74,57,

That's why I did this:

  while titles[-1] == '':
    titles.pop()
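To see where the fourth column comes from, here is a small sketch that parses the first two lines of the posted data from memory (io.StringIO stands in for the real file; that substitution is my assumption for the sake of a self-contained example):

```python
import csv
import io

# In-memory copy of the first two lines of the posted data.
sample = io.StringIO("John,Jeff,Judy,\n21,19,32,\n")
reader = csv.reader(sample)

titles = next(reader)
raw_titles = list(titles)   # keep a copy of the header row as parsed
print(raw_titles)           # the trailing comma yields an empty fourth field

# Drop empty headings from the end, as in the code above.
while titles and titles[-1] == '':
    titles.pop()
print(titles)
```

The extra `titles and` guard is not in the code above; it only protects against a header row that consists entirely of empty fields.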

Upvotes: 0

MikeyB

Reputation: 3350

It's a little dirty, but try this (operating without the empty last column):

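#!/usr/bin/python

import csv
from functools import reduce  # reduce is not a builtin in Python 3

import numpy

with open("file.csv") as f:
    reader = csv.reader(f)
    headers = next(reader)

    # list(map(...)) is needed in Python 3, where map returns an iterator
    rows = [list(map(int, x)) for x in reader]
    # Add the rows element-wise, starting from a row of zeros
    sums = reduce(numpy.add, rows, [0]*len(headers))
    for name, total in zip(headers,sums):
        print("{}'s total is {}".format(name,total))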

Upvotes: 0
