Mark Jones

Reputation: 11

Reading a CSV file in Python and summing the values in each row

I am trying to solve the following problem:

Complete the function sumRows so that it reads a file of this format and returns a dictionary whose keys specify the names and whose values are the sum of numerical values in the corresponding row. For example, the record above would result in an entry 'dave': 14.0. Empty or non-numerical fields (other than the names in the first column) should be ignored.

My code attempt below just doesn't seem to be working, and I am not sure I fully understand the problem.

def sumRows(filename, header=False):
    d ={}
    with open(filename) as csvfile:
        headerline = csvfile.next()
        total = 0
        for row in csv.reader(csvfile):
            total += int(row[1])
        print(total)

for the csv file

rows1

dave,3,5,6
tina,12,3,5


Answers (1)

abarnert

Reputation: 365707

The first problem in your code is this:

headerline = csvfile.next()

Iterators (files, CSV readers, etc.) in Python don't have a next method.* There's a next function that takes an iterator as an argument, like this:

headerline = next(csvfile)
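To see what that call does in isolation, here is a minimal sketch. The in-memory file and its header line are made up for illustration (the sample file in the question has no header); the point is only that next consumes one line and the CSV reader then picks up from the line after it.

```python
import csv
import io

# Hypothetical in-memory stand-in for a CSV file that starts with a header line.
csvfile = io.StringIO("name,a,b,c\ndave,3,5,6\ntina,12,3,5\n")

headerline = next(csvfile)         # consumes the first line only
rows = list(csv.reader(csvfile))   # the reader sees just the remaining lines
```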

If I fix that, your code prints out the sum of all the values in the second column.

But you're supposed to be summing rows, not columns.

To fix that, you need to iterate the columns of each row:

    for row in csv.reader(csvfile):
        rowtotal = 0
        for column in row[1:]:
            rowtotal += int(column)
        print(row[0], rowtotal)    

Now we're getting closer, but you've still got four problems you'll need to fix.

  • "Empty or non-numerical fields … should be ignored", but your code doesn't do that; it raises a ValueError. So you need to try to convert each column to an int and handle the possible ValueError in some appropriate way.

  • The question asks about "numbers", not "integers", and it gives 14.0 as an example. So int probably isn't the right type here. You may want float or decimal.Decimal. See Numbers in the tutorial for more.

  • You're not just supposed to print out each name and row sum, you're supposed to stick them in a dictionary and return that dictionary. You put something in a dictionary by doing d[key] = value, so hopefully you can figure out how to put the names and row sums into your d. And then just return d at the end.

  • That header=False parameter must be there for some reason. My guess is that you're supposed to use it to let the caller specify whether there's a header line to skip over, instead of you just always skipping over the header line no matter what. So, you'll need an if header: somewhere.
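Putting the four points together, one possible version might look like the sketch below. It rests on a few assumptions the assignment doesn't spell out: float is the number type, header=True means "skip the first line", completely blank lines are skipped, and a name that appears twice keeps only its last row. Note that float also accepts strings like "nan" and "inf", which you may or may not want to count as numerical.

```python
import csv

def sumRows(filename, header=False):
    """Return {name: sum of the numeric fields in that row}."""
    d = {}
    with open(filename, newline="") as csvfile:
        if header:
            next(csvfile, None)  # skip the header line only when asked to
        for row in csv.reader(csvfile):
            if not row:
                continue  # assumption: completely blank lines are skipped
            rowtotal = 0.0
            for column in row[1:]:
                try:
                    rowtotal += float(column)
                except ValueError:
                    pass  # empty or non-numerical field: ignore it
            d[row[0]] = rowtotal
    return d
```

For the two sample rows in the question, this should give {'dave': 14.0, 'tina': 20.0}.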


* This is only true for Python 3.x, but that's what you seem to be using. If you're not, you're probably using 2.7, where iterators do have a next method, but you're still not supposed to call it.
