b1ackzer0

Reputation: 165

Trying to write a Python CSV extractor

I am a complete newbie to programming, and this is the first real program I am trying to write.

I have a huge CSV file (hundreds of columns and thousands of rows) from which I want to extract only a few columns, based on the value in one field. That works fine and I get nice output, but the problem arises when I try to encapsulate the same logic in a function. The function returns only the first extracted row, whereas printing the rows works fine.

I have been playing with this for hours and have read other examples here, and now my mind is mush.

import csv
import sys

newlogfile = csv.reader(open(sys.argv[1], 'rb'))
outLog = csv.writer(open('extracted.csv', 'w'))

def rowExtractor(logfile):
    for row in logfile:
        if row[32] == 'No':
            a = []
            a.append(row[44])
            a.append(row[58])
            a.append(row[83])
            a.append(row[32])
            return a

outLog.writerow(rowExtractor(newlogfile))

Upvotes: 1

Views: 121

Answers (2)

dave

Reputation: 12806

You are exiting prematurely. Because return a is inside the for loop, the function returns as soon as it reaches the first matching row, so the loop never gets to the remaining rows.

A simple way to do this would be to do:

def rowExtractor(logfile):
    #output holds all of the rows
    output = []
    for row in logfile:
        if row[32] == 'No':
            a = []
            a.append(row[44])
            a.append(row[58])
            a.append(row[83])
            a.append(row[32])
            output.append(a)
    #notice that the return statement is outside of the for-loop
    return output
outLog.writerows(rowExtractor(newlogfile))

You could also consider using yield
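As a rough sketch of the yield approach, the function can hand back each matching row as it finds it instead of building a list first. The column indices are the ones from the question; the row_extractor name, the make_row helper and the in-memory sample data are made up for illustration, and the code assumes Python 3.

```python
import csv
import io


def row_extractor(rows):
    """Yield the selected columns of every row whose column 32 is 'No'."""
    for row in rows:
        if row[32] == 'No':
            # yield hands one row back and then resumes the loop,
            # unlike return, which would end the function here
            yield [row[44], row[58], row[83], row[32]]


def make_row(flag, tag):
    """Hypothetical helper: build an 84-column row with a few known values."""
    row = [''] * 84
    row[32] = flag
    row[44] = tag + '-a'
    row[58] = tag + '-b'
    row[83] = tag + '-c'
    return row


# In-memory stand-in for the real log file (an assumption for this sketch)
source = io.StringIO()
csv.writer(source).writerows(
    [make_row('No', 'r1'), make_row('Yes', 'r2'), make_row('No', 'r3')]
)
source.seek(0)

# writerows accepts any iterable of rows, so it can consume the generator
extracted = io.StringIO()
csv.writer(extracted).writerows(row_extractor(csv.reader(source)))
result = extracted.getvalue()
```

With a generator, only one row needs to be held in memory at a time, which may matter for a file with thousands of rows.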

Upvotes: 1

mpen

Reputation: 283043

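You've got a return statement in your function. When execution hits that line, the function returns, which terminates your loop. Use yield instead, so the function becomes a generator that produces every matching row, and pass the result to writerows rather than writerow.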

See What does the "yield" keyword do in Python?
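To see the difference in isolation, here is a minimal sketch that has nothing to do with CSV files. The function names and the sample list are invented for the example.

```python
def first_match(values):
    for value in values:
        if value % 2 == 0:
            return value  # leaves the function at the first even number


def all_matches(values):
    for value in values:
        if value % 2 == 0:
            yield value  # hands back a value, then the loop carries on


numbers = [1, 2, 3, 4, 6]
print(first_match(numbers))        # 2
print(list(all_matches(numbers)))  # [2, 4, 6]
```

The bodies are identical apart from the keyword, which is why swapping return for yield in the question's function is enough to get every matching row.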

Upvotes: 1
