Reputation: 165
I am a complete newbie to programming, and this is the first real program I am trying to write.
I have a huge CSV file (hundreds of columns and thousands of rows) from which I am trying to extract only a few columns, based on the value in one field. The logic works fine and I get nice output, but the problem arises when I try to encapsulate the same logic in a function: it returns only the first extracted row, whereas print works fine.
I have been playing with this for hours and have read other examples here, and now my mind is mush.
import csv
import sys
newlogfile = csv.reader(open(sys.argv[1], 'rb'))
outLog = csv.writer(open('extracted.csv', 'w'))
def rowExtractor(logfile):
    for row in logfile:
        if row[32] == 'No':
            a = []
            a.append(row[44])
            a.append(row[58])
            a.append(row[83])
            a.append(row[32])
            return a
outLog.writerow(rowExtractor(newlogfile))
Upvotes: 1
Views: 121
Reputation: 12806
You are exiting prematurely. Because return a is inside the for loop, the function returns the first time that line is reached, so only the first matching row is ever extracted and the remaining rows are never examined.
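To see the early exit in isolation, here is a minimal sketch on made-up in-memory rows. The function name first_match_only, the sample data, and the visited list are all hypothetical and exist only for illustration.

```python
def first_match_only(rows):
    # Track which rows the loop actually looked at.
    visited = []
    for row in rows:
        visited.append(row[0])
        if row[1] == 'No':
            # return ends the whole function, not just this iteration
            return row, visited
    # Reached only if no row matched.
    return None, visited

# Hypothetical sample data: rows 'b' and 'c' both match.
sample_rows = [['a', 'Yes'], ['b', 'No'], ['c', 'No']]
match, visited = first_match_only(sample_rows)
print(match)    # ['b', 'No']
print(visited)  # ['a', 'b']
```

Row 'c' also matches, but the loop never gets that far.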
A simple fix is to collect the matching rows in a list and return that list once the loop has finished:
def rowExtractor(logfile):
    # output holds all of the extracted rows
    output = []
    for row in logfile:
        if row[32] == 'No':
            a = []
            a.append(row[44])
            a.append(row[58])
            a.append(row[83])
            a.append(row[32])
            output.append(a)
    # notice that the return statement is outside of the for loop
    return output
outLog.writerows(rowExtractor(newlogfile))
You could also consider using yield, which turns the function into a generator.
Upvotes: 1
Reputation: 283043
You've got a return statement in your function. When execution hits that line, the function returns, which terminates your loop. You'd need yield instead.
See What does the "yield" keyword do in Python?
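A rough sketch of what the yield version might look like, run against a small in-memory sample rather than the real log file. The name row_extractor, the flag_col and keep_cols parameters, and the sample rows are assumptions made for illustration; the defaults mirror the column indexes from the question.

```python
import csv
import io

def row_extractor(rows, flag_col=32, keep_cols=(44, 58, 83, 32)):
    """Yield the selected columns of every row whose flag column is 'No'."""
    for row in rows:
        if row[flag_col] == 'No':
            # yield hands back one row and pauses here until the next is requested
            yield [row[i] for i in keep_cols]

# Hypothetical three-column sample so the example runs without a real CSV file.
sample_rows = [
    ['a', 'No', 'x'],
    ['b', 'Yes', 'y'],
    ['c', 'No', 'z'],
]

# writerows accepts any iterable of rows, so it can consume the generator directly.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(row_extractor(sample_rows, flag_col=1, keep_cols=(0, 2, 1)))
print(buffer.getvalue())
```

With the real file, the same generator could be passed to outLog.writerows(...) in place of the single writerow call. Rows are then written as they are produced instead of being held in one big list first.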
Upvotes: 1