Reputation: 57
System info:
Python 2.7.2
Mac OS X 10.7.2
Problem (+background):
I have a large '.csv' file (~1 GB) which needs some minor editing. Every value in the 5th column needs to be 5 characters long (some are 4 characters long, and need a '0' placed in front of them). The code (shown below) reports no errors when run, but stops writing with approximately 100 lines of the file left (thereby losing some crucial data!). Does anyone know why this is happening?
I've re-created 'read_file.csv' and inspected it, but I don't see anything out of place. The output always stops at the same location, and I don't understand why.
import csv
path = '/Volumes/.../'
r = csv.reader(open(path + 'read_file.csv','rU'))
f = open(path + 'write_file.csv', 'wb')
writer = csv.writer(f)
for line in r:
    if len(line[5]) == 4:
        line[5] = '0' + line[5]
    writer.writerow((line[0],line[1],line[2],line[3],line[4],line[5],line[6],line[7]))
Upvotes: 1
Views: 1303
Reputation: 3107
Ensure that the file is properly closed; the with statement makes it easy.
import csv
with open('test.csv', 'rU') as inp:
    csvin=csv.reader(inp)
    with open('output.csv', 'wb') as outp:
        csvout=csv.writer(outp)
        for line in csvin:
            # left-pad the 5th column (index 4) with zeros to 5 characters
            csvout.writerow(line[:4] + [line[4].rjust(5, '0')] + line[5:])
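To see why an unclosed file can be missing its tail, here is a minimal sketch of output buffering. It assumes Python 3 and CPython's default buffering for regular files; the scratch file and the variable names are made up for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, tmp_path = tempfile.mkstemp(suffix='.csv')
os.close(fd)

f = open(tmp_path, 'w')
f.write('a,b,c\n')                        # a small write sits in the in-memory buffer
size_before = os.path.getsize(tmp_path)   # typically 0: nothing has reached the disk yet
f.close()                                 # close() flushes the buffer to disk
size_after = os.path.getsize(tmp_path)    # now the row is in the file

os.remove(tmp_path)
```

If the script ends, or the file is inspected, before close() or flush() runs, whatever is still sitting in that buffer is what appears to be "missing" from the end of the output.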
Upvotes: 0
Reputation: 3956
Either close the output file after writing it, or write the output inside a with block, which will always close the file even if an error occurs:
with open(path + 'write_file.csv', 'wb') as f:
    writer = csv.writer(f)
    for line in r:
        ...
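For reference, a self-contained version of this idea might look like the sketch below. It assumes Python 3, where csv files are opened in text mode with newline='' (on Python 2.7 you would use 'rb' and 'wb' instead). The function name pad_column and its parameters are hypothetical; col=5 mirrors the line[5] index used in the question.

```python
import csv

def pad_column(src_path, dst_path, col=5, width=5):
    """Copy src_path to dst_path, left-padding field `col` with zeros."""
    # Both files are managed by the with statement, so they are
    # flushed and closed even if an exception is raised mid-loop.
    with open(src_path, newline='') as src, \
         open(dst_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if len(row) > col:              # skip short or blank rows
                row[col] = row[col].rjust(width, '0')
            writer.writerow(row)            # write every column, not a fixed subset
```

Writing the whole row rather than a fixed set of indices also avoids silently dropping any columns beyond the eighth.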
Upvotes: 1
Reputation: 12339
Things to check:
Are you examining the output after your code has exited, so you know the file has been .close()d or .flush()ed?
Is it possible you have something odd in your data on that line that makes it think the rest of the file is data in a field?
You're only saving a set number of columns of your line; you might try writer.writerow(line) instead...
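The second check can be sketched roughly as follows, assuming Python 3. The sample text is made up: its second line contains an unbalanced double quote, so the csv reader treats everything after it as part of one field. Comparing the number of parsed rows with the number of physical lines is one way to spot this.

```python
import csv
import io

# Made-up sample data; line 2 has an opening quote that is never closed.
sample = 'a,b,c\n1,"2,3\n4,5,6\n7,8,9\n'

rows = list(csv.reader(io.StringIO(sample, newline='')))
physical_lines = sample.count('\n')

# Rows whose fields contain a newline have swallowed later lines.
swallowed = [row for row in rows
             if any('\n' in field for field in row)]

print(len(rows), 'rows parsed from', physical_lines, 'lines')
```

If the parsed row count is far lower than the line count of the input, a stray quote near the point where the output stops is a likely suspect.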
Upvotes: 0