Reputation: 1171
I have a big text file and I want to convert it to CSV using Python. My data looks like:
var1|var2|var3|tonumber|fromnumber|var|coding|udh|var|circle|var|var|var|var15
898980d1-6e5b-40f2-a313-c30f08bf0fe6|49A5919EB0D04EDE9B6CEB5AF932EAA3|sbs1|919899980898|HITECH|1|1|0|VODAFONE|Delhi|2015-02-21 12:08:51|5|3|RBA/6724R # Kailash Ram Panwar (PL) # Rz-410/13 Flat No-09 Iiird Floor Tkd Extn Delhi - 110019-110019 # Tgt Skt # #
How can I convert this file to CSV? I tried:
In [1]: import csv
In [2]: import pandas as pd
In [3]: piperows = []
f = open("/home/suri/ValueFirst/MT.txt", "rb")
In [6]: readerpipe = csv.reader(f, delimiter = '|')
In [7]: for row in readerpipe:
   ...:     piperows.append(row)
   ...:     f.close()
   ...:
I got the following error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-842b0d42f436> in <module>()
----> 1 for row in readerpipe:
      2     piperows.append(row)
      3     f.close()
      4

ValueError: I/O operation on closed file
Upvotes: 4
Views: 14806
Reputation: 4228
As @Martijn Pieters suggested, you should not have indented f.close() that way, because it is now part of the loop and closes the file on the first iteration. I would suggest using a with block, which takes care of closing the file automatically.
import csv
with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter='|')
    piperows = list(readerpipe)
One catch here is that this builds a big list of all the rows in memory, which might be a bad idea for large files. Since you are converting the file anyway, you can write the comma-separated version as you read the pipe-separated one:
import csv
with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        for row in reader_pipe:
            writer_comma.writerow(row)
Edit: @Martijn suggests passing the reader directly to the writer's writerows method. If writerows is implemented correctly, it has the same effect and avoids loading all the rows into memory at once.
import csv
with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        writer_comma.writerows(reader_pipe)
Edit 2: The code becomes simple enough that you could inline the reader and writer variables, if you like:
import csv
with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
csv.writer(file_comma, delimiter=',').writerows(csv.reader(file_pipe, delimiter='|'))
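Note that the "rb"/"wb" modes above assume Python 2's csv module. If you happen to be on Python 3, the csv module expects text-mode files opened with newline=''; a rough equivalent of the last snippet, under that assumption, would be:
import csv

with open("/home/suri/ValueFirst/MT.txt", newline='') as file_pipe:
    with open("/home/suri/ValueFirst/MT.csv", 'w', newline='') as file_comma:
        # newline='' lets the csv module handle line endings itself
        csv.writer(file_comma, delimiter=',').writerows(csv.reader(file_pipe, delimiter='|'))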
Upvotes: 8
Reputation: 1121484
You are closing the file after reading the first row:
for row in readerpipe:
    piperows.append(row)
    f.close()
Remove the f.close() line from the loop.
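For example, keeping the rest of your code as-is, dedent f.close() so it only runs once the loop has finished:
piperows = []
f = open("/home/suri/ValueFirst/MT.txt", "rb")
readerpipe = csv.reader(f, delimiter='|')
for row in readerpipe:
    piperows.append(row)
f.close()  # runs once, after all rows have been read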
Better still, use the file as a context manager so that it is closed automatically for you. You can simply call list() on the reader to produce your output list:
with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter='|')
    piperows = list(readerpipe)
but to convert your file, you can pass readerpipe straight into a writer.writerows() call:
with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
readerpipe = csv.reader(f, delimiter = '|')
with open("/home/suri/ValueFirst/MT.txt", "wb") as outputfile:
writer = csv.writer(outputfile)
writer.writerows(readerpipe)
Upvotes: 1