surendra

Reputation: 1171

How to convert pipe-separated text file to CSV?

I have a big text file and I want to convert it to CSV using Python. My data looks like this:

var1|var2|var3|tonumber|fromnumber|var|coding|udh|var|circle|var|var|var|var15

898980d1-6e5b-40f2-a313-c30f08bf0fe6|49A5919EB0D04EDE9B6CEB5AF932EAA3|sbs1|919899980898|HITECH|1|1|0|VODAFONE|Delhi|2015-02-21 12:08:51|5|3|RBA/6724R # Kailash Ram Panwar (PL) # Rz-410/13 Flat No-09 Iiird Floor Tkd Extn Delhi - 110019-110019 # Tgt Skt #  #

How can I convert this file to CSV? I tried:

In [1]: import csv

In [2]: import pandas as pd

In [3]: piperows = []  

f = open("/home/suri/ValueFirst/MT.txt", "rb")

In [6]: readerpipe = csv.reader(f, delimiter = '|')

In [7]: for row in readerpipe: 
   ...:     piperows.append(row)
   ...:     f.close()  
   ...:  

And I got the below error:

----------------------------------------------------
ValueError                      Traceback (most recent call last)  
<ipython-input-7-842b0d42f436> in <module>()  
----> 1 for row in readerpipe:  
      2     piperows.append(row)  
      3     f.close()  
      4   

ValueError: I/O operation on closed file  

Upvotes: 4

Views: 14806

Answers (2)

snooze92

Reputation: 4228

As @Martijn Pieters pointed out, f.close() should not be indented this way, because the indentation makes it part of the loop. I would suggest using a with block, which closes the file automatically.

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter='|')
    piperows = list(readerpipe)

Note that this builds one big list of all the rows, which may use a lot of memory if the file is large. Since you are converting the file, you can instead write the comma-separated version as you read the pipe-separated one.

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        for row in reader_pipe:
            writer_comma.writerow(row)

Edit: @Martijn suggests passing the reader directly to the writer's writerows method. Since writerows only needs to iterate over its argument, this should have the same effect as the loop and avoid loading all the rows into memory at once.

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        writer_comma.writerows(reader_pipe)

Edit 2: The code is now simple enough that you could inline the reader and writer variables, if you like:

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        csv.writer(file_comma, delimiter=',').writerows(csv.reader(file_pipe, delimiter='|'))
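
The snippets above open the files in binary mode ("rb" / "wb"), which is what the csv module expects on Python 2. On Python 3 the csv module works on text-mode files, opened with newline="". A minimal sketch of the same streaming conversion for Python 3 follows; the function name pipe_to_csv and its parameters are illustrative and not part of the original answer.

```python
import csv


def pipe_to_csv(src_path, dst_path):
    """Stream a pipe-separated file into a comma-separated one (Python 3)."""
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="") as file_pipe, \
            open(dst_path, "w", newline="") as file_comma:
        reader = csv.reader(file_pipe, delimiter="|")
        writer = csv.writer(file_comma)  # comma is the default delimiter
        # Rows are pulled from the reader one at a time while writing.
        writer.writerows(reader)
```

It would be called as pipe_to_csv("/home/suri/ValueFirst/MT.txt", "/home/suri/ValueFirst/MT.csv"). The files are opened with the platform's default encoding here; pass encoding=... to open() if the data needs something else.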

Upvotes: 8

Martijn Pieters

Reputation: 1121484

You are closing the file after reading the first row:

for row in readerpipe: 
    piperows.append(row)
    f.close()  

Remove the f.close() line from the loop.
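
To see why the indentation matters, here is a small self-contained sketch (Python 3) that reproduces the problem on made-up in-memory data instead of the real file. The function names and the two-row sample are assumptions for illustration only.

```python
import csv
import io

DATA = "a|b\n1|2\n"  # made-up two-row sample, not the real file


def read_rows_close_inside_loop(text):
    f = io.StringIO(text)
    rows = []
    for row in csv.reader(f, delimiter="|"):
        rows.append(row)
        f.close()  # runs after the first row; the next read then fails
    return rows


def read_rows_close_after_loop(text):
    f = io.StringIO(text)
    rows = []
    for row in csv.reader(f, delimiter="|"):
        rows.append(row)
    f.close()  # runs once, after every row has been read
    return rows
```

Calling read_rows_close_inside_loop(DATA) raises ValueError when the reader tries to fetch the second row from the closed file, while read_rows_close_after_loop(DATA) returns both rows.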

Better still, use the file as a context manager so that it is closed automatically for you. You can simply call list() on the reader to produce your output list:

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter = '|')
    piperows = list(readerpipe)

To convert your file, however, you can pass readerpipe straight into a writer.writerows() call:

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter = '|')
    with open("/home/suri/ValueFirst/MT.csv", "wb") as outputfile:
        writer = csv.writer(outputfile)
        writer.writerows(readerpipe)
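
One reason to go through csv.writer rather than replacing every "|" with "," in the raw text is that the writer quotes any field that itself contains a comma. A minimal in-memory sketch for Python 3 follows; convert_text and the sample row are made up for illustration.

```python
import csv
import io


def convert_text(pipe_text):
    """Convert pipe-separated text to comma-separated text in memory."""
    out = io.StringIO()
    reader = csv.reader(io.StringIO(pipe_text), delimiter="|")
    csv.writer(out).writerows(reader)
    return out.getvalue()


sample = "id|address\n1|Flat 9, Third Floor\n"  # made-up row containing a comma
print(convert_text(sample))
```

In the output, the address field is wrapped in double quotes, so a CSV parser still sees two columns in that row.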

Upvotes: 1
