How to convert pipe-separated text file to CSV?

Question

I have a big text file and I want to convert it to CSV using Python. My data looks like:

var1|var2|var3|tonumber|fromnumber|var|coding|udh|var|circle|var|var|var|var15

898980d1-6e5b-40f2-a313-c30f08bf0fe6|49A5919EB0D04EDE9B6CEB5AF932EAA3|sbs1|919899980898|HITECH|1|1|0|VODAFONE|Delhi|2015-02-21 12:08:51|5|3|RBA/6724R # Kailash Ram Panwar (PL) # Rz-410/13 Flat No-09 Iiird Floor Tkd Extn Delhi - 110019-110019 # Tgt Skt #  #

How can I convert this file to CSV? I tried:

In [1]: import csv

In [2]: import pandas as pd

In [3]: piperows = []  

f = open("/home/suri/ValueFirst/MT.txt", "rb")

In [6]: readerpipe = csv.reader(f, delimiter = '|')

In [7]: for row in readerpipe: 
   ...:     piperows.append(row)
   ...:     f.close()  
   ...:

And I got the below error:

----------------------------------------------------
ValueError                      Traceback (most recent call last)  
 in ()  
----> 1 for row in readerpipe:  
      2     piperows.append(row)  
      3     f.close()  
      4   

ValueError: I/O operation on closed file

snooze92 · Accepted Answer

As @Martijn Pieters suggested, f.close() should not be indented this way: it is now part of the loop body, so the file is closed after the first row and the next read fails. I would suggest using a with block, which closes the file automatically.

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as f:
    readerpipe = csv.reader(f, delimiter='|')
    piperows = list(readerpipe)
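
To make the indentation problem concrete, here is a minimal sketch that reproduces it. It assumes Python 3 and uses an in-memory io.StringIO with made-up sample rows in place of MT.txt; the function names are only for illustration.

```python
import csv
import io

# Hypothetical stand-in for the contents of MT.txt.
SAMPLE = "var1|var2|var3\na|b|c\nd|e|f\n"


def read_closing_inside_loop(text):
    """Reproduce the bug: close() is indented into the loop body."""
    f = io.StringIO(text)
    rows = []
    try:
        for row in csv.reader(f, delimiter="|"):
            rows.append(row)
            f.close()  # runs after the first row, so the next read fails
    except ValueError as exc:
        return rows, str(exc)
    return rows, None


def read_closing_after_loop(text):
    """Same loop, with close() dedented so it runs once at the end."""
    f = io.StringIO(text)
    rows = []
    for row in csv.reader(f, delimiter="|"):
        rows.append(row)
    f.close()
    return rows
```

The first function returns only the header row together with the error message, while the second returns all three rows. A with block gives you the second behaviour without having to remember the close() call.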

One caveat: this builds a list of all the rows in memory, which might be a bad idea for a big file. Since you are converting the file anyway, you can write the comma-separated version as you read the pipe-separated one.

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        for row in reader_pipe:
            writer_comma.writerow(row)

Edit: @Martijn suggests passing the reader directly to the writer's writerows method. Provided writerows consumes its argument one row at a time, this has the same effect and avoids loading all the rows into memory at once.

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    reader_pipe = csv.reader(file_pipe, delimiter='|')
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        writer_comma = csv.writer(file_comma, delimiter=',')
        writer_comma.writerows(reader_pipe)
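
If you want to check that writerows really pulls rows one at a time, a rough sketch like the following can help. It assumes Python 3; RecordingSink, counted_rows and observe_writerows are hypothetical helpers written for this check, not part of the csv module. The sink relies only on the documented fact that csv.writer accepts any object with a write() method.

```python
import csv


class RecordingSink:
    """File-like object that notes how many rows had been pulled at each write."""

    def __init__(self, counter):
        self.counter = counter
        self.pulled_at_write = []

    def write(self, text):
        self.pulled_at_write.append(self.counter["pulled"])
        return len(text)


def counted_rows(n, counter):
    """Yield n small rows, counting each one as it is requested."""
    for i in range(n):
        counter["pulled"] += 1
        yield [str(i), "x"]


def observe_writerows(n=3):
    """Return the number of rows pulled at the moment of each write."""
    counter = {"pulled": 0}
    sink = RecordingSink(counter)
    csv.writer(sink).writerows(counted_rows(n, counter))
    return sink.pulled_at_write
```

If the first recorded value is 1 rather than n, the writer wrote the first row before asking for the second, which is the streaming behaviour you want for a big file.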

Edit 2: The code is now simple enough that you could inline the reader and writer variables, if you like:

import csv

with open("/home/suri/ValueFirst/MT.txt", "rb") as file_pipe:
    with open("/home/suri/ValueFirst/MT.csv", 'wb') as file_comma:
        csv.writer(file_comma, delimiter=',').writerows(csv.reader(file_pipe, delimiter='|'))
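
The "rb" and "wb" modes above are the Python 2 convention. On Python 3 the csv module works with text-mode files opened with newline="", so a version along these lines should be closer to what you need. The function name pipe_to_csv and the utf-8 default are assumptions for illustration; use whatever encoding your file actually has.

```python
import csv


def pipe_to_csv(src_path, dst_path, encoding="utf-8"):
    """Stream a pipe-separated file into a comma-separated one (Python 3)."""
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="", encoding=encoding) as file_pipe, \
            open(dst_path, "w", newline="", encoding=encoding) as file_comma:
        reader_pipe = csv.reader(file_pipe, delimiter="|")
        writer_comma = csv.writer(file_comma, delimiter=",")
        # Rows are passed from the reader to the writer without building a list.
        writer_comma.writerows(reader_pipe)
```

You would call it as pipe_to_csv("/home/suri/ValueFirst/MT.txt", "/home/suri/ValueFirst/MT.csv"). Fields that contain a comma are quoted by the writer, so free-text columns such as the address in var15 still land in a single CSV column.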
