Reputation: 95
I have a folder with several CSV files. These files all contain the box drawing double vertical and horizontal as the delimiter. I am trying to import all these files into python, change that delimiter to a pipe, and then save the new files into another location. The code I currently have runs without any errors but doesn't actually do anything. Any suggestions?
import os
import pandas as pd
directory = 'Y:/Data'
dirlist = os.listdir(directory)
file_dict = {}
x = 0
for filename in dirlist:
if filename.endswith('.csv'):
file_dict[x] = pd.read_csv(filename)
column = file_dict[x].columns[0]
file_dict[x] = file_dict[x][column].str.replace('╬', '|')
file_dict[x].to_csv("python/file{}.csv".format(x))
x += 1
Here is a picture of sample data:
Upvotes: 1
Views: 4594
Reputation: 1073
with i as open(filename):
with o as open(filename+'.new', 'w+):
for line in i.readlines():
o.write(line.replace('╬', '|'))
or, skip the python, and use sed
from your terminal:
$ sed -i 's/╬/|/g' *.csv
Assuming the original delimiter doesn't appear in any escaped strings, this should be slightly faster than using the regular csv
module. Panada seems to do some filesystem voodoo when reading CSVs, so I wouldn't be too surprised if it is just as fast. sed
will almost certainly beat them both by far.
Upvotes: 1
Reputation: 2909
Instead of directly replacing occurrences with the new character (which may replace escaped occurrences of the character as well), we can just use built-in functionality in the csv library to read the file for us, and then write it again
import csv
with open('myfile.csv', newline='') as infile, open('outfile.csv', 'w', newline='') as outfile:
reader = csv.reader(infile, delimiter='╬')
writer = csv.writer(outfile, delimiter='|')
for row in reader:
writer.writerow(row)
Adapted from the docs
Upvotes: 1