Reputation: 715
I am new to python and just need a small help.
We have a Pipe delimited CSV file which looks like this
DATE|20160101
ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |
10 | A... | 210 W.. | Mo.. | AL... | '31.. | 334.. | '01023 | 10063 |
20 | B... | 240 N.. | Ne.. | Ut... | '21.. | 335.. | '01024 | 10064 |
Every value of Zip and OPEID columns has apostrophes in the beginning
So we wish to create a new CSV file where apostrophes are removed from each value of these 2 columns.
The new file should then look like this:
DATE|20160101
ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |
10 | A... | 210 W.. | Mo.. | AL... | 31.. | 334.. | 01023 | 10063 |
20 | B... | 240 N.. | Ne.. | Ut... | 21.. | 335.. | 01024 | 10064 |
This code works for copying data without removing apostrophes
import os
import csv
file1 = "D:\CSV\File1.csv"
with open(file1, 'rb') as csvfile:
reader = csv.reader(csvfile, delimiter = '|')
path = "D:/CSV/New"
if not os.path.exists(path):
os.makedirs(path)
writer = csv.writer(open(path+"File2"+".csv", 'wb'), delimiter = '|')
for row in reader:
writer.writerow(row)
csvfile.close()
Upvotes: 3
Views: 12209
Reputation: 1034
It worked for me... Try this.
res=[]
with open('hi.csv') as f:
content=csv.reader(f,delimiter='|')
for row in content:
for str in range (len(row)):
row[str]=row[str].replace('\'','')
res.append(row)
f.close()
with open('hi.csv','wb') as ff: # python 3 => 'wb' => 'w',newline=''
sw=csv.writer(ff,delimiter='|',quoting=csv.QUOTE_MINIMAL)
for rows in res:
sw.writerow(rows)
ff.close()
Upvotes: 1
Reputation: 249123
You can do it very efficiently with Pandas--this will be good if your file is very large:
import pandas as pd
import sys
with open('t.txt') as infile:
title = next(infile)
infile.seek(0)
table = pd.read_csv(infile, '|', header=1, dtype=str)
table.rename(columns={'Unnamed: 9':''}, inplace=True)
table[' Zip '] = table[' Zip '].str.replace("'", "")
table[' OPEID '] = table[' OPEID '].str.replace("'", "")
sys.stdout.write(title)
table.to_csv(sys.stdout, '|', index=False)
Upvotes: 3
Reputation: 1971
To remove apostrophes you can use the replace function, you just need to get the content of every cell one by one, and replace the apostrophes with:
new = old.replace("'", "")
More simply, open your csv file with any file editor and search and replace for "'".
Upvotes: 1
Reputation: 15204
The code below would be the same for all file formats. The fact that it is a *.csv doesn't change a thing. What it actually does, is that it goes in the file from which you want to remove the apostrophes, my_csv_in
, and parses it line by line each time replacing them with nothing (a.k.a removing). The modified lines are written in a second file, my_csv_out
.
my_csv_in = r'full_file_path_to_csv_in.csv'
my_csv_out = r'full_file_path_to_csv_out.csv'
with open(my_csv_in, 'r') as f_in:
with open(my_csv_out, 'w') as f_out:
for line in f_in:
f_out.write(line.replace("'", ''))
There are probably better ways to do this that take advantage of the file being a *.csv and using the csv
library. You can take a look at the quoting options
in the documentation.
Upvotes: -1