Reputation: 859
I have a csv file with the following structure:
"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"
"AVANTE;1;1;2015;PP"
I need him to stay like this:
"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"
"AVANTE";"1";"1";"2015";"PP"
I received this .csv file from someone else, so I do not know how the conversion was done. I am trying unsuccessfully with the code below:
input_fd = open("/home/gustavo/Downloads/Redes/Despesas/csvfile.csv", 'r')
output_fd = open('dados_2018_1.csv', 'w')
for line in input_fd.readlines():
line.replace("\"","")
output_fd.write(line)
input_fd.close()
output_fd.close()
Is it possible to make this change or will I have to do the conversion from an xml file to a csv, and make this change during the conversion?
Upvotes: 0
Views: 107
Reputation: 53663
First: tell the reader
to use delimiter=";"
and quoting=csv.QUOTE_NONE
. This will properly split your second line which is a string literal containing your delimiter, which you desire to be split. We'll tweak that data to remove the quotation marks (otherwise our output will be quoted strings like '"txNomeParlamentar"'
, etc).
import csv
with open('file.txt') as f:
reader = csv.reader(f, delimiter=";", quoting=csv.QUOTE_NONE)
data = [list(map(lambda s: s.replace('"', ''), row)) for row in reader]
Then: we write the file back out, with the delimiter=";"
, and quoting=csv.QUOTE_ALL
to ensure each item is set in quotes
with open('out.txt', 'w', newline='') as o:
writer = csv.writer(o, delimiter=";", quoting=csv.QUOTE_ALL)
writer.writerows(data)
Input:
"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"
"AVANTE;1;1;2015;PP"
Output:
"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"
"AVANTE";"1";"1";"2015";"PP"
Upvotes: 1
Reputation: 82785
Using csv
module.
Ex:
import csv
with open(filename) as csvfile:
reader = csv.reader(csvfile, delimiter=";")
headers = next(reader) #Read Headers
data = [row.strip('"').split(";") for row in csvfile] #Format data
with open(filename, "w") as csvfile_out:
writer = csv.writer(csvfile_out, delimiter=";")
writer.writerow(headers) #Write Headers
writer.writerows(data) #Write data
Upvotes: 1
Reputation: 123541
You could use the csv
module to do it if you massage the input data a little first.
import csv
#input_csv = '/home/gustavo/Downloads/Redes/Despesas/csvfile.csv'
input_csv = 'gustavo_input.csv'
output_csv = 'dados_2018_1.csv'
with open(input_csv, 'r', newline='') as input_fd, \
open(output_csv, 'w', newline='') as output_fd:
reader = csv.DictReader(input_fd, delimiter=';')
writer = csv.DictWriter(output_fd, delimiter=';',
fieldnames=reader.fieldnames,
quoting=csv.QUOTE_ALL)
first_field = reader.fieldnames[0]
for row in reader:
fields = row[first_field].split(';')
newrow = dict(zip(reader.fieldnames, fields))
writer.writerow(newrow)
print('done')
Upvotes: -1
Reputation: 11938
A couple things. First, you do NOT have a csv file because in a csv file, the delimiter is a comma by definition. I'm assuming you want the values in your data file to (1) remain separated by semicolons [why not fix it and make it commas?] and (2) you want each value to be in quotation marks.
If so, I think this will work:
# data reader
in_file = 'data.txt'
out_file = 'fixed.txt'
output = open(out_file, 'w')
with open(in_file, 'r') as source:
for line in source:
# split by semicolon
data = line.strip().split(';')
# remove all quotes found
data = [t.replace('"','') for t in data]
for item in data[:-1]:
output.write(''.join(['"', item, '"',';']))
# write the last item separately, without the trailing ';'
output.write(''.join(['"', item, '"']))
output.write('\n')
output.close()
If your target user is python, you should consider replacing the semicolons with commas (correct csv format) and forgoing the quotes. Everything python reads from csv is taken in as string anyhow.
Upvotes: 1