Reputation: 103
I have the following code, which reads information from an XML file and saves some of the data to a CSV file.
import xml.etree.ElementTree as ET
import csv

tree = ET.parse('file.xml')
root = tree.getroot()
title = []
category = []
url = []
prod = []

def find_title():
    for t in root.findall('solution/head'):
        title.append(t.find('title').text)
    for c in root.findall('solution/body'):
        category.append(c.find('category').text)
    for u in root.findall('solution/body'):
        url.append(u.find('video').text)
    for p in root.findall('solution/body'):
        prod.append(p.find('product').text)

find_title()
headers = ['Title', 'Category', 'Video URL', 'Product']

def save_csv():
    with open('titles.csv', 'w') as f:
        f_csv = csv.writer(f, lineterminator='\r')
        f_csv.writerow(headers)
        f.write(''.join('{},{},{},{}\n'.format(title, category, url, prod) for title, category, url, prod in zip(title, category, url, prod)))

save_csv()
I have found an issue with text that contains ',': the comma splits the value into separate fields in the output, e.g.:
<title>Add, Change, or Remove Transitions between Slides</title>
is getting saved as [Add, Change, or Remove Transitions between Slides], which makes sense since this is a CSV file; however, I would like to keep the whole title together as one field.
So is there any way to remove the ',' from the title tag, or can I add more code to override the ','?
Thanks in advance
Upvotes: 0
Views: 503
Reputation: 6190
It's not clear why you're writing the row data with a file.write() call rather than using the csv writer's writerow method (which you are already using for the header row). Using that method will take care of quoting and special-character issues with data containing quotes and commas.
Change:
f.write(''.join('{},{},{},{}\n'.format(title, category, url, prod) for title, category, url, prod in zip(title, category, url, prod)))
to:
for row in zip(title, category, url, prod):
    f_csv.writerow(row)
and your CSV should work as expected, assuming your CSV reader handles the quoted fields.
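To see the effect, here is a minimal, self-contained sketch of that change. The sample rows, the example.com URLs, the write_rows helper and the in-memory buffer are all made up for illustration (they are not from your XML); only the writerow loop is the actual suggestion. It writes a title containing commas and then reads it back as a single field:

```python
import csv
import io

# Hypothetical sample data standing in for the lists built from the XML.
title = ['Add, Change, or Remove Transitions between Slides', 'Insert a Chart']
category = ['Slides', 'Charts']
url = ['http://example.com/a', 'http://example.com/b']
prod = ['PowerPoint', 'Excel']

headers = ['Title', 'Category', 'Video URL', 'Product']

def write_rows(f):
    """Write the header and every data row through the same csv writer."""
    f_csv = csv.writer(f)
    f_csv.writerow(headers)
    for row in zip(title, category, url, prod):
        # writerow quotes any field that contains a comma or a quote.
        f_csv.writerow(row)

# io.StringIO stands in for the real file so the sketch runs anywhere.
buf = io.StringIO(newline='')
write_rows(buf)
csv_text = buf.getvalue()

# Reading the text back shows the title survives as one field.
rows = list(csv.reader(io.StringIO(csv_text, newline='')))
print(csv_text)
print(rows[1][0])
```

With a real file, the csv module documentation recommends opening it with newline='' (for example open('titles.csv', 'w', newline='')) so the writer controls the line endings itself.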
Upvotes: 2