Reputation: 73
I have a CSV file that contains a header row followed by a potentially unlimited number of rows with values. For example:
FieldA,FieldB,FieldC,FieldD
1,asdf,2,ghjk
3,qwer,4,yuio
5,slslkd,,aldkjslkj
What I need to do is for each row, create a quasi-XML string where the elements are labeled as the column name and information within each element is the value of the cell. Using the above as an example, if I iterate through each of the three rows I would end up with these three strings:
<FieldA>1</FieldA><FieldB>asdf</FieldB><FieldC>2</FieldC><FieldD>ghjk</FieldD>
<FieldA>3</FieldA><FieldB>qwer</FieldB><FieldC>4</FieldC><FieldD>yuio</FieldD>
<FieldA>5</FieldA><FieldB>slslkd</FieldB><FieldD>aldkjslkj</FieldD>
The way I am currently doing it is:
for row in r:
    if row['FieldA']:
        fielda = '<FieldA>{0}</FieldA>'.format(row['FieldA'])
    else:
        fielda = ''
    if row['FieldB']:
        fieldb = '<FieldB>{0}</FieldB>'.format(row['FieldB'])
    else:
        fieldb = ''
    if row['FieldC']:
        fieldc = '<FieldC>{0}</FieldC>'.format(row['FieldC'])
    else:
        fieldc = ''
    if row['FieldD']:
        fieldd = '<FieldD>{0}</FieldD>'.format(row['FieldD'])
    else:
        fieldd = ''
    # Compile the string
    final_string = fielda + fieldb + fieldc + fieldd
    # Process further
    do_something(final_string)
As it iterates through each row, this creates the appropriate string and then I can pass it on for further processing.
Is there a better way to achieve what I want, or is my approach the best way? My guess is there is a better, more Pythonic, and more efficient way, but I'm new-ish to Python.
Thanks.
Upvotes: 2
Views: 1521
Reputation: 82
'Top' is just the highest level node -- you could use whatever text you want to wrap the whole document.
You can pretty-print it pretty simply as well: http://pymotw.com/2/xml/etree/ElementTree/create.html#pretty-printing-xml
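As a rough sketch of what that pretty-printing can look like, assuming Python 3 and only the standard library: the element is serialized with tostring and then re-parsed with xml.dom.minidom, whose toprettyxml method adds the indentation. The helper name prettify and the sample element are illustrative, not part of the code in this thread.

```python
from xml.dom import minidom
from xml.etree.ElementTree import Element, SubElement, tostring


def prettify(elem, indent='  '):
    """Return an indented XML string for an ElementTree element."""
    # encoding='unicode' makes tostring return str instead of bytes (Python 3).
    rough = tostring(elem, encoding='unicode')
    # minidom re-parses the string and writes it back out with indentation.
    return minidom.parseString(rough).toprettyxml(indent=indent)


# Illustrative element; in practice this would be the tree built from the CSV.
top = Element('top')
SubElement(top, 'FieldA').text = '1'
SubElement(top, 'FieldB').text = 'asdf'

pretty = prettify(top)
print(pretty)
```

Note that toprettyxml also prepends an XML declaration line, which may or may not be wanted for a quasi-XML string.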
Upvotes: -1
Reputation: 73
Slightly modified code that fixed the issue I was having. Turned out to be pretty trivial:
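import csv
from xml.etree.ElementTree import Element, SubElement, tostring

with open(csv_file) as f:
    for row in csv.DictReader(f):
        # One fresh 'event' element per CSV row
        top = Element('event')
        for k, v in row.items():
            child = SubElement(top, k)
            child.text = v
        # Python 3: encoding='unicode' makes tostring return str, not bytes
        print(tostring(top, encoding='unicode'))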
Thanks for the help!
Upvotes: 2
Reputation: 76326
Python is Batteries Included.
In this case, you can use the csv module and the xml module, with code that looks like this:
# CSV module
import csv
# Stuff from the XML module
from xml.etree.ElementTree import Element, SubElement, tostring
# Topmost XML element
top = Element('top')
# Open a file
with open('stuff.csv') as csvfile:
    # And use a dictionary-reader
    for d in csv.DictReader(csvfile):
        # For each mapping in the dictionary
        for (k, v) in d.items():
            # Create an XML node
            child = SubElement(top, k)
            child.text = v
# Python 3: encoding='unicode' makes tostring return str, not bytes
print(tostring(top, encoding='unicode'))
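Here is a variant of the same idea, sketched under two assumptions: Python 3, and that each row should become its own string with empty cells skipped, as in the question's expected output. The function name row_to_xml, the throwaway 'row' wrapper element, and the in-memory sample are illustrative only.

```python
import csv
import io
from xml.etree.ElementTree import Element, SubElement, tostring


def row_to_xml(row, fieldnames):
    """Return the concatenated elements for one CSV row, skipping empty cells."""
    # Throwaway wrapper so ElementTree takes care of escaping the values.
    wrapper = Element('row')
    for name in fieldnames:
        value = row.get(name)
        if value:  # skip empty or missing cells
            SubElement(wrapper, name).text = value
    # Serialize each child separately so the wrapper tag is not in the result.
    return ''.join(tostring(child, encoding='unicode') for child in wrapper)


# Sample data from the question, held in memory instead of a file.
sample = io.StringIO(
    "FieldA,FieldB,FieldC,FieldD\n"
    "1,asdf,2,ghjk\n"
    "3,qwer,4,yuio\n"
    "5,slslkd,,aldkjslkj\n"
)
reader = csv.DictReader(sample)
results = [row_to_xml(row, reader.fieldnames) for row in reader]
for line in results:
    print(line)
```

Iterating over reader.fieldnames rather than the row dictionary keeps the elements in header order, and the column names never have to be written out by hand.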
Upvotes: 1