Reputation: 490
I have XML data in this format:
<Slot_Data Timestamp="08-18-2017 07:03:20.890">
<Slot Id="1" Count="23" Error="4" />
<Slot Id="2" Count="31" Error="0" />
<Slot Id="3" Count="27" Error="2" />
</Slot_Data>
<Slot_Data Timestamp="08-18-2017 07:55:54.574">
<Slot Id="1" Count="21" Error="0" />
<Slot Id="2" Count="23" Error="3" />
<Slot Id="3" Count="34" Error="1" />
</Slot_Data>
I'm trying to arrange it in this format and output to CSV:
Timestamp Slot Count Error
08/18/17 07:03:21 1 23 4
08/18/17 07:03:21 2 31 0
08/18/17 07:03:21 3 27 2
08/18/17 07:55:55 1 21 0
08/18/17 07:55:55 2 23 3
08/18/17 07:55:55 3 34 1
I can get the child attributes into the CSV format above (minus the Timestamp) using etree:
tree = ET.parse(xml_file)
root = tree.getroot()
for line in root.iter('Slot'):
    row = []
    id = line.get('Id')
    row.append(id)
    count = line.get('Count')
    row.append(count)
    error = line.get('Error')
    row.append(error)
    csvwriter.writerow(row)
But I can't figure out how to also append the parent element's timestamp to each row. I can print the timestamps easily using etree, but I'm not sure how to work that into the above Python code. Any ideas? Thanks!
Upvotes: 0
Views: 1530
Reputation: 3954
I think the objectify module from the lxml library is the way to go.
from lxml import objectify
s = '''<document><Slot_Data Timestamp="08-18-2017 07:03:20.890">
<Slot Id="1" Count="23" Error="4" />
<Slot Id="2" Count="31" Error="0" />
<Slot Id="3" Count="27" Error="2" />
</Slot_Data>
<Slot_Data Timestamp="08-18-2017 07:55:54.574">
<Slot Id="1" Count="21" Error="0" />
<Slot Id="2" Count="23" Error="3" />
<Slot Id="3" Count="34" Error="1" />
</Slot_Data></document>'''
mo = objectify.fromstring(s)
lines_data = [(sd.get('Timestamp'), sl.get('Id'), sl.get('Count'), sl.get('Error'))
              for sd in mo.Slot_Data
              for sl in sd.Slot]
Notice I had to add the document tag to be able to parse the string (a root node is needed).
Now lines_data has all the data you need in a list of tuples, and you can write the data using the csv library or by formatting it yourself. For example:
with open('myfile.csv', 'w') as f:
    file_contents = '\n'.join('%s,%s,%s,%s' % l for l in lines_data)
    f.write(file_contents)
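If you would rather stay with the standard library, the same nested loop (outer over Slot_Data, inner over its Slot children) should work with xml.etree.ElementTree too. Below is a rough sketch rather than a definitive solution. The helper names format_timestamp and slot_rows are made up for illustration, and rounding to the nearest second is an assumption based on the desired output in the question (07:03:20.890 shown as 07:03:21). It writes to an in-memory buffer so it runs as-is; for a real file, swap the buffer for open('myfile.csv', 'w', newline='').

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Same sample data as above, wrapped in a root <document> tag.
XML = '''<document><Slot_Data Timestamp="08-18-2017 07:03:20.890">
<Slot Id="1" Count="23" Error="4" />
<Slot Id="2" Count="31" Error="0" />
<Slot Id="3" Count="27" Error="2" />
</Slot_Data>
<Slot_Data Timestamp="08-18-2017 07:55:54.574">
<Slot Id="1" Count="21" Error="0" />
<Slot Id="2" Count="23" Error="3" />
<Slot Id="3" Count="34" Error="1" />
</Slot_Data></document>'''


def format_timestamp(raw):
    """Turn '08-18-2017 07:03:20.890' into '08/18/17 07:03:21'.

    Hypothetical helper: rounds to the nearest second, which is an
    assumption taken from the output shown in the question.
    """
    ts = datetime.strptime(raw, '%m-%d-%Y %H:%M:%S.%f')
    if ts.microsecond >= 500000:
        ts += timedelta(seconds=1)
    return ts.replace(microsecond=0).strftime('%m/%d/%y %H:%M:%S')


def slot_rows(root):
    """Yield one [timestamp, id, count, error] row per Slot element."""
    for slot_data in root.iter('Slot_Data'):
        # Read the parent's timestamp once, then reuse it for each child.
        stamp = format_timestamp(slot_data.get('Timestamp'))
        for slot in slot_data.iter('Slot'):
            yield [stamp, slot.get('Id'), slot.get('Count'), slot.get('Error')]


root = ET.fromstring(XML)
rows = list(slot_rows(root))

# In-memory target so the sketch is self-contained.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Timestamp', 'Slot', 'Count', 'Error'])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The point is simply to read the Timestamp attribute in the outer loop and carry it into every row built in the inner loop, which is the piece that was missing from the code in the question.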
Upvotes: 2