Reputation: 29
I'm trying to concatenate the name/value pairs of each item's XML attributes into a single CSV cell, but my code only picks up the first pair of each item and then moves on to the next item. The code, input file, and output below show the problem.
import os, csv
from xml.etree import ElementTree

file_name = 'data.xml'
full_file = os.path.abspath(os.path.join('xml', file_name))
dom = ElementTree.parse(full_file)

with open('output.csv', 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])
    for d in dom.findall('//item'):
        part = d.find('.//item-number').text
        name = d.find('.//name').text
        value = d.find('.//value').text
        writer.writerow([part, '', '', name + ":" + value])
Here is my XML file:
<?xml version="1.0"?>
<all>
  <items>
    <item>
      <item-number>449</item-number>
      <attributes>
        <attribute>
          <name>FRUIT</name>
          <value>Lemon</value>
        </attribute>
        <attribute>
          <name>COLOR</name>
          <value>Yellow</value>
        </attribute>
      </attributes>
    </item>
    <item>
      <item-number>223</item-number>
      <attributes>
        <attribute>
          <name>FRUIT</name>
          <value>Orange</value>
        </attribute>
        <attribute>
          <name>COLOR</name>
          <value>Orange</value>
        </attribute>
      </attributes>
    </item>
  </items>
</all>
Here is what I get:
fruitNumber  categoryNumber  Group  AttributeValueName
449                                 FRUIT:Lemon
223                                 FRUIT:Orange
Here is what I am trying to get:
fruitNumber  categoryNumber  Group  AttributeValueName
449                                 FRUIT:Lemon|COLOR:Yellow
223                                 FRUIT:Orange|COLOR:Orange
Thanks for your help in advance!!!
Upvotes: 0
Views: 1418
Reputation: 177795
You're only reading the first attribute of each item, because find() returns just the first matching element. Use findall() to iterate over every attribute element under the item, collect the name:value pairs, and join them in the format you need when writing the row:
import os, csv
from xml.etree import ElementTree

file_name = 'data.xml'
full_file = os.path.abspath(os.path.join('xml', file_name))
dom = ElementTree.parse(full_file)

with open('output.csv', 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])
    for d in dom.findall('.//item'):
        part = d.find('.//item-number').text
        # Collect every name:value pair under this item, not just the first.
        L = []
        for a in d.findall('.//attribute'):
            name = a.find('.//name').text
            value = a.find('.//value').text
            L.append('{}:{}'.format(name, value))
        # Join the pairs with '|' into the single AttributeValueName cell.
        writer.writerow([part, '', '', '|'.join(L)])
Output:
fruitNumber,categoryNumber,Group,AttributeValueName
449,,,FRUIT:Lemon|COLOR:Yellow
223,,,FRUIT:Orange|COLOR:Orange
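If you want to try this without the xml/data.xml file on disk, here is a minimal self-contained sketch of the same idea. It inlines the XML from the question and writes the CSV to an in-memory buffer. The names XML, item_rows, and buf are made up for this illustration, and using findtext() with an empty default (so a missing element yields an empty string instead of an AttributeError) is my assumption about how you would want gaps handled.

```python
import csv
import io
from xml.etree import ElementTree

# Copy of the question's XML, inlined so the sketch needs no file on disk.
XML = """<all><items>
<item><item-number>449</item-number><attributes>
<attribute><name>FRUIT</name><value>Lemon</value></attribute>
<attribute><name>COLOR</name><value>Yellow</value></attribute>
</attributes></item>
<item><item-number>223</item-number><attributes>
<attribute><name>FRUIT</name><value>Orange</value></attribute>
<attribute><name>COLOR</name><value>Orange</value></attribute>
</attributes></item>
</items></all>"""


def item_rows(root):
    """Yield one CSV row per <item>, joining all name:value pairs with '|'."""
    for item in root.iter('item'):
        # findtext() returns the default when the element is missing.
        part = item.findtext('item-number', default='')
        pairs = [
            '{}:{}'.format(a.findtext('name', default=''),
                           a.findtext('value', default=''))
            for a in item.iter('attribute')
        ]
        yield [part, '', '', '|'.join(pairs)]


root = ElementTree.fromstring(XML)

# find() stops at the first match, which is the behaviour seen in the question ...
first_item = root.find('.//item')
print(first_item.find('.//name').text)                    # FRUIT
# ... while findall() returns every match under the item.
print([n.text for n in first_item.findall('.//name')])    # ['FRUIT', 'COLOR']

# Write to an in-memory buffer instead of output.csv.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator='\n')
writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])
writer.writerows(item_rows(root))
print(buf.getvalue())
```

Keeping the row building in a generator separates the XML traversal from the CSV writing, so the same function can feed a real file opened with newline="" as in the code above.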
Upvotes: 1