Reputation: 181
I have a csv file (csvlist.csv) that contains the paths of the XML elements I need to change in the first column. The texts I need to change original node values to are given in columns 2 - 10,000 on wards.
Path Edit1 Edit2 Edit3 Edit4 Edit5 ---- Edit1000
".//data/country[@name="Singapore"]/gdpnp[@month="08"]/state", 5.2e-015, 2e-05, 8e-06, 9e-04, 0.4e-05,
".//data/country[@name="Peru"]/gdppc[@month="06"]/region", 0.04, 0.02, 0.15, 3.24, 0.98,
I would like to replace the text of the elements of the original XML file (NoEdit.xml), based on the paths in column 1, by values in each subsequent column and name accordingly e.g. XML based on column 2 values will be named Edit2.xml.
import csv
import xml.etree.ElementTree as ET
tree = ET.parse('NoEdit.xml')
with open('csvlist.csv', 'rb') as csvlist:
reader = csv.reader(csvlist, delimiter=',')
for x in range(1, 1000):
for row in reader:
if reader.line_num == 1: continue # skip the row of headers
for data in tree.findall(row[0]):
data.text = row[(x)]
tree.write('Edit(x).xml')
Based on help on this forum q1 q2 I have gotten this far @ the code below. I get the error:
KeyError: '".//data/country[@name="'
When I give a fixed path to remove this error I still get error on findall
or I just don't get the right xml.
I would appreciate any help regards direction with this. Please feel free to suggest alternate methods of doing this as well. Thanks
Upvotes: 1
Views: 486
Reputation: 36272
First of all, you should provide a reproducible example, to help other users to help you. I've done some of the work for you and I've created a test csv
file, like:
Path, Edit1, Edit2
".//first", 5.2e-015, 2e-05
".//second", 0.04, 0.02
Note that I added commas to the header, because you missed them, and I don't know if it was intentional or typing error.
Also I've created a small xml
file:
<root>
<first>1</first>
<second>2</second>
<third>3</third>
</root>
And the python script:
import csv
from lxml import etree
import sys
xmldata = open(sys.argv[2], newline='').read();
with open(sys.argv[1], newline='') as csvfile:
for i, pivoted_row in enumerate(zip(*csv.reader(csvfile, delimiter=','))):
if i == 0:
xpaths = pivoted_row
continue
pivoted_row = [c.strip() for c in pivoted_row]
tree = etree.fromstring(xmldata)
with open(pivoted_row[0] + ".xml", 'wb') as outfile:
for j in range(1, len(xpaths)):
tree.xpath(xpaths[j])[0].text = pivoted_row[j]
outfile.write(etree.tostring(tree))
The key part is to pivot the csv
so I can handle all the data to write to the same file at once. So this way the first column (xpaths) will be first row, so I save them in a variable, that traverse for each other row.
Run it like:
python3 script.py csvfile xmlfile
It creates two files, Edit1.xml
and Edit2.xml
, with following content:
==> Edit1.xml <==
<root>
<first>5.2e-015</first>
<second>0.04</second>
<third>3</third>
</root>
==> Edit2.xml <==
<root>
<first>2e-05</first>
<second>0.02</second>
<third>3</third>
</root>
I hope it can be useful and set you in the good way to solve your problem.
Upvotes: 1