Mia

Reputation: 181

Create multiple XMLs based on path and text values stored in csv efficiently

I have a CSV file whose first column contains the paths of the elements in an XML file that I need to change. The text for each new XML file to be created is given in columns 2 - 10,000 onwards.

Path                                                            Edit1       Edit2       Edit3       Edit4       Edit5          ----  Edit1000
".//data/country[@name="Singapore"]/gdpnp[@month="08"]/state",  5.2e-015,   2e-05,      8e-06,      9e-04,      0.4e-05,   
".//data/country[@name="Peru"]/gdppc[@month="06"]/region",      0.04,       0.02,       0.15,       3.24,       0.98,                                                 

I would like to replace the text of the elements in the original XML file (NoEdit.xml), selected by the paths in column 1, with the values in each subsequent column, and name each output accordingly, e.g. the XML based on column 2 values will be named Edit2.xml.

import csv
import xml.etree.ElementTree as ET

tree = ET.parse('NoEdit.xml')
# Read every row once; a csv.reader can only be iterated a single time.
with open('csvlist.csv', newline='') as csvlist:
    rows = list(csv.reader(csvlist, delimiter=','))
rows = rows[1:]  # skip the row of headers
for x in range(1, 1001):  # columns Edit1 .. Edit1000
    for row in rows:
        for data in tree.findall(row[0]):
            data.text = row[x].strip()
    # Write one file per column, after all paths have been updated.
    tree.write('Edit%d.xml' % x)

Based on help on this forum (q1, q2) I have gotten as far as the code above. I get the error KeyError: '".//data/country[@name="'. When I use a fixed path I still get an error on findall, or I just don't get the right XML.

I would appreciate any help or direction with this. Please feel free to suggest alternative methods of doing this as well.

Upvotes: 0

Views: 64

Answers (1)

Charles Duffy

Reputation: 295510

This is not valid CSV:

".//data/country[@name="Singapore"]/gdpnp[@month="08"]/state",

Instead, it should be:

".//data/country[@name=""Singapore""]/gdpnp[@month=""08""]/state",

Notably, any literal " in the data needs to be doubled, to "", to disambiguate it from the ending quotes. (I'm curious how you created that file -- any spreadsheet program or other CSV generator should have gotten it right).
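As a rough illustration of the quoting rule, the sketch below lets Python's csv module write the path and then reads it back. The extra column values are just the sample numbers from the question, and the in-memory buffer stands in for a real file.

```python
import csv
import io

# A path containing literal double quotes, as in the question.
path = './/data/country[@name="Singapore"]/gdpnp[@month="08"]/state'

# Let the csv module do the quoting instead of building the line by hand.
buf = io.StringIO()
csv.writer(buf).writerow([path, "5.2e-015", "2e-05"])
line = buf.getvalue()
print(line)  # the embedded quotes come out doubled, e.g. [@name=""Singapore""]

# Reading the line back recovers the original path unchanged.
row = next(csv.reader(io.StringIO(line)))
print(row[0] == path)
```

Writing the file through csv.writer (or exporting it from a spreadsheet) avoids having to double the quotes manually.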


I would also strongly suggest using lxml.etree here and its .xpath() call; ElementTree's .findall() supports only a limited subset of XPath, not the real thing.
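A minimal sketch of how that could look with lxml, assuming the CSV quoting has been fixed as described. The XML and CSV contents below are hypothetical stand-ins for NoEdit.xml and csvlist.csv (the real files are not shown in the question), and the results are kept in a dict rather than written to disk.

```python
import csv
import io
from lxml import etree

# Hypothetical stand-in for NoEdit.xml; the real structure is an assumption.
XML = b"""<root><data>
  <country name="Singapore"><gdpnp month="08"><state>0</state></gdpnp></country>
  <country name="Peru"><gdppc month="06"><region>0</region></gdppc></country>
</data></root>"""

# Hypothetical stand-in for csvlist.csv, with the quotes properly doubled.
CSV = '''Path,Edit1,Edit2
".//data/country[@name=""Singapore""]/gdpnp[@month=""08""]/state",5.2e-015,2e-05
".//data/country[@name=""Peru""]/gdppc[@month=""06""]/region",0.04,0.02
'''

rows = list(csv.reader(io.StringIO(CSV)))
header, edits = rows[0], rows[1:]

outputs = {}
for col in range(1, len(header)):
    root = etree.fromstring(XML)  # start each output from the unedited XML
    for row in edits:
        # .xpath() evaluates the stored path as a real XPath expression.
        for element in root.xpath(row[0]):
            element.text = row[col].strip()
    outputs[header[col]] = etree.tostring(root)
    # To write files instead: etree.ElementTree(root).write(header[col] + ".xml")

print(outputs["Edit1"].decode())
```

Reading the CSV once and looping over columns keeps the work to a single pass over the file, which matters if there really are thousands of columns.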

Upvotes: 1
