MapMan

Reputation: 668

Change XML values in various places that have the same tag

I have a Python script that uses lxml to change the values of specific tags. I have the following XML:

                    <gmd:CI_Citation>
                    <gmd:date>
                        <gmd:CI_Date>
                            <gmd:date>
                                <gco:Date>**1900-01-01**</gco:Date>
                            </gmd:date>
                            <gmd:dateType>
                                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="publication">Publication</gmd:CI_DateTypeCode>
                            </gmd:dateType>
                        </gmd:CI_Date>
                    </gmd:date>
                    <gmd:date>
                        <gmd:CI_Date>
                            <gmd:date>
                                <gco:Date>**1900-01-01**</gco:Date>
                            </gmd:date>
                            <gmd:dateType>
                                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="creation">Creation</gmd:CI_DateTypeCode>
                            </gmd:dateType>
                        </gmd:CI_Date>
                    </gmd:date>
                    <gmd:date>
                        <gmd:CI_Date>
                            <gmd:date>
                                <gco:Date>**1900-01-01**</gco:Date>
                            </gmd:date>
                            <gmd:dateType>
                                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="revision">Revision</gmd:CI_DateTypeCode>
                            </gmd:dateType>
                        </gmd:CI_Date>
                    </gmd:date>
                </gmd:CI_Citation>

For each date type (Publication, Creation and Revision) I want to set the date to a specific value. However, the path to the date element is the same for all three:

//:gmd_citation/:gmd_CI:Citation/:gmd_date/:gmd_CI_Date/:gmd_date/:gco_Date

I am using the following function to change the values

def updateXMLTag (tag, value):
  xmlValue = root.xpath(tag)
  xmlValue[0].text = str(value)

What is the best way using xpath to get to the specific tag, so that the value can be changed?

Upvotes: 0

Views: 102

Answers (1)

swatchai

Reputation: 18762

This is my way of using xpath to get to the specific elements, and edit them:

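# Pick a StringIO implementation that works on both Python 2 and Python 3

try:
    from cStringIO import StringIO  # Python 2, C implementation
except ImportError:
    try:
        from StringIO import StringIO  # Python 2, pure Python
    except ImportError:
        from io import StringIO  # Python 3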

from lxml import etree

# Namespace declarations added so the fragment parses as valid XML.
# The URIs are placeholders; use the ones declared in your real file.
xmlstr = StringIO("""<gmd:CI_Citation xmlns:gmd="http://gmd.example.com" xmlns:gco="http://gco.example.com">
    <gmd:date>
        <gmd:CI_Date>
            <gmd:date>
                <gco:Date>1900-01-01</gco:Date>
            </gmd:date>
            <gmd:dateType>
                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="publication">Publication</gmd:CI_DateTypeCode>
            </gmd:dateType>
        </gmd:CI_Date>
    </gmd:date>
    <gmd:date>
        <gmd:CI_Date>
            <gmd:date>
                <gco:Date>1900-01-01</gco:Date>
            </gmd:date>
            <gmd:dateType>
                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="creation">Creation</gmd:CI_DateTypeCode>
            </gmd:dateType>
        </gmd:CI_Date>
    </gmd:date>
    <gmd:date>
        <gmd:CI_Date>
            <gmd:date>
                <gco:Date>1900-01-01</gco:Date>
            </gmd:date>
            <gmd:dateType>
                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="revision">Revision</gmd:CI_DateTypeCode>
            </gmd:dateType>
        </gmd:CI_Date>
    </gmd:date>
</gmd:CI_Citation>""")

tree = etree.parse(xmlstr)

Here we use XPath to get all three target elements.

ns = {'gmd': "http://gmd.example.com", 'gco': "http://gco.example.com"}
targets = tree.xpath('/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:dateType/gmd:CI_DateTypeCode',
                     namespaces=ns)
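The namespace URIs in this example are placeholders, so on a real file the dictionary passed to xpath has to match whatever the document declares. One way to avoid typing the URIs by hand is to read them from the root element's nsmap. A minimal sketch, assuming all prefixes are declared on the root element (the sample document and the default namespace URI here are made up for illustration):

```python
from lxml import etree

# Made-up sample document; the URIs are placeholders, not the real ISO ones.
doc = etree.fromstring(
    b'<gmd:CI_Citation xmlns:gmd="http://gmd.example.com" '
    b'xmlns:gco="http://gco.example.com" xmlns="http://default.example.com">'
    b"<gmd:date><gmd:CI_Date><gmd:date><gco:Date>1900-01-01</gco:Date></gmd:date>"
    b"</gmd:CI_Date></gmd:date></gmd:CI_Citation>"
)

# nsmap maps prefixes to URIs. A default namespace shows up under the key
# None, which XPath cannot use as a prefix, so it is filtered out here.
ns = {prefix: uri for prefix, uri in doc.nsmap.items() if prefix is not None}

found = doc.xpath("//gco:Date", namespaces=ns)
```

If some prefixes are only declared deeper in the tree, they will not appear in the root's nsmap and would have to be added to the dictionary manually.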

The three elements are distinguished by their codeListValue attribute, which can be checked with a small helper function (named has_attr_value here so that it does not shadow Python's built-in hasattr):

def has_attr_value(elem, att, val):
    # elem.get() returns None when the attribute is missing
    return elem.get(att) == val

targets[0]: codeListValue "publication", text node "Publication"

targets[1]: codeListValue "creation", text node "Creation"

targets[2]: codeListValue "revision", text node "Revision"

Which one needs changes?

has_attr_value(targets[0], 'codeListValue', 'publication')  # True
has_attr_value(targets[1], 'codeListValue', 'creation')  # True
has_attr_value(targets[2], 'codeListValue', 'publication')  # False

# Let's change one of them
t1 = targets[1]
t1.text = 'New Creation'  # change text node

# and/or change attribute
t1.attrib['codeListValue'] = 'Latest Creation'

Finally, we save the result to a file:

tree.write("output1.xml")
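Called with only a file name, tree.write may leave out the XML declaration and fall back to a default encoding, which can matter if other tools read the metadata file afterwards. A small sketch of passing the serialisation options explicitly; the temporary directory and the shortened sample document are only there to keep the example self-contained:

```python
import os
import tempfile
from lxml import etree

# Shortened stand-in for the parsed document, with placeholder namespace URIs.
tree = etree.ElementTree(etree.fromstring(
    b'<gmd:CI_Citation xmlns:gmd="http://gmd.example.com" '
    b'xmlns:gco="http://gco.example.com">'
    b"<gco:Date>1900-01-01</gco:Date></gmd:CI_Citation>"
))

# Write to a scratch directory so the example does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), "output1.xml")

# Ask for an XML declaration, UTF-8 output and indented formatting.
tree.write(out_path, xml_declaration=True, encoding="UTF-8", pretty_print=True)

# Read the file back to inspect what was written.
with open(out_path, "rb") as fh:
    written = fh.read()
```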

Edit 1

Here we navigate from the already found targets[1] to its cousin element (gco:Date), whose text needs to change:

t1 = targets[1]
parent1 = t1.getparent()
date1 = parent1.getprevious()
cousin1 = list(date1)  # child elements of gmd:date; getchildren() is deprecated
len(cousin1)     #1
cousin1[0].text  #'1900-01-01'

# change the date
cousin1[0].text = '2017-05-03'  # zero-padded to match the YYYY-MM-DD format used above
# again, write the result

tree.write("out456.xml")
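The two steps above (find the type code, then walk over to the date) can also be folded into a single XPath expression with a predicate on codeListValue, which is probably closer to what the question asks for. A minimal sketch, assuming the placeholder namespace URIs from this answer; set_date is a hypothetical helper name, and the sample XML is shortened (the codeList attributes are omitted):

```python
from lxml import etree

# Placeholder namespace URIs, as in the example above; a real file
# declares its own URIs, which must be used instead.
NS = {"gmd": "http://gmd.example.com", "gco": "http://gco.example.com"}

XML = b"""<gmd:CI_Citation xmlns:gmd="http://gmd.example.com" xmlns:gco="http://gco.example.com">
  <gmd:date><gmd:CI_Date>
    <gmd:date><gco:Date>1900-01-01</gco:Date></gmd:date>
    <gmd:dateType><gmd:CI_DateTypeCode codeListValue="publication">Publication</gmd:CI_DateTypeCode></gmd:dateType>
  </gmd:CI_Date></gmd:date>
  <gmd:date><gmd:CI_Date>
    <gmd:date><gco:Date>1900-01-01</gco:Date></gmd:date>
    <gmd:dateType><gmd:CI_DateTypeCode codeListValue="creation">Creation</gmd:CI_DateTypeCode></gmd:dateType>
  </gmd:CI_Date></gmd:date>
  <gmd:date><gmd:CI_Date>
    <gmd:date><gco:Date>1900-01-01</gco:Date></gmd:date>
    <gmd:dateType><gmd:CI_DateTypeCode codeListValue="revision">Revision</gmd:CI_DateTypeCode></gmd:dateType>
  </gmd:CI_Date></gmd:date>
</gmd:CI_Citation>"""


def set_date(root, date_type, value):
    """Set gco:Date for every CI_Date whose codeListValue equals date_type.

    Returns the number of elements changed, so the caller can detect a miss.
    """
    hits = root.xpath(
        # Select the CI_Date whose type code matches, then step to its date.
        ".//gmd:CI_Date[gmd:dateType/gmd:CI_DateTypeCode/@codeListValue=$kind]"
        "/gmd:date/gco:Date",
        namespaces=NS,
        kind=date_type,  # passed as an XPath variable, not string-formatted
    )
    for node in hits:
        node.text = str(value)
    return len(hits)


root = etree.fromstring(XML)
changed = set_date(root, "creation", "2017-05-03")
```

Passing the date type as an XPath variable ($kind) avoids building the expression with string formatting, and returning the hit count makes it easy to notice when no element matched instead of failing on an empty list.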

Upvotes: 1
