Reputation: 43
I have a number of XML-files with a structure like this:
<titles>
    <title mode="example" name="name_example">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>New York</sort_attribute>
        </titleselect>
    </title>
    <title mode="another_example" name="another_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Boston</sort_attribute>
        </titleselect>
    </title>
    <title mode="final_example" name="final_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Chicago</sort_attribute>
        </titleselect>
    </title>
</titles>
I am trying to sort the "titles" alphabetically by the "sort_attribute". My desired output is like this:
<titles>
    <title mode="another_example" name="another_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Boston</sort_attribute>
        </titleselect>
    </title>
    <title mode="final_example" name="final_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Chicago</sort_attribute>
        </titleselect>
    </title>
    <title mode="example" name="name_example">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>New York</sort_attribute>
        </titleselect>
    </title>
</titles>
Is there any way to achieve this, preferably using XSLT or Python? I am completely new to XSLT, but I have tried applying a number of solutions from other relevant questions (e.g. "XSLT sort parent element based on child element attribute"), to no avail.
Upvotes: 1
Views: 171
Reputation: 81684
If you are still interested in a Python solution, it can be achieved by using ElementTree.
How it works:
1. Find all title nodes and detach them from the root element
2. Sort the title nodes in memory based on the sort_attribute tag
3. Append each title node back to the root element in the correct order
import xml.etree.ElementTree as ET
def get_sort_attribute_tag_value(node):
    # Sort key: the text of <titleselect>/<sort_attribute> inside a <title>
    return node.find('titleselect').find('sort_attribute').text

with open('test.xml') as f:
    xml_node = ET.fromstring(f.read())

# Detach every <title> node from the root element
title_nodes = xml_node.findall('title')
for title_node in title_nodes:
    xml_node.remove(title_node)

# Sort in memory, then re-attach the nodes in sorted order
title_nodes.sort(key=get_sort_attribute_tag_value)
for title_node in title_nodes:
    xml_node.append(title_node)

print(ET.tostring(xml_node).decode())

# in order to save as a new file
with open('new_file.xml', 'w') as f:
    f.write(ET.tostring(xml_node).decode())
Outputs:
<titles>
    <title mode="another_example" name="another_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Boston</sort_attribute>
        </titleselect>
    </title>
    <title mode="final_example" name="final_name">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>Chicago</sort_attribute>
        </titleselect>
    </title>
    <title mode="example" name="name_example">
        <titleselect>
            <attribute_a>attrib_a</attribute_a>
            <attribute_b>attrib_b</attribute_b>
            <attribute_c>attrib_c</attribute_c>
            <sort_attribute>New York</sort_attribute>
        </titleselect>
    </title>
</titles>
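A slightly more defensive variant of the same find, sort, re-append idea, in case some files are less regular than the sample. This is only a sketch under a few assumptions of mine that are not in the question: a title without a sort_attribute should sort first rather than raise, the comparison should ignore case (hence casefold), and sort_file is a hypothetical helper name for processing one file.

```python
import xml.etree.ElementTree as ET


def sort_key(title):
    # findtext returns the default ("") when the path is missing, so a
    # <title> without a <sort_attribute> sorts first instead of raising.
    # casefold() makes the comparison case-insensitive (my assumption).
    return title.findtext("titleselect/sort_attribute", default="").casefold()


def sort_titles(root):
    """Reorder the <title> children of root in place and return root."""
    titles = root.findall("title")
    for title in titles:
        root.remove(title)
    root.extend(sorted(titles, key=sort_key))
    return root


def sort_file(src, dst):
    """Hypothetical helper: sort one file and write the result to dst."""
    tree = ET.parse(src)
    sort_titles(tree.getroot())
    tree.write(dst, encoding="utf-8", xml_declaration=True)


# In-memory demo with mixed-case values
SAMPLE = """<titles>
  <title name="a"><titleselect><sort_attribute>New York</sort_attribute></titleselect></title>
  <title name="b"><titleselect><sort_attribute>boston</sort_attribute></titleselect></title>
  <title name="c"><titleselect><sort_attribute>Chicago</sort_attribute></titleselect></title>
</titles>"""

root = sort_titles(ET.fromstring(SAMPLE))
order = [t.findtext("titleselect/sort_attribute") for t in root]
print(order)  # ['boston', 'Chicago', 'New York']
```

Since the question mentions a number of files, sort_file could be called in a loop over something like pathlib.Path(some_dir).glob("*.xml"). Re-appending nodes can leave the whitespace between elements slightly uneven; on Python 3.9+ ET.indent(root) should tidy that up before writing.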
Upvotes: 1
Reputation: 107387
As an XSLT alternative, as per Tomalek's comment, this is fairly straightforward: use a template that matches the parent titles element, sort its title children by the required sort_attribute (actually an element, not an attribute), and let the identity transform copy the content of each title:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="titles">
        <xsl:copy>
            <xsl:apply-templates select="title">
                <xsl:sort select="titleselect/sort_attribute" data-type="text" order="ascending"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
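Since the asker is new to XSLT, here is one possible way to run a stylesheet like this from Python. It is a sketch that assumes the third-party lxml package is installed (the standard library has no XSLT processor); the stylesheet is the one above, and SAMPLE is a shortened stand-in for the real input.

```python
from lxml import etree  # third-party; assumes `pip install lxml`

# Passed as bytes: lxml rejects str input that carries an encoding declaration.
STYLESHEET = b"""<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="titles">
    <xsl:copy>
      <xsl:apply-templates select="title">
        <xsl:sort select="titleselect/sort_attribute" data-type="text" order="ascending"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>"""

# Shortened stand-in for one of the input files
SAMPLE = b"""<titles>
  <title name="name_example"><titleselect><sort_attribute>New York</sort_attribute></titleselect></title>
  <title name="another_name"><titleselect><sort_attribute>Boston</sort_attribute></titleselect></title>
  <title name="final_name"><titleselect><sort_attribute>Chicago</sort_attribute></titleselect></title>
</titles>"""

# Compile the stylesheet once; the resulting callable can be reused per document.
transform = etree.XSLT(etree.fromstring(STYLESHEET))
result = transform(etree.fromstring(SAMPLE))

sorted_values = [t.findtext("titleselect/sort_attribute") for t in result.getroot()]
print(sorted_values)  # ['Boston', 'Chicago', 'New York']
```

For real files, etree.parse(path) can replace the in-memory strings, and str(result) should give the serialized document to write out.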
Upvotes: 0