Twiggy

Reputation: 13

How to sort an xml by a nested child element text value using etree in Python

I've seen variations of this question answered numerous times (Sorting XML in python etree, Sorting xml values using etree), yet I can't seem to adapt those answers to my case. I am trying to sort an imported XML file by the text of a specific sub-element, in this instance the "id" element. Below is the XML in question:

INPUT:

    <bookstore Location="New York">              
        <Genre type="Fiction">
            <name>Fiction</name>
            <id>4</id>
            <pages>300</pages>
            </Genre>
        <Genre type="Fiction">
            <name>Fictional Fiction</name>
            <id>2</id>
            <pages>500</pages>
        </Genre>
        <Genre type="Horror">
            <name>Horrors</name>
            <id>1</id>
            <pages>450</pages>
        </Genre>
        <Genre type="Horror">
            <name>Horrendous Horror</name>
            <id>3</id>
            <pages>20</pages>
        </Genre>
        <Genre type="Comedy">
            <name>Comedic Comedy</name>
            <id>0</id>
            <pages>1</pages>
        </Genre>
    </bookstore>

I want to order all the Genre elements by the value of their "id" child element. This is the output I'm going for:

OUTPUT:

    <bookstore Location="New York">              
        <Genre type="Comedy">
            <name>Comedic Comedy</name>
            <id>0</id>
            <pages>1</pages>
        </Genre>
        <Genre type="Horror">
            <name>Horrors</name>
            <id>1</id>
            <pages>450</pages>
        </Genre>
        <Genre type="Fiction">
            <name>Fictional Fiction</name>
            <id>2</id>
            <pages>500</pages>
        </Genre>
        <Genre type="Horror">
            <name>Horrendous Horror</name>
            <id>3</id>
            <pages>20</pages>
        </Genre> 
        <Genre type="Fiction">
            <name>Fiction</name>
            <id>4</id>
            <pages>300</pages>
        </Genre>
    </bookstore>

This is the code I've tried:

    import xml.etree.ElementTree as ET

    def sortchildrenby(parent):
        parent[:] = sorted(parent, key=lambda child: child.tag == 'id')

    filename = "Example.xml"
    tree = ET.parse(filename)
    root = tree.getroot()
    attr = "type"
    for elements in root:
        sortchildrenby(elements)
    tree.write("exampleORGANIZED.xml")

Which results in the following XML:

    <bookstore Location="New York">              
        <Genre type="Fiction">
            <name>Fiction</name>
            <pages>300</pages>
            <id>4</id>
            </Genre>
        <Genre type="Fiction">
            <name>Fictional Fiction</name>
            <pages>500</pages>
        <id>2</id>
            </Genre>
        <Genre type="Horror">
            <name>Horrors</name>
            <pages>450</pages>
        <id>1</id>
            </Genre>
        <Genre type="Horror">
            <name>Horrendous Horror</name>
            <pages>20</pages>
        <id>3</id>
            </Genre>
        <Genre type="Comedy">
            <name>Comedic Comedy</name>
            <pages>1</pages>
        <id>0</id>
            </Genre>
    </bookstore>

The "id" elements were moved to the bottom of each Genre, and the Genre elements were not re-sorted in ascending order.

Upvotes: 1

Views: 911

Answers (1)

Parfait

Reputation: 107567

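Pass the whole root into the function instead of iterating over it: you need to reorder the <Genre> elements under the root, not the children inside each <Genre>. Also, change the key to sort by element text rather than a boolean expression. `child.tag == 'id'` evaluates to True or False, so the sort only moves each <id> after its siblings, which is why the ids shifted downward: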

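import xml.etree.ElementTree as ET

def sortchildrenby(parent, attr):
    # Reorder parent's direct children by the text of their <attr> child.
    # Note: .text is a string, so the comparison is lexicographic.
    parent[:] = sorted(parent, key=lambda child: child.find(attr).text)

tree = ET.parse("Input.xml")
root = tree.getroot()

sortchildrenby(root, "id")

ET.indent(tree, space="\t", level=0)   # PRETTY PRINT (ADDED Python 3.9)
tree.write("Output.xml")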

Output

<bookstore Location="New York">
    <Genre type="Comedy">
        <name>Comedic Comedy</name>
        <id>0</id>
        <pages>1</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrors</name>
        <id>1</id>
        <pages>450</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fictional Fiction</name>
        <id>2</id>
        <pages>500</pages>
    </Genre>
    <Genre type="Horror">
        <name>Horrendous Horror</name>
        <id>3</id>
        <pages>20</pages>
    </Genre>
    <Genre type="Fiction">
        <name>Fiction</name>
        <id>4</id>
        <pages>300</pages>
    </Genre>
</bookstore>
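
One caveat with sorting on `child.find(attr).text`: the key is a string, so ids are compared lexicographically and "10" would sort before "2". With the single-digit ids in the sample this makes no difference. Below is a minimal sketch of a numeric variant, assuming the ids are meant to be integers. The helper name `sort_children_by_int`, the inline sample data, and the choice to place children with a missing or non-numeric value last are illustrative assumptions, not part of the code above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with multi-digit ids (not the asker's file).
SAMPLE = """<bookstore Location="New York">
    <Genre type="Fiction"><name>Fiction</name><id>10</id></Genre>
    <Genre type="Horror"><name>Horrors</name><id>9</id></Genre>
    <Genre type="Comedy"><name>Comedic Comedy</name><id>2</id></Genre>
</bookstore>"""


def sort_children_by_int(parent, tag):
    """Reorder parent's direct children by the integer text of their <tag> child."""
    def key(child):
        try:
            return (0, int(child.findtext(tag)))
        except (TypeError, ValueError):
            # Assumption: a missing or non-numeric value sorts after all valid ones.
            return (1, 0)
    parent[:] = sorted(parent, key=key)


root = ET.fromstring(SAMPLE)

# Plain string comparison, as a key based on .text would do.
ids_as_text = sorted(g.findtext("id") for g in root)

sort_children_by_int(root, "id")
ids_numeric = [g.findtext("id") for g in root]

print(ids_as_text)   # ['10', '2', '9']
print(ids_numeric)   # ['2', '9', '10']
```

If the ids really are integers, a helper like this could be called in place of `sortchildrenby(root, "id")` before writing the tree out.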

Upvotes: 1
