Daan Timmer

Reputation: 15057

'combining' results as a node entry using xpath

I am trying to solve a problem using XPath. Doing it in code (Python) is of course also possible, but I would prefer to do it in XPath if I can.

Say I've got an XML file that looks a bit like this:

<?xml version="1.0" encoding="UTF-8"?>
<dir>
    <results>
        <entrylist>
            <entry>
                <type>document</type>
                <name>a file name 1</name>
                <date>2012-01-01</date>
                <size>65421316516</size>
            </entry>
            <entry>
                <type>document</type>
                <name>a file name 2</name>
                <date>2012-01-02</date>
                <size>6542131</size>
            </entry>
            <entry>
                <type>document</type>
                <name>a file name 3</name>
                <date>2012-01-03</date>
                <size>654</size>
            </entry>
        </entrylist>
    </results>
</dir>

I cannot change the layout of the XML.
From this XML I need to extract the name and date of each entry. I would prefer them grouped together per entry, without the type/size, in the result returned by my XPath call.

So to sum it up, I need (want) output that looks a bit like this:

[0]
| - name: a file name 1
| - date: 2012-01-01

[1]
| - name: a file name 2
| - date: 2012-01-02
etc

Is this possible at all? Or am I stuck with just using an XML document parser in Python (etree from lxml)?

Upvotes: 0

Views: 110

Answers (2)

Michael Kay

Reputation: 163468

Consider using XQuery, which is a superset of XPath, and allows you to construct new XML documents containing structured information.

Upvotes: 0

Lev Levitsky

Reputation: 65811

I'm not sure this is what you would like, but:

In [1]: from lxml.etree import parse

In [2]: tree = parse('/tmp/test.xml')

In [3]: for entry in tree.xpath('/dir/results/entrylist/entry'):
   ...:     print(entry.xpath('name|date'))
   ...:
[<Element name at 0x2ce7d70>, <Element date at 0x2ce7dc0>]
[<Element name at 0x2ce7dc0>, <Element date at 0x2ce7c30>]
[<Element name at 0x2ce7c30>, <Element date at 0x2ce7d70>]

AFAIK, XPath is for selecting nodes, not combining them, so I don't think it can do all of the job for you.
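
If the grouping has to happen in Python anyway, here is a minimal sketch of how the per-entry loop could build the name/date records from the question. It uses the standard library's xml.etree.ElementTree (which supports a limited XPath subset) so that it runs without extra dependencies; the same loop should work with lxml by swapping findall()/findtext() for xpath() calls. The helper name extract_entries and its fields parameter are made up for illustration, and the sample XML is shortened to two entries.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question (two entries only).
XML = """<dir>
    <results>
        <entrylist>
            <entry>
                <type>document</type>
                <name>a file name 1</name>
                <date>2012-01-01</date>
                <size>65421316516</size>
            </entry>
            <entry>
                <type>document</type>
                <name>a file name 2</name>
                <date>2012-01-02</date>
                <size>6542131</size>
            </entry>
        </entrylist>
    </results>
</dir>"""


def extract_entries(xml_text, fields=("name", "date")):
    """Hypothetical helper: return one dict per <entry> with only `fields`."""
    root = ET.fromstring(xml_text)  # root is the <dir> element
    return [
        # findtext() returns the text of the first matching child, or None
        {field: entry.findtext(field) for field in fields}
        for entry in root.findall("results/entrylist/entry")
    ]


entries = extract_entries(XML)

# Print in roughly the layout asked for in the question.
for index, entry in enumerate(entries):
    print("[%d]" % index)
    for key, value in entry.items():
        print("| - %s: %s" % (key, value))
```

XPath only does the selecting here; the grouping into records is plain Python, which matches the point above that XPath alone will not combine nodes into new structures.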

Upvotes: 1
