actionshrimp

Reputation: 5229

Pythonic way to replace text with xml nodes

I'm wondering if anyone can come up with a more 'pythonic' solution to the problem I'm currently trying to solve.

I've got a source XML file that I'm writing an XSLT generator for. The relevant part of the source XML looks like this:

...
<Notes>
    <Note>
        <Code>ABC123</Code>
        <Text>Note text contents</Text>
        ...
    </Note>
    <Note>
        ...
    </Note>
    ...
</Notes>
...

And I have some objects analogous to these:

from lxml.builder import ElementMaker

#This element maker has the target output namespace
TRGT = ElementMaker(namespace="targetnamespace")
XSL = ElementMaker(namespace='http://www.w3.org/1999/XSL/Transform',
                   nsmap={'xsl':'http://www.w3.org/1999/XSL/Transform'})

#This is the relevant part of the 'generator output spec'
details = {'xpath': '//Notes/Note', 'node': 'Out', 'text': '{Code} - {Text}'}

The aim is to generate the following snippet of XSLT from the 'details' object:

<xsl:for-each select="//Notes/Note">
    <Out><xsl:value-of select="Code"/> - <xsl:value-of select="Text"/></Out>
</xsl:for-each>

The part I'm having difficulty doing nicely is replacing the {placeholder} text with XML nodes. I initially tried doing this:

import re
text = re.sub(r'\{([^}]*)\}', r'<xsl:value-of select="\1"/>', details['text'])
XSL('for-each',
    TRGT(details['node'], text),
    select=details['xpath'])

but this escapes the angle bracket characters (and even if it had worked, if I'm being fussy it means my nicely namespaced ElementMakers are bypassed which I don't like):

<xsl:for-each select="//Notes/Note">
    <Out>&lt;xsl:value-of select="Code"/&gt; - &lt;xsl:value-of select="Text"/&gt;</Out>
</xsl:for-each>

Currently I have this, but it doesn't feel very nice:

note = details['text']
start = 0
note_nodes = []

for match in re.finditer(r'\{([^}]*)\}', note):
    # Literal text between the previous placeholder and this one
    text_up_to = note[start:match.start()]
    match_node = XSL('value-of', select=match.group(1))
    start = match.end()

    note_nodes.append(text_up_to)
    note_nodes.append(match_node)

# Literal text after the last placeholder
text_after = note[start:]
note_nodes.append(text_after)

XSL('for-each',
    TRGT(details['node'], *note_nodes),
    select=details['xpath'])

Is there a nicer way (for example to split a regex into a list, then apply a function to the elements which were matches) or am I just being overly fussy?

Thanks!

Upvotes: 0

Views: 191

Answers (1)

unutbu

Reputation: 879611

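# re.split with a capturing group keeps the captured names in the result,
# so literal text sits at even indices and placeholder names at odd indices.
note_nodes = re.split(r'\{(.*?)\}', details['text'])
# ['', 'Code', ' - ', 'Text', '']
note_nodes = [n if i % 2 == 0 else XSL('value-of', select=n)
              for i, n in enumerate(note_nodes)]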
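
If it helps, the same idea can be wrapped in a small helper. This is only a sketch: `expand_placeholders` and `make_node` are names made up for illustration, and the tuple-building lambda is a stand-in so the snippet runs without lxml. It also drops the empty strings that `re.split` produces at the edges.

```python
import re

PLACEHOLDER = re.compile(r'\{(.*?)\}')


def expand_placeholders(template, make_node):
    """Split template on {name} placeholders and build a node for each name.

    make_node is a hypothetical callback that turns a placeholder name
    into whatever node object the caller wants.
    """
    # With one capturing group, re.split alternates literal text (even
    # indices) and captured placeholder names (odd indices).
    parts = PLACEHOLDER.split(template)
    return [make_node(part) if i % 2 else part
            for i, part in enumerate(parts)
            if i % 2 or part]  # keep every placeholder, drop empty literals


# Stand-in factory (a plain tuple) so the sketch runs without lxml
children = expand_placeholders('{Code} - {Text}',
                               lambda name: ('value-of', name))
print(children)
# [('value-of', 'Code'), ' - ', ('value-of', 'Text')]
```

With the ElementMakers from the question, the callback would presumably be `lambda name: XSL('value-of', select=name)`, and the result could be passed on as `TRGT(details['node'], *children)`.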

Upvotes: 1
