ValueError

Reputation: 125

add own text inside nested braces + exception

The original question is located here; this question is about avoiding one remaining problem.

I have this code, which works perfectly with the html_1 data:

from pyparsing import nestedExpr, originalTextFor

html_1 = '''
<html>
<head>
<title><?php echo "title here"; ?></title>
<head>
    <body>
        <h1 <?php echo "class='big'" ?>>foo</h1>
    </body>
</html>
'''

html_2 = '''
<html>
<head>
<title><?php echo "title here"; ?></title>
<head>
    <body>
        <h1 <?php echo $tpl->showStyle(); ?>>foo</h1>
    </body>
</html>
'''

nested_angle_braces = nestedExpr('<', '>')

# for match in nested_angle_braces.searchString(html):
#     print(match)

# nested_angle_braces_with_h1 = nested_angle_braces().addCondition(
#                                             lambda tokens: tokens[0][0].lower() == 'h1')

nested_angle_braces_with_h1 = originalTextFor(
    nested_angle_braces().addCondition(lambda tokens: tokens[0][0].lower() == 'h1')
    )
nested_angle_braces_with_h1.addParseAction(lambda tokens: tokens[0] + 'MY_TEXT')

print(nested_angle_braces_with_h1.transformString(html_1))

The result for html_1 is:

<html>
<head>
<title><?php echo "title here"; ?></title>
<head>
    <body>
        <h1 <?php echo "class='big'" ?>>MY_TEXTfoo</h1>
    </body>
</html>

Everything is fine here: MY_TEXT is placed in the right region (inside the h1 element, right after its opening tag).

But look at the result for html_2:

<html>
<head>
<title><?php echo "title here"; ?></title>
<head>
    <body>
        <h1 <?php echo $tpl->showStyle(); ?>MY_TEXT>foo</h1>
    </body>
</html>

Now the result is wrong: MY_TEXT is placed inside the attribute area of the h1 tag, because the PHP code contains a closing angle bracket in "$tpl->".

How can I fix this? I need this result in that region:

<h1 <?php echo $tpl->showStyle(); ?>>MY_TEXTfoo</h1>

Upvotes: 1

Views: 56

Answers (1)

PaulMcG

Reputation: 63729

The solution requires a special expression for PHP tags, since the simple nestedExpr gets confused by the '>' inside them.

from pyparsing import Literal, SkipTo, Word, printables, nestedExpr, originalTextFor

# define an expression for a PHP tag
php_tag = Literal('<?') + 'php' + SkipTo('?>', include=True)

We'll need more than simple strings now for the opener and closer, including a negative lookahead when matching a '<' to make sure we aren't at the leading edge of a PHP tag:

# define expressions for opener and closer, such that we don't
# accidentally interpret a PHP tag as a nested expr
opener = ~php_tag + Literal("<")
closer = Literal(">")
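
To see what the negative lookahead does in isolation, here is a small sketch. The helper starts_nested_expr is a hypothetical name introduced only for this illustration; it reports whether opener matches at the start of a string.

```python
from pyparsing import Literal, SkipTo, ParseException

php_tag = Literal('<?') + 'php' + SkipTo('?>', include=True)
opener = ~php_tag + Literal('<')


def starts_nested_expr(text):
    """Hypothetical helper: True if `opener` matches at the start of `text`."""
    try:
        opener.parseString(text)  # parseAll defaults to False, so a prefix match is enough
        return True
    except ParseException:
        return False


# an ordinary HTML tag: the lookahead passes and '<' is accepted
print(starts_nested_expr('<h1>'))
# a PHP tag: the lookahead rejects the '<', so no nested expression starts here
print(starts_nested_expr('<?php echo $tpl->showStyle(); ?>'))
```

The first call should print True and the second False, which is why the PHP tag is left to be picked up by the content expression instead.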

If opener and closer aren't simple strings, then we need to give a content expression too. Our content will be very simple to define, just PHP tags or other Words of printables, excluding '<' and '>' (you'll end up wrapping this all back up in originalTextFor anyway):

# define nested_angle_braces to potentially contain PHP tag, or 
# some other printable (not including '<' or '>' chars)
nested_angle_braces = nestedExpr(opener, closer, 
                                 content=php_tag | Word(printables, excludeChars="<>"))

Now if I use nested_angle_braces.searchString to scan html_2, I get:

for tag in originalTextFor(nested_angle_braces).searchString(html_2):
    print(tag)

['<html>']
['<head>']
['<title>']
['</title>']
['<head>']
['<body>']
['<h1 <?php echo $tpl->showStyle(); ?>>']
['</h1>']
['</body>']
['</html>']
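
Putting the pieces together with the transform from the question, a complete sketch might look like the following. It assumes the condition and parse action from the question are reused unchanged; the name h1_open_tag is only an illustrative rename of nested_angle_braces_with_h1.

```python
from pyparsing import (Literal, SkipTo, Word, printables,
                       nestedExpr, originalTextFor)

html_2 = '''
<html>
<head>
<title><?php echo "title here"; ?></title>
<head>
    <body>
        <h1 <?php echo $tpl->showStyle(); ?>>foo</h1>
    </body>
</html>
'''

# a PHP tag is consumed as one unit, so a '>' inside it cannot close the HTML tag
php_tag = Literal('<?') + 'php' + SkipTo('?>', include=True)

# '<' opens a nested expression only when it does not start a PHP tag
opener = ~php_tag + Literal('<')
closer = Literal('>')

nested_angle_braces = nestedExpr(
    opener, closer,
    content=php_tag | Word(printables, excludeChars='<>'))

# same condition and parse action as in the question
h1_open_tag = originalTextFor(
    nested_angle_braces().addCondition(
        lambda tokens: tokens[0][0].lower() == 'h1'))
h1_open_tag.addParseAction(lambda tokens: tokens[0] + 'MY_TEXT')

result = h1_open_tag.transformString(html_2)
print(result)
```

With this grammar, the h1 line in the output should read <h1 <?php echo $tpl->showStyle(); ?>>MY_TEXTfoo</h1>, and the same expression should still handle html_1.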

Upvotes: 1
