Mag-Stellon

Reputation: 3

Regex, split and Python

Can anybody help me understand what these lines are doing?

import re

VAR_TOKEN_START = '{{'
VAR_TOKEN_END = '}}'
BLOCK_TOKEN_START = '{%'
BLOCK_TOKEN_END = '%}'
TOK_REGEX = re.compile(r"(%s.*?%s|%s.*?%s)" % (
    VAR_TOKEN_START,
    VAR_TOKEN_END,
    BLOCK_TOKEN_START,
    BLOCK_TOKEN_END
))

TOK_REGEX.split('{% each vars %}<i>{{it}}</i>{% endeach %}')

I don't understand the % operator applied to the regex string, or why we call split on the compiled TOK_REGEX pattern.

Upvotes: 0

Views: 73

Answers (1)

user2357112

Reputation: 282128

This part:

TOK_REGEX = re.compile(r"(%s.*?%s|%s.*?%s)" % (
    VAR_TOKEN_START,
    VAR_TOKEN_END,
    BLOCK_TOKEN_START,
    BLOCK_TOKEN_END
))

uses string formatting to build the regex from named pieces instead of writing it out as one jumble of characters. The % operator replaces each %s in the format string, in order, with the corresponding string from the tuple that follows, so the resulting pattern is ({{.*?}}|{%.*?%}). This lets the author of the code give meaningful names to the {{, }}, {%, and %} parts of the regex.
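To see the substitution on its own, here is a minimal sketch that builds the pattern string without compiling it. The variable name pattern is mine, not from the original code; everything else is copied from the question.

```python
VAR_TOKEN_START = '{{'
VAR_TOKEN_END = '}}'
BLOCK_TOKEN_START = '{%'
BLOCK_TOKEN_END = '%}'

# Each %s is filled, left to right, with the matching tuple element.
# "pattern" is an illustrative name, not part of the original code.
pattern = r"(%s.*?%s|%s.*?%s)" % (
    VAR_TOKEN_START,
    VAR_TOKEN_END,
    BLOCK_TOKEN_START,
    BLOCK_TOKEN_END,
)

print(pattern)  # ({{.*?}}|{%.*?%})
```

The % characters inside '{%' and '%}' cause no trouble here, because only the format string on the left of the operator is scanned for %s placeholders; the substituted values are inserted as-is.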

The split call:

TOK_REGEX.split('{% each vars %}<i>{{it}}</i>{% endeach %}')

is equivalent to calling the re.split function with the compiled pattern. It finds every occurrence of text matching the regex in the argument string and returns a list of the parts between those matches. Because the whole regex is wrapped in a capturing group (the parentheses in the regex string), the matched tokens themselves are also included in the list.
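A quick way to see the effect of the capturing group is to run the split with and without the parentheses. This is only a sketch: the names template, with_group, without_group and fragments are mine, and the pattern is written out literally rather than built with %.

```python
import re

template = '{% each vars %}<i>{{it}}</i>{% endeach %}'

# Same pattern as TOK_REGEX in the question: one capturing group around it.
with_group = re.compile(r"({{.*?}}|{%.*?%})").split(template)

# Same pattern without the parentheses: the matched tokens are dropped.
without_group = re.compile(r"{{.*?}}|{%.*?%}").split(template)

print(with_group)
# ['', '{% each vars %}', '<i>', '{{it}}', '</i>', '{% endeach %}', '']

print(without_group)
# ['', '<i>', '</i>', '']

# The empty strings come from matches at the very start and end of the
# input; they can be filtered out if they are not wanted.
fragments = [piece for piece in with_group if piece]
print(fragments)
# ['{% each vars %}', '<i>', '{{it}}', '</i>', '{% endeach %}']
```

So the split is presumably being used as a cheap tokenizer: it keeps both the template tags and the plain text between them, in their original order.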

Upvotes: 1
