Reputation: 531
I am using the regular expression below (with pyparsing), and it doesn't produce any output. Any idea what I am doing wrong?
>>> import pyparsing as pp
>>> pat = pp.Regex(r'\s+\w+')
>>> x = " ***    abc   xyz   pqr"
>>> for result, start, end in pat.scanString(x):
...     print(result, start, end)
...
If the \s+ is removed, we get the data:
>>> pat = pp.Regex(r'\w+')
>>> x = " ***    abc   xyz   pqr"
>>> for result, start, end in pat.scanString(x):
...     print(result, start, end)
...
['abc'] 8 11
['xyz'] 14 17
['pqr'] 20 23
Upvotes: 1
Views: 126
Reputation: 5006
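According to the pyparsing documentation, whitespace is skipped by default:
"During the matching process, whitespace between tokens is skipped by default (although this can be changed)."
Because the leading whitespace is consumed before the regex is tried, the \s+ at the start of your pattern never has anything left to match.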
However, the Regex class inherits from ParserElement, which has a leaveWhitespace() method:
leaveWhitespace(self)
"Disables the skipping of whitespace before matching the characters in the ParserElement's defined pattern. This is normally only used internally by the pyparsing module, but may be needed in some whitespace-sensitive grammars."
So this code works:
>>> pat = pp.Regex(r'\s+\w+')
>>> pat = pat.leaveWhitespace()  # returns pat itself, with whitespace skipping disabled
>>> x = " ***    abc   xyz   pqr"
>>> for result, start, end in pat.scanString(x):
...     print(result, start, end)
...
['    abc'] 4 11
['   xyz'] 11 17
['   pqr'] 17 23
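For anyone who wants to check both behaviours side by side, here is a minimal script version of the above. The helper name scan and its keep_whitespace flag are made up for illustration, and the spacing in the sample string is an assumption, reconstructed so that the offsets line up with the ones reported in the question.

```python
import pyparsing as pp

# Sample input; the run lengths of the spaces are assumed so that the
# offsets match those shown in the question (abc at 8, xyz at 14, pqr at 20).
text = " ***    abc   xyz   pqr"


def scan(pattern, keep_whitespace):
    """Return (matched_text, start, end) for every match of pattern in text.

    Hypothetical helper for this demo, not part of pyparsing.
    """
    expr = pp.Regex(pattern)
    if keep_whitespace:
        # Stop pyparsing from skipping leading whitespace before each match attempt.
        expr.leaveWhitespace()
    return [(tokens[0], start, end) for tokens, start, end in expr.scanString(text)]


# Default behaviour: whitespace is skipped before the regex runs,
# so the leading \s+ can never match.
skipped = scan(r"\s+\w+", keep_whitespace=False)

# With leaveWhitespace(): the regex sees the whitespace itself.
kept = scan(r"\s+\w+", keep_whitespace=True)

print(skipped)
print(kept)
```

If the whitespace is not actually needed in the result, keeping the default skipping and matching just \w+ (as in the question's second snippet) is probably the simpler choice.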
Upvotes: 3