Reputation: 16526
I'm just wondering how the second empty string ended up in the result. Could anyone explain what happens, step by step?
>>> import re
>>> re.split(r'\W*', '...words...')
['', '', 'w', 'o', 'r', 'd', 's', '', '']
If I'm not mistaken, the first empty string is explained by this sentence from the Python re module documentation:
If it matches at the start of the string, the result will start with an empty string. The same holds for the end of the string
Upvotes: 4
Views: 103
Reputation: 627082
See the regex demo at regex101, which shows where the matches occur. Now, recalling that re.split splits the string at each match (here most matches are empty strings, i.e. just positions in the string), you can easily see where the splits occur:
... is found and a split occurs => ['', 'words...']
w is found, so \W* matches the empty position in front of it => ['', '', 'words...']
o is found, so \W* matches the empty position in front of it => ['', '', 'w', 'ords...']
r is found, so \W* matches the empty position in front of it => ['', '', 'w', 'o', 'rds...']
d is found, so \W* matches the empty position in front of it => ['', '', 'w', 'o', 'r', 'ds...']
s is found, so \W* matches the empty position in front of it => ['', '', 'w', 'o', 'r', 'd', 's...']
... is found, so \W* matches it => ['', '', 'w', 'o', 'r', 'd', 's', '']
(note that the last '' is not just an empty string: it sits at the end-of-string position, which can still be matched) \W* matches this location as well => ['', '', 'w', 'o', 'r', 'd', 's', '', '']
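The steps above can be checked with a small sketch. It assumes Python 3.7 or later, where re.split also splits on empty matches. The helper names match_spans and split_by_hand are made up for this illustration and are not part of the re module.

```python
import re


def match_spans(pattern, text):
    """Return the (start, end) span of every match; empty matches have start == end."""
    return [m.span() for m in re.finditer(pattern, text)]


def split_by_hand(pattern, text):
    """Rebuild the split result: each piece is the text between two consecutive matches."""
    pieces, last = [], 0
    for m in re.finditer(pattern, text):
        pieces.append(text[last:m.start()])  # text before this match
        last = m.end()                       # continue after the match
    pieces.append(text[last:])               # whatever remains after the last match
    return pieces


text = '...words...'
print(match_spans(r'\W*', text))
# [(0, 3), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 11), (11, 11)]
print(split_by_hand(r'\W*', text))
# ['', '', 'w', 'o', 'r', 'd', 's', '', '']
```

The spans (3, 3) and (11, 11) are the empty matches that sit right after each run of dots, and they are what produce the second and the last empty strings. If the goal is simply to get the word out, using \W+ instead of \W* should avoid the empty matches altogether.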
Upvotes: 1