mdmac

Reputation: 51

regex \s and \S explanations

This isn't from any real code; I'm just trying to understand regex better.

import re
# Raw string so the backslash reaches the regex engine unchanged
test = re.findall(r'\s[0-9]+', 'hello 23and 4world ')
print(test)  # works as expected
# output: [' 23', ' 4']

But this:

import re
test = re.findall(r'\S[0-9]+', 'hello 23and 4world ')
print(test)

I expected the output to be [], since \S matches any non-whitespace character, but the output is ['23']. Any explanation would be helpful.

Upvotes: 1

Views: 83

Answers (1)

HamZa

Reputation: 14931

2 is a digit, but it is also a non-whitespace character. So \S matches the 2 and [0-9]+ matches the 3:

hello 23and 4world
       ^--[0-9]+
      ^---\S

This means 1234 would also be matched in hello 1234and 4world: \S takes the 1 and [0-9]+ takes the 234.
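
A minimal sketch to check both cases, assuming Python 3 (the question uses Python 2 print syntax). The variable name pattern is only for illustration:

```python
import re

# Raw string so the backslash reaches the regex engine unchanged.
pattern = r'\S[0-9]+'

# \S consumes the first digit; [0-9]+ consumes the digits that follow.
print(re.findall(pattern, 'hello 23and 4world '))    # ['23']
print(re.findall(pattern, 'hello 1234and 4world '))  # ['1234']
```

The lone 4 in 4world is never matched, because the pattern needs at least two characters: one for \S and at least one digit after it.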

One way to "debug" this quickly is to use groups and an online tester: (\S)([0-9]+).
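
If an online tester is not at hand, the same grouping idea can be sketched locally. This assumes Python 3, and show_groups is a hypothetical helper name, not something from the re module:

```python
import re

def show_groups(text):
    """Return (group 1, group 2) for every match of (\\S)([0-9]+) in text."""
    # Hypothetical helper for illustration: group 1 is what \S consumed,
    # group 2 is what [0-9]+ consumed.
    return [(m.group(1), m.group(2))
            for m in re.finditer(r'(\S)([0-9]+)', text)]

print(show_groups('hello 23and 4world '))    # [('2', '3')]
print(show_groups('hello 1234and 4world '))  # [('1', '234')]
```

Seeing the two groups side by side makes it clear that the first digit is consumed by \S rather than by the character class.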

Upvotes: 1
