Diljit PR

Reputation: 311

Difference between * and + in Python regex?

I am quite new to Python and am practicing a few regular expressions from Google's Python exercises. A section there says:

Repetition

Things get more interesting when you use + and * to specify repetition in the pattern

+ -- 1 or more occurrences of the pattern to its left, e.g. i+ = one or more i's

* -- 0 or more occurrences of the pattern to its left

So I decided to try out some samples.

import re
text = "xx1  2   3xx"
match = re.search(r'\d\s+\d\s+\d', text)
print(match.group())

This yields the following output:

1  2   3

To understand the difference between * and +, I then tried this:

import re
text = "xx1  2   3xx"
match = re.search(r'\d\s*\d\s*\d', text)
print(match.group())

which yielded the output:

1  2   3

(the same as output 1)

I still wonder whether * and + really do the same thing. If not, examples would be much appreciated!

Please correct me if I'm wrong somewhere.

Upvotes: 2

Views: 7265

Answers (3)

falsetru

Reputation: 369044

* matches 0 or more repetitions of the preceding RE, while + matches 1 or more repetitions of the preceding RE.

>>> import re
>>> re.findall(r'a\s*b', 'ab')
['ab']
>>> re.findall(r'a\s+b', 'ab')
[]

>>> re.findall(r'a\s*b', 'a b')
['a b']
>>> re.findall(r'a\s+b', 'a b')
['a b']

Upvotes: 4

Zaheer Ahmed

Reputation: 28528

Here is a nice article:

  • * matches 0 or more

    Causes the resulting RE to match 0 or more repetitions of the preceding RE, as many repetitions as are possible. ab* will match ‘a’, ‘ab’, or ‘a’ followed by any number of ‘b’s.

  • + matches 1 or more

    Causes the resulting RE to match 1 or more repetitions of the preceding RE. ab+ will match ‘a’ followed by any non-zero number of ‘b’s; it will not match just ‘a’.

See the Python docs for reference.
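To check the two descriptions above, here is a small sketch. The sample strings are made up for illustration, and re.fullmatch is just my choice so that the whole string has to match; the quoted docs do not prescribe any particular function.

```python
import re

# Hypothetical sample strings, chosen only to illustrate the quantifiers
samples = ["a", "ab", "abbb"]

# ab* : 'a' followed by zero or more 'b's
star_matches = [s for s in samples if re.fullmatch(r"ab*", s)]

# ab+ : 'a' followed by one or more 'b's, so a bare 'a' is rejected
plus_matches = [s for s in samples if re.fullmatch(r"ab+", s)]

print(star_matches)  # ['a', 'ab', 'abbb']
print(plus_matches)  # ['ab', 'abbb']
```

The only string the two patterns disagree on is the bare 'a', which is the zero-repetition case.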

Upvotes: 0

BrenBarn

Reputation: 251363

As it says, * matches zero or more and + matches one or more. But you only tested them on a case where there is at least one space between each pair of digits, which means both patterns match. Compare how your regexes behave on "xx12 3xx", where there are zero spaces between the first two digits.
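A minimal sketch of that comparison, reusing the two patterns from the question (the variable names star and plus are just illustrative):

```python
import re

text = "xx12 3xx"  # no whitespace between '1' and '2'

# \s* allows zero whitespace characters, so "12 3" matches
star = re.search(r"\d\s*\d\s*\d", text)

# \s+ requires at least one whitespace character after each of the
# first two digits, so nothing in this string matches
plus = re.search(r"\d\s+\d\s+\d", text)

print(star.group() if star else None)  # 12 3
print(plus)                            # None
```

Note that re.search returns None when there is no match, so calling .group() on the + result directly would raise an AttributeError.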

Upvotes: 5
