Reputation: 1961
I want to grab the whole number out of this string: <some>some 344.3404.3 numbers<tag>.
Using the Pythex emulator website, this works with [\d\.]* (a digit or point repeated zero or more times).
In Python I get back the whole string:
Input:
import re
re.match(r'[\d\.]*', '<some>some 344.3404.3 numbers<tag>').string
Output:
'<some>some 344.3404.3 numbers<tag>'
What am I missing?
Running Python 3.3.5 on Windows 7, 64-bit.
Upvotes: 1
Views: 46
Reputation: 336118
The string attribute of a regex match object contains the input string of the match, not the matched content.
If you want the (first) matching part, you need to change three things:
1. Use re.search(), because re.match() will only find a match at the start of the string.
2. Call the group() method of the match object.
3. Use + instead of *, or you'll get an empty (zero-length) match unless the match happens to be at the start of the string.
Therefore, use
>>> re.search(r'[\d.]+', '<some>some 344.3404.3 numbers<tag>').group()
'344.3404.3'
or
>>> re.findall(r'[\d.]+', '<some>some 344.3404.3 numbers more 234.432<tag>')
['344.3404.3', '234.432']
if you expect more than one match.
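To see how each of these changes affects the result, here is a minimal sketch that runs the variants side by side on the string from the question. The variable names are only illustrative.

```python
import re

text = '<some>some 344.3404.3 numbers<tag>'

# re.match() only tries position 0; with '*' it succeeds there with an empty match.
star_match = re.match(r'[\d.]*', text)
print(repr(star_match.group()))   # ''
print(star_match.string == text)  # True: .string is the input, not the matched part

# re.search() scans the string, but '*' still accepts the empty match at position 0.
star_search = re.search(r'[\d.]*', text)
print(repr(star_search.group()))  # ''

# With '+', at least one digit or dot must be consumed, so the number is found.
plus_search = re.search(r'[\d.]+', text)
print(plus_search.group())        # 344.3404.3
```

Keep in mind that re.search() returns None when nothing matches, so calling .group() directly on the result raises an AttributeError in that case. Check the result first if the input might not contain a number.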
Upvotes: 2
Reputation: 89547
You can use this:
re.search(r'[\d.]+', '<some>some 344.3404.3 numbers<tag>').group()
Notes: Your pattern didn't work because [\d.]* will match the empty string at the first position. This is why I have replaced the quantifier with + and changed the method from match to search.
There is no need to escape the dot inside a character class, since it is seen by default as a literal character.
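A quick way to check this yourself is the small sketch below, which compares the escaped and unescaped forms. The sample string 'a.b' is made up for illustration.

```python
import re

text = '<some>some 344.3404.3 numbers<tag>'

# Inside a character class, '\.' and '.' both mean a literal dot.
escaped = re.findall(r'[\d\.]+', text)
unescaped = re.findall(r'[\d.]+', text)
print(escaped == unescaped)  # True

# Outside a character class, an unescaped dot matches any character except a newline.
inside_class = re.findall(r'[.]', 'a.b')
outside_class = re.findall(r'.', 'a.b')
print(inside_class)   # ['.']
print(outside_class)  # ['a', '.', 'b']
```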
Upvotes: 2