Reputation: 2317
I want to match a floating number that might be in the form of 0.1234567 or 1.23e-5 Here is my python code:
import re
def main():
m2 = re.findall(r'\d{1,4}:[-+]?\d+\.\d+(e-\d+)?', '1:0.00003 3:0.123456 8:-0.12345')
for svs_elem in m2:
print svs_elem
main()
It prints blank... Based on my test, the problem was in (e-\d+)? part.
Upvotes: 1
Views: 8619
Reputation: 530970
Use a non-capturing group. The matches are succeeding, but the output is the contents of the optional groups that don't actually match.
See the output when your input includes something like e-6
:
>>> re.findall(r'\d{1,4}:[-+]?\d+\.\d+(e-\d+)?', '1:0.00003 3:0.123456 8:-0.12345e-6')
['', '', 'e-6']
With a non-capturing group ((?:...)
):
>>> re.findall(r'\d{1,4}:[-+]?\d+\.\d+(?:e-\d+)?', '1:0.00003 3:0.123456 8:-0.12345e-6')
['1:0.00003', '3:0.123456', '8:-0.12345e-6']
Here's are some simpler examples to demonstrate how capturing groups work and how they influence the output of findall
. First, no groups:
>>> re.findall("a[bc]", "ab")
["ab"]
Here, the string "ab" matched the regex, so we print everything the regex matched.
>>> re.findall("a([bc])", "ab")
["b"]
This time, we put the [bc]
inside a capturing group, so even though the entire string is still matched by the regex, findall
only includes the part inside the capturing group in its output.
>>> re.findall("a(?:[bc])", "ab")
["ab"]
Now, by converting the capturing group to a non-capturing group, findall
again uses the match of the entire regex in its output.
>>> re.findall("a([bc])?", "a")
['']
>>> re.findall("a(?:[bc])?", "a")
['a']
In both of these final case, the regular expression as a whole matches, so the return value is a non-empty list. In the first one, the capturing group itself doesn't match any text, though, so the empty string is part of the output. In the second, we don't have a capturing group, so the match of the entire regex is used for the output.
Upvotes: 3
Reputation: 224886
See emphasis:
Help on function findall in module re: findall(pattern, string, flags=0) Return a list of all non-overlapping matches in the string. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
You have a group, so it’s returned instead of the entire match, but it doesn’t match in any of your cases. Make it non-capturing with (?:e-\d+)
:
m2 = re.findall(r'\d{1,4}:[-+]?\d+\.\d+(?:e-\d+)?', '1:0.00003 3:0.123456 8:-0.12345')
Upvotes: 4