Reputation: 1012
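I am trying to capture all numbers in the following format: a group of digits, then `.` or `,`, then a second group of digits, optionally followed by `,` and a third group of digits.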
I have the following regex: (?<!\d)[\d]{1,5}(?!\d)[.,][\d]{2,3}[,]*[\d]*
and it should match:
7,93
8.32
20,43
100.23
2.800
1.597,72
2.026,88
33.000
33.000,43
100.000
150,000
150.000,50
and it should not match:
7.3.2011
07.03.2011
3.2011
I have tested my regex with the following example string:
7.3.2011 zwischen 7,93 und 10,53 EUR Dienstbeginn: 07.03.2011
or in code:
import re
string = '7.3.2011 zwischen 7,93 und 10,53 EUR Dienstbeginn: 07.03.2011'
salary = r"(?<!\d)[\d]{1,5}(?!\d)[.,][\d]{2,3}[,]*[\d]*"
print(re.findall(salary, string))
Unfortunately it matched `3.2011` and `07.03`. I don't understand why it matched `3.2011`. I specified that after the first `.` it should match 2-3 digits, but it matched 4. It shouldn't match `07.03` either, because `07.03.2011` is in a format that I don't want to match.
Can you explain what I did wrong and correct my mistake?
Upvotes: 0
Views: 39
Reputation: 163217
You can assert that there is no digit or dot directly to the left and right of the match, and optionally match a comma followed by 1 or more digits.
Note that `\d` by itself does not have to be between square brackets: `[\d]*` can be written as `\d*`.
(?<![\d.])\d{1,5}[.,]\d{2,3}(?:,\d+)?(?![\d.])
Explanation
- `(?<![\d.])` Assert that there is no digit or `.` directly to the left
- `\d{1,5}` Match 1-5 digits
- `[.,]\d{2,3}` Match either `.` or `,` and 2-3 digits
- `(?:,\d+)?` Optionally match `,` and 1+ digits
- `(?![\d.])` Assert that there is no digit or `.` directly to the right

See a regex demo.
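As a quick check, the following Python sketch runs both the question's pattern and the pattern above against the example string and the sample values from the question. The variable names are illustrative only. It also suggests why the original pattern returned `3.2011`: after `[\d]{2,3}` consumes `201`, the trailing `[\d]*` is free to consume the remaining `1`.

```python
import re

# Pattern from the question and the corrected pattern from the answer.
original = re.compile(r"(?<!\d)[\d]{1,5}(?!\d)[.,][\d]{2,3}[,]*[\d]*")
corrected = re.compile(r"(?<![\d.])\d{1,5}[.,]\d{2,3}(?:,\d+)?(?![\d.])")

text = "7.3.2011 zwischen 7,93 und 10,53 EUR Dienstbeginn: 07.03.2011"

# Sample values copied from the question.
should_match = [
    "7,93", "8.32", "20,43", "100.23", "2.800", "1.597,72",
    "2.026,88", "33.000", "33.000,43", "100.000", "150,000", "150.000,50",
]
should_not_match = ["7.3.2011", "07.03.2011", "3.2011"]

# All non-overlapping matches of each pattern in the example string.
original_hits = original.findall(text)
corrected_hits = corrected.findall(text)
print(original_hits)   # ['3.2011', '7,93', '10,53', '07.03']
print(corrected_hits)  # ['7,93', '10,53']

# Each wanted value should be matched in full; each unwanted one not at all.
matched_fully = [s for s in should_match if corrected.fullmatch(s)]
rejected = [s for s in should_not_match if corrected.search(s) is None]
print(matched_fully == should_match)   # True
print(rejected == should_not_match)    # True
```

The `.` inside both lookarounds is what rejects the date fragments: `03` in `07.03.2011` is followed by a dot, and `2011` is preceded by one.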
Upvotes: 1