Reputation: 109
I am trying to extract a price with Python, but my regex returns nothing.
The sentence has the form "Word1 WordA WordB WordC ... WordX : Price €". We know Word1, but we know neither WordA to WordX nor the Price (4 digits, with ",", "." or nothing between the 1st and 2nd digits).
We need to get the Price number just before the "€" that follows Word1.
I've created this:
regex = "(Word1) ([a-zA-Z])+ ( :)? ([0-9]{0,4})+ €"
Which matches:
Word1 zerdezd : 1243 €
Word1 zerdezd 1243 €
But not:
Word1 zerdezd ezrozeu : 1243 €
And this doesn't work either:
(Charges) (([a-zA-Z])+ )+( :){0,1} ([0-9]{0,4})+ €
Upvotes: 0
Views: 189
Reputation: 626747
You can use
Word1\s.*?(\d+(?:[,.]\d+)?)\s*(?:€|euro)
See the regex demo.
In Python:
import re
rx = r'Word1\s.*?(\d+(?:[,.]\d+)?)\s*(?:€|euro)'
m = re.search(rx, text)  # text is the input string
if m:
    print(m.group(1))  # prints the price
# or, to get every price in the text
print(re.findall(rx, text))
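To check the pattern against the sentences from the question, here is a small self-contained sketch. The first three samples come from the question; the fourth one (with "," and "euro") and the helper name first_price are made up for illustration.

```python
import re

# Pattern from this answer: lazily skip everything after Word1 until a
# number followed by optional whitespace and "€" or "euro" is found.
rx = re.compile(r'Word1\s.*?(\d+(?:[,.]\d+)?)\s*(?:€|euro)')

samples = [
    "Word1 zerdezd : 1243 €",
    "Word1 zerdezd 1243 €",
    "Word1 zerdezd ezrozeu : 1243 €",
    "Word1 zerdezd : 1,243 euro",  # hypothetical extra sample
]

def first_price(sentence):
    """Return the captured price string, or None if nothing matches."""
    m = rx.search(sentence)
    return m.group(1) if m else None

prices = [first_price(s) for s in samples]
print(prices)
```

Because .*? is lazy, it consumes any number of unknown words (and the optional " :") before the price, which is what the original attempt could not do.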
Upvotes: 2
Reputation: 163277
You could match Word1 followed by as few characters as possible, until you can capture the amount in group 1, which is followed by a euro sign.
\bWord1 .*?\b([0-9]{1,4}(?:[.,]\d+)?) €
The pattern matches:
\bWord1 .*?
Match Word1 followed by a space and as few characters as possible
\b(
Word boundary, start of group 1
[0-9]{1,4}(?:[.,]\d+)?
Match 1-4 digits with an optional decimal part
)
Close group 1
 €
Match a space and € literally (or use \s*€ if there can be 0 or more whitespace chars)
Example
import re
regex = r"\bWord1 .*?\b([0-9]{1,4}(?:[.,]\d+)?) €"
s = ("Word1 zerdezd : 1243 €\n"
"Word1 zerdezd 1243 €\n"
"Word1 zerdezd ezrozeu : 1243 €")
print(re.findall(regex, s))
Output
['1243', '1243', '1243']
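If the price really is always 4 digits with an optional "," or "." after the first digit, as the question describes, a stricter variant might look like the sketch below. The tightened pattern, the conversion to int and the name extract_price are my assumptions, not part of the pattern above.

```python
import re

# Assumption: exactly 4 digits, optionally separated by "," or "."
# after the first digit, followed by optional whitespace and "€".
PRICE_RX = re.compile(r"\bWord1 .*?\b(\d[.,]?\d{3})\s*€")

def extract_price(sentence):
    """Return the price as an int, or None if the sentence does not match."""
    m = PRICE_RX.search(sentence)
    if m is None:
        return None
    # Drop the optional separator so that "1.243" and "1,243" become 1243.
    return int(re.sub(r"[.,]", "", m.group(1)))

results = [
    extract_price("Word1 zerdezd : 1243 €"),
    extract_price("Word1 zerdezd 1.243 €"),
    extract_price("Word1 zerdezd ezrozeu : 1,243 €"),
    extract_price("Other zerdezd : 1243 €"),  # does not start with Word1
]
print(results)
```

Note that this treats the separator as a thousands mark, so it would not fit prices with a decimal part.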
Upvotes: 2