taki

Reputation: 109

Regex to match price in a sentence

I tried to capture a price with Python, but my regex returns nothing.

The sentence has the form "Word1 WordA WordB WordC ... WordX : Price €". We know Word1, but we know neither WordA through WordX nor the Price (4 digits, with ",", "." or nothing between the 1st and 2nd digits).

We need to get the Price number just before the "€" that follows Word1.

I've created this:

regex = "(Word1) ([a-zA-Z])+ ( :)? ([0-9]{0,4})+ €"

It matches:

Word1 zerdezd : 1243 €

Word1 zerdezd 1243 €

But not:

Word1 zerdezd ezrozeu : 1243 €

And this doesn't work either:

(Charges) (([a-zA-Z])+ )+( :){0,1} ([0-9]{0,4})+ €

Upvotes: 0

Views: 189

Answers (2)

Wiktor Stribiżew

Reputation: 626747

You can use

Word1\s.*?(\d+(?:[,.]\d+)?)\s*(?:€|euro)

See the regex demo.

In Python:

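import re

text = "Word1 zerdezd ezrozeu : 1243 €"  # sample input taken from the question
rx = r'Word1\s.*?(\d+(?:[,.]\d+)?)\s*(?:€|euro)'
m = re.search(rx, text)
if m:
    print(m.group(1))  # prints the captured price (group 1)

# or, to collect the captured price from every match
print(re.findall(rx, text))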
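
As a rough sketch of how the captured text could be turned into a number: the helper name `extract_price` and the sample sentences below are illustrative, not part of the answer. It assumes, as the question describes, that a "," or "." in the price is a thousands separator rather than a decimal point.

```python
import re

# Pattern from the answer above.
rx = re.compile(r'Word1\s.*?(\d+(?:[,.]\d+)?)\s*(?:€|euro)')

# Illustrative inputs modelled on the question's sentences.
samples = [
    "Word1 zerdezd : 1243 €",
    "Word1 zerdezd ezrozeu : 1,243 €",
    "Word1 zerdezd 1.243 euro",
]

def extract_price(sentence):
    """Hypothetical helper: return the price as an int, or None if absent."""
    m = rx.search(sentence)
    if m is None:
        return None
    # Assumption: "," and "." are thousands separators, so drop them.
    return int(m.group(1).replace(",", "").replace(".", ""))

prices = [extract_price(s) for s in samples]
print(prices)
```

If the separator can also be a decimal point in your data, the `replace` step would need to change accordingly.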

Upvotes: 2

The fourth bird

Reputation: 163277

You could match Word1 followed by as few characters as possible, until you can capture the amount in group 1, which must be followed by a euro sign.

\bWord1 .*?\b([0-9]{1,4}(?:[.,]\d+)?) €

The pattern matches:

  • \bWord1 .*? Match Word1 followed by a space and as few chars as possible
  • \b( Word boundary, start group 1
    • [0-9]{1,4}(?:[.,]\d+)? Match 1-4 digits with an optional decimal part
  • ) Close group 1
  •  € Match a space and the euro sign literally (or use \s*€ if there can be 0 or more whitespace chars)
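
The breakdown above can also be written directly into the pattern with `re.VERBOSE`, which might make it easier to maintain. This is only a sketch: the name `PRICE_RE` and the sample string are illustrative, and it uses the `\s*€` variant mentioned in the last bullet. In verbose mode unescaped whitespace is ignored, so the literal space after Word1 is written as `[ ]`.

```python
import re

PRICE_RE = re.compile(r"""
    \bWord1[ ]        # the known leading word, then one literal space
    .*?               # as few characters as possible
    \b(               # word boundary, start of group 1
      [0-9]{1,4}      # one to four digits
      (?:[.,]\d+)?    # optional "." or "," followed by more digits
    )                 # end of group 1
    \s*€              # optional whitespace, then the euro sign
""", re.VERBOSE)

# Illustrative input: the three sentences from the question.
s = ("Word1 zerdezd : 1243 €\n"
     "Word1 zerdezd 1243 €\n"
     "Word1 zerdezd ezrozeu : 1243 €")

matches = PRICE_RE.findall(s)
print(matches)
```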

Regex demo

Example

import re
 
regex = r"\bWord1 .*?\b([0-9]{1,4}(?:[.,]\d+)?) €"
 
s = ("Word1 zerdezd : 1243 €\n"
    "Word1 zerdezd 1243 €\n"
    "Word1 zerdezd ezrozeu : 1243 €")
print(re.findall(regex, s))

Output

['1243', '1243', '1243']

Python demo

Upvotes: 2
