Regex for price doesn't work

Question

I need a regex which matches any number followed by a string which consists of digits, spaces, dots and commas followed by "Kč" or "Eur".

The problem is that my regex sometimes doesn't find all such strings.

((\d[., \d]+)(Kč|Eur))

For example:

re.findall("""((\d[., \d]+)(Kč|Eur))""","Letenky od 12 932 Kč",flags=re.IGNORECASE)

returns nothing instead of [(12 932 Kč,12 932,Kč)]

Do you know what is wrong with the regex?

Wiktor Stribiżew · Accepted Answer

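Your input string contains a č made of two code points, a base letter c followed by a combining diacritic (\u030C), while the regex contains the precomposed letter with Unicode code point \u010D. The two look identical but do not compare equal, so the pattern fails to match.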
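One way to check this diagnosis is to look at the code points of both spellings. The sketch below assumes the input uses the decomposed form (c + \u030C), as described above; the variable names are only illustrative.

```python
import re
import unicodedata

# Assumption: the input uses the decomposed spelling, c + COMBINING CARON (U+030C).
decomposed = "Letenky od 12 932 Kc\u030C"
# The precomposed spelling uses the single code point U+010D.
precomposed = "Letenky od 12 932 K\u010D"

# The strings render the same but are different sequences of code points.
print(decomposed == precomposed)
print([unicodedata.name(ch) for ch in decomposed[-2:]])
print([unicodedata.name(ch) for ch in precomposed[-1:]])

# The pattern from the question, written with the precomposed letter only.
pattern = re.compile(r"((\d[., \d]+)(K\u010D|Eur))", flags=re.IGNORECASE)
matches_decomposed = pattern.findall(decomposed)
matches_precomposed = pattern.findall(precomposed)
print(matches_decomposed)
print(matches_precomposed)
```

If the diagnosis holds, only the precomposed string produces a match.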

You may use

(\d(?:[., \d]*\d)?)\s*(K(?:c\u030C|\u010D)|Eur)

Or

(\d[., \d]*)\s*(K(?:č|č)|Eur)

See the regex (second regex demo) and Python demo.

Pattern details

\d - a digit
(?:[., \d]*\d)? - an optional occurrence of
- [., \d]* - zero or more digits, spaces, . or ,
- \d - a digit
\s* - 0 or more whitespaces
(K(?:c\u030C|\u010D)|Eur) - a capturing group matching either K followed by c\u030C or \u010D, or Eur.
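As a rough check of the breakdown above, the first pattern can be run against both spellings of Kč. The sample strings and the name PRICE_RE are made up for illustration.

```python
import re

# First pattern from the answer: number in group 1, currency in group 2.
PRICE_RE = re.compile(
    r"(\d(?:[., \d]*\d)?)\s*(K(?:c\u030C|\u010D)|Eur)",
    flags=re.IGNORECASE,
)

# Illustrative inputs: decomposed Kč, precomposed Kč, and Eur.
samples = [
    "Letenky od 12 932 Kc\u030C",
    "Letenky od 12 932 K\u010D",
    "Tickets from 1.299,50 Eur",
]

results = [PRICE_RE.findall(s) for s in samples]
for sample, found in zip(samples, results):
    print(repr(sample), "->", found)
```

Because the number group must end in a digit, the captured amount carries no trailing space, unlike group 2 of the pattern in the question.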

When defining the currency regex, use CZK = ['Czk','K(?:č|č)'] or CZK = ['Czk', r'K(?:c\u030C|\u010D)'].
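A minimal sketch of how such a list might be joined into one pattern follows. The EUR list, the variable names, and the sample text are assumptions for illustration, not taken from the question.

```python
import re

# Currency alternatives as suggested above; each entry is a regex fragment.
CZK = ['Czk', r'K(?:c\u030C|\u010D)']
# Hypothetical second list, only to show how several currencies combine.
EUR = ['Eur']

# Join all fragments into a single alternation for the currency group.
currency_alternation = "|".join(CZK + EUR)
price_re = re.compile(
    r"(\d(?:[., \d]*\d)?)\s*(" + currency_alternation + ")",
    flags=re.IGNORECASE,
)

# Illustrative text with a decomposed Kč, an Eur price and an upper-case CZK.
text = "Letenky od 12 932 Kc\u030C nebo 499 Eur, 300 CZK"
found = price_re.findall(text)
print(found)
```

The list entries are used as regex fragments, so any plain-text entry containing special characters would need re.escape before joining.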
