Sean2148
Sean2148

Reputation: 365

How to write a regex pattern to match string with currency sybol at end OR start of string

I'm trying to write a regex pattern in Python that will only match the substrings which have a currency symbol/word attached.

String example:

'50,000 and £200.6m, 50p, 500m euro, 800bn euros, $15bn and $99.99. The year 2006 is 20% larger.' 

Expected matches:

  1. £200.6m
  2. 50p
  3. 500m euro
  4. 800bn euros
  5. $15bn
  6. $99.99

I do not want numbers that are not currency related to match, such as plain numbers, percentages or years.

My attempt:

(^(£|\$)?)\d+([.,]\d*)?|(\d+([.,]\d*)?(p|(bn|m)|(\seuro(s)))$)

My matches:

  1. 50,000

Evidently the regex isn't working at all as I expect. It should either match a substring if it begins with a currency symbol or if it ends with one.

Upvotes: 0

Views: 859

Answers (2)

MonkeyZeus
MonkeyZeus

Reputation: 20737

Something like this would do it:

(?:£|\$)(?:\d*\.)?\d+(?:m|bn)?|(?:\d*\.)?\d+(?:m|bn)? ?(?:p|euros?)

Just keep adding the currencies you wish to catch to the respective (?:£|\$) or (?:p|euros?) sections. Ditto for adding items to (?:m|bn)

https://regex101.com/r/3K9jmR/1

Upvotes: 2

The fourth bird
The fourth bird

Reputation: 163362

You might use a pattern to match either the euro or dollar sign [£$], the value optionally followed by p m or bn.

Or match the value followed by p m or bn and euro with optional s

(?:[£$]\d+(?:\.\d+)?(?:[pm]|bn)?|\d+(?:\.\d+)?(?:[pm]|bn)(?: euros?)?)

Explanation

  • (?: Non capture group
    • [£$] A character class matching either £ or $
    • \d+(?:\.\d+)? Match 1+ digits with an optional decimal part
    • (?:[pm]|bn)? Optionally match either p or m or bn
    • | Or
    • \d+(?:\.\d+)? Match 1+ digits with optional decimal part
    • (?:[pm]|bn) Match either p or m or bn
    • (?: euros?)? Optionally match a space and euro with optional s
  • ) Close group

Regex demo

Upvotes: 4

Related Questions