Fabio Correa

Reputation: 1363

How do I apply a lookbehind regex to a pattern that may have variable characters inside?

I need to gather the two values 36.12 and 25.40 in the following string:

original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days

Observe that both quantities are preceded by the same string, amount with discount: USD; the labels for the desired values are original discount and negociated discount.

For the first desired value I tried (?<=original discount of ).*\) which correctly captures 9.17 % (amount with discount: USD 36.12). I then appended ((?<=amount with discount: USD).*), giving the full regex (?<=original discount of ).*\)((?<=amount with discount: USD).*), to capture the 36.12, but it does not work. I tried the same for the second desired value, changing original to negociated.

Any hints on this? Is there an easier way?

Upvotes: 2

Views: 54

Answers (1)

Wiktor Stribiżew

Reputation: 626926

You may capture both parts you need:

((?:negociated|original) discount).*?\bUSD\s*(\d+(?:\.\d+)?)

See the regex demo

Details

  • ((?:negociated|original) discount) - Group 1: either negociated or original, followed by a space and the word discount
  • .*? - any 0+ chars other than line break chars, as few as possible
  • \bUSD - the text USD with a word boundary before it
  • \s* - 0+ whitespaces
  • (\d+(?:\.\d+)?) - Group 2: 1+ digits, optionally followed by a . and 1+ digits
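
To illustrate why the lazy .*? in the breakdown above matters, here is a small base R sketch comparing it with a greedy .* on the string from the question. The variable names (lazy, greedy, first_lazy, first_greedy) are illustrative and not part of the original answer.

```r
x <- "original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days"

# Same pattern twice; only the quantifier after Group 1 differs
lazy   <- "((?:negociated|original) discount).*?\\bUSD\\s*(\\d+(?:\\.\\d+)?)"
greedy <- "((?:negociated|original) discount).*\\bUSD\\s*(\\d+(?:\\.\\d+)?)"

# regexec() returns the first match followed by its capture groups
first_lazy   <- regmatches(x, regexec(lazy, x, perl = TRUE))[[1]]
first_greedy <- regmatches(x, regexec(greedy, x, perl = TRUE))[[1]]

first_lazy[2:3]    # lazy: label paired with the nearest following amount
first_greedy[2:3]  # greedy: .* runs on to the last USD in the string
```

With the lazy quantifier the first label is paired with 36.12; with the greedy one it is paired with 25.40, so the second label never gets a match of its own.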

With the stringr package in R, you may extract these values using

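x <- "original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days"
# One matrix per input string: column 1 is the full match, columns 2-3 are the capture groups
res <- stringr::str_match_all(x, "((?:negociated|original) discount).*?\\bUSD\\s*(\\d+(?:\\.\\d+)?)")
# Drop the full-match column, keeping only the label and the amount
lapply(res, function(z) z[,-1])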

See an R online demo
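
If stringr is not available, the same extraction can be sketched in base R. This is one possible approach under that assumption, not the answer's own method; the names pattern, full, groups and amounts are hypothetical. It finds every full match with gregexpr(), then re-matches each one with regexec() to recover the two capture groups.

```r
x <- "original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days"
pattern <- "((?:negociated|original) discount).*?\\bUSD\\s*(\\d+(?:\\.\\d+)?)"

# All full matches of the pattern in x (perl = TRUE enables the lazy quantifier and \b)
full <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]

# Re-match each full match to pull out its capture groups:
# element 1 is the full match, element 2 the label, element 3 the amount
groups <- regmatches(full, regexec(pattern, full, perl = TRUE))

# Named numeric vector: label -> amount
amounts <- vapply(groups, function(g) as.numeric(g[3]), numeric(1))
names(amounts) <- vapply(groups, function(g) g[2], character(1))
amounts
```

Converting to numeric drops the trailing zero of 25.40 when printed; keep g[3] as a character string instead if the exact text is needed.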

Upvotes: 1
