Reputation: 1363
I need to extract the two values 36.12 and 25.40 from the following string:
original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days
Observe that both quantities are preceded by the same string, amount with discount: USD, and that the labels for the desired values are original discount and negociated discount.
For the first desired value I tried (?<=original discount of ).*\) which correctly captures 9.17 % (amount with discount: USD 36.12). I then appended ((?<=amount with discount: USD).*), giving the full regex (?<=original discount of ).*\)((?<=amount with discount: USD).*), to capture the 36.12, but that does not work. I tried the same for the second desired value, changing original to negociated.
Any hints on this? Is there an easier way?
Upvotes: 2
Views: 54
Reputation: 626926
You may capture both parts you need:
((?:negociated|original) discount).*?\bUSD\s*(\d+(?:\.\d+)?)
See the regex demo
Details
((?:negociated|original) discount) - Group 1: either negociated or original, followed by a space and the word discount
.*? - any 0+ chars other than line break chars, as few as possible
\bUSD - the text USD preceded by a word boundary
\s* - 0+ whitespaces
(\d+(?:\.\d+)?) - Group 2: 1+ digits, optionally followed by a . and 1+ digits
In R, with stringr, you may extract these values using
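x <- "original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days"
# One matrix per input string: column 1 is the full match, columns 2 and 3 are the capture groups
res <- stringr::str_match_all(x, "((?:negociated|original) discount).*?\\bUSD\\s*(\\d+(?:\\.\\d+)?)")
# Drop the full-match column, keeping only the label and the amount
lapply(res, function(z) z[,-1])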
See an R online demo
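If stringr is not installed, the same pattern can probably be applied in base R as well. The following is only a sketch under a few assumptions: perl = TRUE is passed so that the PCRE engine handles the pattern, and the names pattern, hits, parts and amounts are illustrative and not part of the answer above. It returns the amounts as a numeric vector named by their labels.

```r
x <- "original discount of 9.17 % (amount with discount: USD 36.12) and negociated discount of 36.12 % (amount with discount: USD 25.40), delivery in 15 days"

# Same pattern as above: group 1 is the label, group 2 is the amount
pattern <- "((?:negociated|original) discount).*?\\bUSD\\s*(\\d+(?:\\.\\d+)?)"

# Step 1: collect every full match of the pattern in x
hits <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]

# Step 2: re-apply the pattern to each hit to get its capture groups;
# each element of parts is c(full match, label, amount)
parts <- regmatches(hits, regexec(pattern, hits, perl = TRUE))

# Step 3: build a numeric vector of amounts named by label
labels <- vapply(parts, function(p) p[2], character(1))
values <- vapply(parts, function(p) as.numeric(p[3]), numeric(1))
amounts <- setNames(values, labels)

print(amounts)
```

With this, amounts[["original discount"]] gives the first value and amounts[["negociated discount"]] the second, so each amount can be looked up by its label rather than by position.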
Upvotes: 1