Reputation: 83
I have a string like the one below and I need to pull out two numeric values: 197 from 197kJ and 47 from 47kcal. Can someone help me with this? It is driving me crazy :)
My current regex:
((<|>)?\d+((\.|,)\d+)?kj\s?\/\s?)?(<|>)?(\d+((\.|,)\d+)?)kcal
String to search in:
Per 250ml serving (10 servings per pack): Energy 197kJ (2% ADH)/47kcal (2% ADH), Fat 0.3g (of which Saturated Fat 0.1g), Carbohydrate 7.8g (3% ADH) (of which Sugars 3.9g (4% ADH)), Fibres 1.6g, Protein 2.2g (4% ADH), Salt 1.6g (27% ADH)
Upvotes: 2
Views: 77
Reputation: 626728
You may use alternation and optional (0 or more) spaces between the number and measurement unit:
[<>]?(\d+(?:[.,]\d+)?)\s*k(?:cal|j)
Use it with the i (case-insensitive) modifier.
See the regex demo.
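As a rough illustration of how this pattern could be applied, here is a minimal Python sketch. The question does not say which language is in use, so Python, the function name extract_energy_values and the shortened sample string are assumptions made for the example; re.IGNORECASE stands in for the i modifier.

```python
import re

# Pattern from this answer; re.IGNORECASE plays the role of the "i" modifier.
PATTERN = re.compile(r"[<>]?(\d+(?:[.,]\d+)?)\s*k(?:cal|j)", re.IGNORECASE)


def extract_energy_values(text):
    """Return every number captured by group 1 as a float, in order of appearance.

    Hypothetical helper, not part of the original answer.
    """
    # A decimal comma is normalised to a dot so that float() accepts it.
    return [float(m.group(1).replace(",", ".")) for m in PATTERN.finditer(text)]


# Shortened version of the string from the question.
sample = "Energy 197kJ (2% ADH)/47kcal (2% ADH), Fat 0.3g"
print(extract_energy_values(sample))  # [197.0, 47.0]
```

The values come back in the order they appear in the text, so here the kJ value is first and the kcal value second.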
Details:
[<>]? - an optional < or >
(\d+(?:[.,]\d+)?) - Group 1, capturing 1 or more digits and then an optional sequence of a . or , followed by 1+ digits
\s* - zero or more whitespace characters
k - a literal k
(?:cal|j) - either cal or j.
Upvotes: 0
Reputation: 6209
Here is my proposal:
(\d+)kJ.*?(\d+)kcal
I use the lazy quantifier *? to avoid consuming the first digits of the kcal number.
https://regex101.com/r/uS3mE4/4
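To see why the lazy quantifier matters, the following Python sketch runs the pattern once and reads both capture groups. Python, the helper name kj_and_kcal and the shortened sample string are assumptions for illustration only.

```python
import re

# Pattern from this answer: group 1 is the kJ number, group 2 the kcal number.
PAIR = re.compile(r"(\d+)kJ.*?(\d+)kcal")


def kj_and_kcal(text):
    """Return (kJ, kcal) as integers, or None when the pattern does not match.

    Hypothetical helper, not part of the original answer.
    """
    m = PAIR.search(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


# Shortened version of the string from the question.
sample = "Energy 197kJ (2% ADH)/47kcal (2% ADH)"
print(kj_and_kcal(sample))  # (197, 47)

# With a greedy .* the second group keeps only the last digit before "kcal".
greedy = re.search(r"(\d+)kJ.*(\d+)kcal", sample)
print(greedy.group(2))  # 7
```

Unlike the other patterns in this thread, this one is case-sensitive and handles whole numbers only, so a value such as 197.5kJ would need the decimal part added to both groups.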
Upvotes: 0
Reputation: 26846
Your regex pattern looks overcomplicated, probably because it does a more complex job than the one described.
But your task (getting the numeric values of kJ and kcal) can be done with a pattern like:
(\d+(?:[.,]\d+)?)(?:kJ|kcal)
Upvotes: 1
Reputation: 43169
Just do:
\d+(?:kcal|kJ)
# require at least one digit
# followed by either kcal or kJ
See a demo on regex101.com (or yours: https://regex101.com/r/uS3mE4/3)
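One thing to keep in mind with this pattern is that each match includes the unit. The Python sketch below (Python and the shortened sample string are assumptions for the example) shows the difference a capture group around the digits makes when only the numbers are wanted.

```python
import re

# Shortened version of the string from the question.
sample = "Energy 197kJ (2% ADH)/47kcal (2% ADH)"

# Without a capture group, findall returns the whole match, unit included.
whole = re.findall(r"\d+(?:kcal|kJ)", sample)
print(whole)  # ['197kJ', '47kcal']

# With a capture group around the digits, findall returns only the numbers.
numbers = re.findall(r"(\d+)(?:kcal|kJ)", sample)
print(numbers)  # ['197', '47']
```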
Upvotes: 1