mitas

Reputation: 83

Regex to obtain two values from string

I have a string like the one below and I need to pull out two numeric values: 197 (from 197kJ) and 47 (from 47kcal). Can someone help me with this? It is driving me crazy :)

My regular expression:

((<|>)?\d+((\.|,)\d+)?kj\s?\/\s?)?(<|>)?(\d+((\.|,)\d+)?)kcal

String to search in:

Per 250ml serving (10 servings per pack): Energy 197kJ (2% ADH)/47kcal (2% ADH), Fat 0.3g (of which Saturated Fat 0.1g), Carbohydrate 7.8g (3% ADH) (of which Sugars 3.9g (4% ADH)), Fibres 1.6g, Protein 2.2g (4% ADH), Salt 1.6g (27% ADH)

Upvotes: 2

Views: 77

Answers (4)

Wiktor Stribiżew

Reputation: 626728

You may use alternation and allow optional whitespace (zero or more characters) between the number and the measurement unit:

[<>]?(\d+(?:[.,]\d+)?)\s*k(?:cal|j)

Use it with the case-insensitive modifier (i).

See the regex demo.

Details:

  • [<>]? - an optional < or >
  • (\d+(?:[.,]\d+)?) - Group 1 capturing 1 or more digits, and then an optional sequence: a . or , and 1+ digits
  • \s* - zero or more whitespace characters
  • k - literal k
  • (?:cal|j) - either a cal or j.
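
For illustration, here is a minimal Python sketch of this pattern applied to the string from the question. The question does not say which language is in use, so Python and the helper name extract_energy_values are assumptions; re.IGNORECASE stands in for the i modifier.

```python
import re

# Pattern from the answer above; re.IGNORECASE plays the role of the "i" modifier.
ENERGY_RE = re.compile(r"[<>]?(\d+(?:[.,]\d+)?)\s*k(?:cal|j)", re.IGNORECASE)

TEXT = (
    "Per 250ml serving (10 servings per pack): Energy 197kJ (2% ADH)/47kcal (2% ADH), "
    "Fat 0.3g (of which Saturated Fat 0.1g), Carbohydrate 7.8g (3% ADH) "
    "(of which Sugars 3.9g (4% ADH)), Fibres 1.6g, Protein 2.2g (4% ADH), Salt 1.6g (27% ADH)"
)


def extract_energy_values(text):
    """Return every number followed by kJ or kcal (hypothetical helper name)."""
    # With exactly one capturing group, findall returns only Group 1 of each match.
    return ENERGY_RE.findall(text)


print(extract_energy_values(TEXT))  # ['197', '47']
```

The values come back as strings. If a decimal comma can occur, something like float(value.replace(",", ".")) would be needed before doing arithmetic.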

Upvotes: 0

Tryph

Reputation: 6209

Here is my proposal:

(\d+)kJ.*?(\d+)kcal
  1. It extracts the kJ and the kcal numbers into two different capturing groups.
  2. It captures the run of digits immediately before the "kJ" and "kcal" substrings.
  3. It uses the lazy quantifier *? so that the .* part does not consume the first digits of the kcal number.

https://regex101.com/r/uS3mE4/4
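
As a rough sketch of how the two groups could be read, assuming Python (the question does not name a language) and a shortened copy of the input string. The second pattern is a greedy variant added only to show what point 3 guards against.

```python
import re

# Lazy version from the answer above: Group 1 is the kJ number, Group 2 the kcal number.
PAIR_RE = re.compile(r"(\d+)kJ.*?(\d+)kcal")
# Greedy variant, for comparison only (not part of the answer).
GREEDY_RE = re.compile(r"(\d+)kJ.*(\d+)kcal")

TEXT = "Energy 197kJ (2% ADH)/47kcal (2% ADH), Fat 0.3g"

match = PAIR_RE.search(TEXT)
if match:
    kj, kcal = (int(group) for group in match.groups())
    print(kj, kcal)  # 197 47

# The greedy .* swallows the "4", so only "7" is left for the second group.
greedy_groups = GREEDY_RE.search(TEXT).groups()
print(greedy_groups)  # ('197', '7')
```

Note that this pattern is case sensitive and expects kJ to appear before kcal in the string.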

Upvotes: 0

Andrey Korneyev

Reputation: 26846

Your regex pattern looks overcomplicated, probably because it serves a more complex job than the one described.

But your task (getting the numeric values of kJ and kcal) can be done with a pattern like:

(\d+[.,]?\d+)(?:kJ|kcal)
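
A short Python sketch of this pattern in use (Python is an assumption, as is the shortened input). One thing worth knowing: \d+[.,]?\d+ needs at least two digits, so a single-digit value such as 5kJ is skipped. The second pattern below is a possible variant, not part of the answer, that makes the decimal part optional instead.

```python
import re

# Pattern as written in the answer above.
ORIGINAL_RE = re.compile(r"(\d+[.,]?\d+)(?:kJ|kcal)")
# Hypothetical variant: one or more digits, then an optional decimal part.
VARIANT_RE = re.compile(r"(\d+(?:[.,]\d+)?)(?:kJ|kcal)")

TEXT = "Energy 197kJ (2% ADH)/47kcal (2% ADH)"
SINGLE_DIGIT = "Energy 5kJ/1kcal"  # made-up input to show the edge case

print(ORIGINAL_RE.findall(TEXT))          # ['197', '47']
print(ORIGINAL_RE.findall(SINGLE_DIGIT))  # []
print(VARIANT_RE.findall(SINGLE_DIGIT))   # ['5', '1']
```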

Upvotes: 1

Jan

Reputation: 43169

Just do:

\d+(?:kcal|kJ)
# require at least one number
# followed by either kcal or kJ

See a demo on regex101.com (or yours: https://regex101.com/r/uS3mE4/3)
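
The # comments in the pattern only work in free-spacing mode. In Python (an assumption, since the question does not name a language) that would be re.VERBOSE. A minimal sketch follows; the capturing group around \d+ is an addition so that the bare number can be read, and the input is shortened.

```python
import re

ENERGY_RE = re.compile(
    r"""
    (\d+)          # require at least one digit (capturing group added here)
    (?:kcal|kJ)    # followed by either kcal or kJ
    """,
    re.VERBOSE,  # ignore whitespace in the pattern and allow the # comments
)

TEXT = "Energy 197kJ (2% ADH)/47kcal (2% ADH), Fat 0.3g"

for match in ENERGY_RE.finditer(TEXT):
    # group(0) is the whole match, group(1) only the digits
    print(match.group(0), match.group(1))
```

Without the added group, the pattern returns the number together with its unit (197kJ, 47kcal), so the unit would have to be stripped afterwards.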

Upvotes: 1
