DigiLive

Reputation: 1103

Regex trouble with capturing repeated pattern

Let's say I have the following string:

Some crap string here...(TRACK "title1" F (S #h88 (P #m6) (P #m31)) (S #k3 (P #m58) (P #m58)))(TRACK "title2" P (S #a54 (P #r8)) (S #v59 (P #a25) (P #y82)))...Some other crap string here

Out of this string I need to extract the following data:

  1. title1
  2. F
  3. (S #h88 (P #m6) (P #m31)) and (S #k3 (P #m58) (P #m58))

and

  1. title2
  2. P
  3. (S #a54 (P #r8)) and (S #v59 (P #a25) (P #y82))

where

  1. is some kind of title.
  2. is some kind of status.
  3. is some kind of list of lists, like (S #xx (P #xx)).

Having limited regex knowledge, I can get 1 and 2, but only the first part of 3.
(S #xx (P #xx)) can occur multiple times, and the inner (P #xx) can also occur multiple times.

I've tried many regular expressions and consulted a lot of posts, but I keep having trouble getting the data out as described.

So now I'm back at \(TRACK "(.*?)" ([P|F]) (\(S.*?\)\)), which only captures the first of the two lists of each TRACK in this example string.

see: https://regex101.com/r/FM0ZZR/1
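
For reference, the same behaviour can be reproduced outside regex101. This is only a minimal sketch, assuming Python's built-in re module (the linked demo may use a different flavor); the variable names are mine:

```python
import re

text = ('Some crap string here...(TRACK "title1" F (S #h88 (P #m6) (P #m31)) '
        '(S #k3 (P #m58) (P #m58)))(TRACK "title2" P (S #a54 (P #r8)) '
        '(S #v59 (P #a25) (P #y82)))...Some other crap string here')

# The attempted pattern, unchanged.
pattern = r'\(TRACK "(.*?)" ([P|F]) (\(S.*?\)\))'

# The lazy .*? in the last group stops at the first "))" it meets,
# so only the first (S ...) list of each TRACK is captured.
found = re.findall(pattern, text)
for groups in found:
    print(groups)
# ('title1', 'F', '(S #h88 (P #m6) (P #m31))')
# ('title2', 'P', '(S #a54 (P #r8))')
```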

What do I need to do to get all lists as described?

Upvotes: 3

Views: 48

Answers (1)

Wiktor Stribiżew

Reputation: 626845

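You can use the following pattern. It relies on recursion and possessive quantifiers, so it needs a regex flavor that supports both, such as PCRE: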

\(TRACK\s+"([^"]*)"\s+([PF])((?:\s+(\([SP](?:[^()]*+|(?-1))*\)))*\))

See the regex demo.

Details

  • \(TRACK - a (TRACK substring
  • \s+ - one or more whitespaces
  • " - a " char
  • ([^"]*) - Group 1: any zero or more chars other than "
  • " - a " char
  • \s+ - one or more whitespaces
  • ([PF]) - Group 2: P or F
  • ((?:\s+(\([SP](?:[^()]*+|(?-1))*\)))*\)) - Group 3:
    • (?:\s+(\([SP](?:[^()]*+|(?-1))*\)))* - zero or more repetitions of
      • \s+ - one or more whitespaces
      • (\([SP](?:[^()]*+|(?-1))*\)) - Group 4 (technical, necessary for recursion):
        • \( - a ( char
        • [SP] - S or P
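        • (?:[^()]*+|(?-1))* - zero or more repetitions of either a run of chars other than ( and ) (matched possessively) or a recursion into the most recently opened capturing group, i.e. the whole Group 4 pattern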
        • \) - a ) char
    • \) - a ) char, the one that closes the TRACK record.
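
Note that Group 4 sits inside a repeated group, so after a match it only holds the last list; Group 3 holds all the lists as one string, and you still have to split it to get the individual lists. If your flavor has no recursion support (Python's built-in re, for example), the same data can be pulled out without it. The following is a minimal sketch of that alternative, not the pattern above: it matches only the TRACK header with a regex and then walks the balanced parentheses by hand. It assumes Python, and read_balanced and parse_tracks are hypothetical helper names, not part of any library.

```python
import re

# Header of a TRACK record: title in group 1, status (P or F) in group 2.
HEADER = re.compile(r'\(TRACK\s+"([^"]*)"\s+([PF])')


def read_balanced(text, start):
    """Return the index just past the ')' that closes the '(' at text[start]."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unbalanced parentheses")


def parse_tracks(text):
    """Yield (title, status, lists) for every TRACK record found in text."""
    for m in HEADER.finditer(text):
        pos, lists = m.end(), []
        while True:
            # Skip the whitespace that separates the lists.
            while pos < len(text) and text[pos].isspace():
                pos += 1
            if pos >= len(text) or text[pos] != "(":
                break  # reached the ')' that closes the TRACK record
            end = read_balanced(text, pos)
            lists.append(text[pos:end])
            pos = end
        yield m.group(1), m.group(2), lists


sample = ('Some crap string here...(TRACK "title1" F (S #h88 (P #m6) (P #m31)) '
          '(S #k3 (P #m58) (P #m58)))(TRACK "title2" P (S #a54 (P #r8)) '
          '(S #v59 (P #a25) (P #y82)))...Some other crap string here')

tracks = list(parse_tracks(sample))
for title, status, lists in tracks:
    print(title, status, lists)
# title1 F ['(S #h88 (P #m6) (P #m31))', '(S #k3 (P #m58) (P #m58))']
# title2 P ['(S #a54 (P #r8))', '(S #v59 (P #a25) (P #y82))']
```

The hand-written depth counter plays the role that (?-1) plays in the regex: it keeps consuming characters until the parenthesis opened at the start of a list is closed again, however deeply the inner (P ...) items are nested.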

Upvotes: 2
