CLECode

Reputation: 158

PHP - Regex match curly brackets within other regex expression

I am trying to extend a regular expression so that it matches additional parts of each line, but I can't get it to work.

This is what I have so far:

preg_match_all("/^(.*?)(?:.\(([\d]+?)[\/I^\(]*?\))(?:.\((.*?)\))?/m",$data,$r, PREG_SET_ORDER);

Example text:

INPUT - Each line represents a line inside a text file. 
-------------------------------------------------------------------------------------
"!?Text" (1234)                                         1234-4321
"#1 Text" (1234)                                        1234-????
#2 Text (1234) {Some text (#1.1)}                       1234
Text (1234)                                             1234
Some Other Text: More Text here 1234-4321 (1234) (V)    1234

What I want to do:

I also want to match the text inside curly brackets, and the text inside parentheses within those curly brackets. I can't get this to work because the curly-bracket part (and the parentheses inside it) is not present on every line.

The first (1234) on each line is a year, and I only want to match it once. In the last example line, however, my expression also matches (V), which I don't want.

Desirable output:

Array
(
    [0] => "!?Text" (1234)
    [1] => "!?Text"
    [2] => 1234
)
Array
(
    [0] => "#1 Text" (1234)
    [1] => "#1 Text"
    [2] => 1234
)
Array
(
    [0] => "#2 Text" (1234)
    [1] => "#2 Text"
    [2] => 1234
    [3] => Some text (#1.1) // Matches things within curly brackets if there are any.
    [4] => Some text // Extracts text before brackets
    [5] => #1.1 // Extracts text within brackets (if any because brackets may not be within curly brackets.)
)
Array
(
    [0] => Text (1234)
    [1] => Text
    [2] => 1234
)
Array // (My current regular expression gives me a 4th match with value 'V', which it shouldn't do)
(
    [0] => Some Other Text: More Text here 1234-4321 (1234) (V)
    [1] => Some Other Text: More Text here 1234-4321
    [2] => 1234
)

Upvotes: 1

Views: 552

Answers (1)

Enissay

Reputation: 4953

What about using:

^((.*?) *\((\d+)\))(?: *\{((.*?) *\((.+?)\)) *\})?


  NODE                       EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string (or of each
                           line when the m flag is set)
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (                        group and capture to \2:
--------------------------------------------------------------------------------
      .*?                      any character except \n (0 or more
                               times (matching the least amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \2
--------------------------------------------------------------------------------
     *                       ' ' (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    \(                       '('
--------------------------------------------------------------------------------
    (                        group and capture to \3:
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )                        end of \3
--------------------------------------------------------------------------------
    \)                       ')'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
     *                       ' ' (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    \{                       '{'
--------------------------------------------------------------------------------
    (                        group and capture to \4:
--------------------------------------------------------------------------------
      (                        group and capture to \5:
--------------------------------------------------------------------------------
        .*?                      any character except \n (0 or more
                                 times (matching the least amount
                                 possible))
--------------------------------------------------------------------------------
      )                        end of \5
--------------------------------------------------------------------------------
       *                       ' ' (0 or more times (matching the
                               most amount possible))
--------------------------------------------------------------------------------
      \(                       '('
--------------------------------------------------------------------------------
      (                        group and capture to \6:
--------------------------------------------------------------------------------
        .+?                      any character except \n (1 or more
                                 times (matching the least amount
                                 possible))
--------------------------------------------------------------------------------
      )                        end of \6
--------------------------------------------------------------------------------
      \)                       ')'
--------------------------------------------------------------------------------
    )                        end of \4
--------------------------------------------------------------------------------
     *                       ' ' (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
    \}                       '}'
--------------------------------------------------------------------------------
  )?                       end of grouping
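
To show how the groups line up with the sample input, here is a minimal sketch that runs the pattern above against the lines from the question. The variable names ($pattern, $rows) are only illustrative, and the m flag is an assumption carried over from the question's own call so that ^ anchors at the start of every line.

```php
<?php
// Sample lines copied from the question.
$data = <<<'TXT'
"!?Text" (1234)                                         1234-4321
"#1 Text" (1234)                                        1234-????
#2 Text (1234) {Some text (#1.1)}                       1234
Text (1234)                                             1234
Some Other Text: More Text here 1234-4321 (1234) (V)    1234
TXT;

// The pattern from this answer; the m flag (assumed, as in the question)
// makes ^ match at the start of each line.
$pattern = '/^((.*?) *\((\d+)\))(?: *\{((.*?) *\((.+?)\)) *\})?/m';

preg_match_all($pattern, $data, $rows, PREG_SET_ORDER);

foreach ($rows as $r) {
    // Groups 4-6 may be missing or empty when a line has no {...} part,
    // so fall back to an empty string.
    printf("%s | %s | %s | %s\n", $r[2], $r[3], $r[5] ?? '', $r[6] ?? '');
}
```

Note that the numbering differs slightly from the desired output in the question: the title is in group 2 and the year in group 3, while groups 4 to 6 hold the curly-bracket content, the text before its parentheses, and the text inside them. On the last sample line the optional group does not match, so (V) is not captured.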

Upvotes: 1
