user1236048

Reputation: 5602

How do you create a string to match a regex?

I need to write documentation for a text format. I know the regexes that are used to format the text, but I don't know how to come up with an example that matches each one. This one should be an internal link:

'{\[((?:\#|/)[^ ]*) ([^]]*)\]}'

Can anyone create an example that would match this, and maybe explain how they arrived at it? I got stuck at the '?'.

I have never used this metacharacter at the start of a group; I usually use it to mark that a literal may appear either zero times or exactly once.

Thanks

Upvotes: 5

Views: 382

Answers (3)

user557597

Reputation:

I think this is a good post for learning how to design a regex. While it's fairly easy to write a
general regex to match a string, sometimes it's helpful to look at it in reverse after
it's designed. Sometimes it is necessary to see what bizarre things will match.

When a lot of metacharacters are used as literals, it's important to lay the pattern
out so that it is easy to read and errors are easier to avoid.

Here are some samples in Perl which were easier (for me) to prototype.

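# Sample strings to test against the pattern.
# Note: judging by the output below, the whitespace runs in the last sample
# (other than the single space after '|') appear to have been tabs in the
# original, which is why grp1 spans two lines there.
my @samps = (
 '{[/abcd asdfasefasdc]}',
 '{[# ]}',
 '{[# /# \/]}',
 '{[/#  {[
    | /# {[#\/} ]}',
);

for (@samps) {
   # ~ is the delimiter so that / needs no escaping;
   # the braces are escaped to match the expanded form below
   if (m~\{\[([#/][^ ]*) ([^]]*)\]\}~)
   {
      print "Found: '$&'\ngrp1 = '$1'\ngrp2 = '$2'\n===========\n\n";
   }
}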

__END__

Expanded

\{\[ 
  (
     [#/][^ ]*
  )
  [ ]
  (
     [^\]]*
  )
\]\}

Output

Found: '{[/abcd asdfasefasdc]}'
grp1 = '/abcd'
grp2 = 'asdfasefasdc'
===========

Found: '{[# ]}'
grp1 = '#'
grp2 = ''
===========

Found: '{[# /# \/]}'
grp1 = '#'
grp2 = '/# \/'
===========

Found: '{[/#    {[
        | /# {[#\/}     ]}'
grp1 = '/#      {[
        |'
grp2 = '/# {[#\/}       '
===========
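
For readers who don't have Perl handy, the expanded pattern can be tried in Python as well. This is only a sketch: it assumes the braces are literal characters (as in the Perl test above), uses Python's re.VERBOSE flag to keep the expanded layout, and the names LINK_RE and describe are made up for illustration.

```python
import re

# Expanded pattern, assuming { and } are literal characters.
# In verbose mode whitespace outside a character class is ignored,
# so the literal space is written as [ ].
LINK_RE = re.compile(r"""
    \{\[
      (
         [#/][^ ]*      # group 1: '#' or '/', then a run of non-space characters
      )
      [ ]               # exactly one literal space
      (
         [^\]]*         # group 2: a run of characters other than the closing bracket
      )
    \]\}
""", re.VERBOSE)


def describe(text):
    """Return (group 1, group 2) for the first match in text, or None."""
    m = LINK_RE.search(text)
    return m.groups() if m else None


print(describe('{[/abcd asdfasefasdc]}'))  # ('/abcd', 'asdfasefasdc')
print(describe('{[# ]}'))                  # ('#', '')
print(describe('no link here'))            # None
```

The second call shows one of the "bizarre" matches: both groups may be nearly empty, because each uses * rather than +.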

Upvotes: 1

mario

Reputation: 145482

See Open source RegexBuddy alternatives and Online regex testing for some helpful tools. It's easiest to have a regex explained by them first. I used YAPE here:

NODE                     EXPLANATION
----------------------------------------------------------------------
  \[                       '['
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    (?:                      group, but do not capture:
----------------------------------------------------------------------
      \#                       '#'
----------------------------------------------------------------------
     |                        OR
----------------------------------------------------------------------
      /                        '/'
----------------------------------------------------------------------
    )                        end of grouping
----------------------------------------------------------------------
    [^ ]*                    any character except: ' ' (0 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
                           ' '
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    [^]]*                    any character except: ']' (0 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \]                       ']'
----------------------------------------------------------------------

This assumes that the { and } in your example are the regex delimiters rather than literal characters.

You can just read through the list of explanations and come up with a possible source string such as:

 [#NOSPACE NOBRACKET]
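
The list of explanations can be read as a recipe: pick one concrete string for each node and join them. The Python sketch below does that and then checks the result. It follows the presumption above that { and } are delimiters, so they are left out of the pattern; the variable names and the chosen pieces are arbitrary illustrations.

```python
import re

# Pattern without the { } delimiters (assumption: they are delimiters, not literals).
pattern = re.compile(r"\[((?:\#|/)[^ ]*) ([^]]*)\]")

# One concrete choice per node of the explanation, in order.
pieces = [
    "[",          # \[
    "#",          # (?:\#|/)  pick '#' (could equally be '/')
    "NOSPACE",    # [^ ]*     anything that contains no space
    " ",          # the literal space
    "NOBRACKET",  # [^]]*     anything that contains no ']'
    "]",          # \]
]
example = "".join(pieces)

match = pattern.fullmatch(example)
print(example)         # [#NOSPACE NOBRACKET]
print(match.groups())  # ('#NOSPACE', 'NOBRACKET')
```

Swapping any piece for another string that satisfies the same node (for instance "/" instead of "#") gives another valid example.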

Upvotes: 3

ruakh

Reputation: 183381

(?:...) has the same grouping effect as (...), but without "capturing" the contents of the group; see http://php.net/manual/en/regexp.reference.subpatterns.php.

So, (?:\#|/) means "either # or /".
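
A small sketch can make the difference visible. It uses Python's re module, which shares the (?:...) syntax with PCRE; the sample input "#top" is just an illustration. Both patterns match the same text, but only the plain parentheses create a captured group.

```python
import re

capturing = re.compile(r"(\#|/)[^ ]*")
non_capturing = re.compile(r"(?:\#|/)[^ ]*")

text = "#top"

# Both variants match exactly the same text.
print(capturing.match(text).group(0))      # #top
print(non_capturing.match(text).group(0))  # #top

# Only the first one defines a capture group.
print(capturing.groups)      # 1
print(non_capturing.groups)  # 0
```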

I'm guessing you know that [^ ]* means "zero or more characters that aren't spaces", and that [^]]* means "zero or more characters that aren't right square brackets".

Putting it together, one possible string is this:

'{[/abcd asdfasefasdc]}'

Upvotes: 3
