Reputation:
I'm not sure about this and suspect it is impossible, but I thought I'd ask anyway.
I would like to use a regex delimiter that is a metacharacter, for example brackets or parentheses ([ ], ( ), ...), but really any metacharacter.
It's not that I need to do this; I'm trying to write an escaping routine as part of a project.
So what's the problem? It shows up when the regex body contains that character as a literal rather than as a metacharacter, like:
/ \( \) /
where the forward-slash delimiters are to be replaced with ( and ).
In Perl, for instance, none of these work:
=~ m( \( \) )
=~ m( \\( \\) )
=~ m( \\\( \\\) )
=~ m( \\\\( \\\\) )
No amount of escaping the parentheses will leave a single backslash, i.e. a literal \(.
The backslash on the delimiter is always removed, and the remaining backslashes are then subject to the normal quoting rules, so the result always has an even number of backslashes.
PHP apparently behaves the same way.
Like I said, I wouldn't use metacharacters as delimiters in normal operation; this
is just a utility I'm trying to write (which seems to be in jeopardy right now).
I'm trying to use only basic escaping rules and want to avoid having to scan the string
ahead of time, comparing the selected delimiters against literal (escaped) metacharacters in
the regex body.
Perl's q() and qq() handle this correctly (qr() unfortunately does not).
They do it by removing escapes on escapes and escapes on delimiters in the same pass, so
q( \\\( \\\) )
results in \( \).
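The single-pass unescaping described for q() can be modelled in a few lines. This is only a sketch of the rule as stated in the question (drop a backslash when it precedes another backslash or a delimiter), written in Python for illustration; the function name `q_unescape` is hypothetical and this is not Perl's actual implementation.

```python
def q_unescape(body, open_delim="(", close_delim=")"):
    # Characters whose preceding backslash is removed in this model.
    special = {"\\", open_delim, close_delim}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in special:
            out.append(body[i + 1])  # keep the escaped character, drop the backslash
            i += 2
        else:
            out.append(ch)           # anything else is copied through unchanged
            i += 1
    return "".join(out)

# Body of q( \\\( \\\) ) from the question.
result = q_unescape(r" \\\( \\\) ")
print(result)
```

Because escaped backslashes and escaped delimiters are consumed in the same left-to-right pass, three backslashes in front of a parenthesis collapse to exactly one, which is the case the regex operators cannot reach.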
Thanks for any help.
Edit
After some research I found this to be impossible, so the utility is scrapped.
Thanks for the valuable input, though. I'm fairly impressed with Perl's array of
quoting options, especially the quote-like operators, which do the job,
but then the delimiter really belongs to the quote operator and not to a regex.
Upvotes: 0
Views: 243
Reputation: 386386
[ I'm not sure if you're asking about Perl or PHP. I just know about Perl ]
Regex literals are parsed twice, once by the Perl compiler and once by the regex compiler.
The Perl parser finds the end of the literal while handling interpolation, escaped delimiters and sequences like \Q and \L. This produces the regex pattern (as a string) and the matching options (e.g. case-insensitive matching).
qr/\/\(/ produces the pattern /\( (the / got unescaped). Similarly, qr(\/\() produces the pattern \/( (the ( got unescaped).
The regex compiler takes the regex pattern and the matching options and returns a compiled regex.
/\( produces a regex that matches exactly /(, while \/( produces a regex syntax error.
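The two stages can be imitated with a small model. The sketch below is in Python and rests on stated assumptions: `strip_delimiter_escapes` is a hypothetical stand-in for the first pass as described here (only the backslash in front of a delimiter is removed), and Python's `re` module stands in for the regex compiler. It is not Perl's engine, but it agrees with it on these two simple patterns.

```python
import re

def strip_delimiter_escapes(body, open_delim, close_delim):
    # Hypothetical model of the first pass: drop the backslash in front of
    # a delimiter, copy every other escape pair (including \\) unchanged.
    delims = {open_delim, close_delim}
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            out.append(nxt if nxt in delims else "\\" + nxt)
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

def compiles(pattern):
    # Second stage: hand the resulting string to a regex compiler.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

slash_form = strip_delimiter_escapes(r"\/\(", "/", "/")  # body of qr/\/\(/
paren_form = strip_delimiter_escapes(r"\/\(", "(", ")")  # body of qr(\/\()
print(slash_form, compiles(slash_form))
print(paren_form, compiles(paren_form))
```

Running the question's four attempts through the same model leaves an even number of backslashes in front of the parenthesis each time, which matches the behaviour the question reports.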
To produce a regex that matches exactly (, you would need to produce the pattern \( or equivalent. Here are your options:
qr/\(/ (Don't use it as a delimiter)
$d='('; qr(\Q$d\E) (Don't use it in the literal)
qr(\Q\(\E) (Use \Q to insert an escape after \( has become ( )
qr(\x28) (Use something equivalent)
qr([\(]) (Use it in a way that doesn't require it being escaped)
Your best option by far is to simply choose a different delimiter: one that isn't a metacharacter, or one that isn't used in the pattern. This is trivial since it only matters for hardcoded patterns.
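For a tool that has to emit patterns, the "choose a different delimiter" advice could be automated along these lines. This is a minimal sketch in Python; the function name `pick_delimiter` and the candidate list are assumptions made for illustration, not part of any Perl or PHP API.

```python
def pick_delimiter(pattern, candidates="/!,;:~"):
    # candidates is a hypothetical preference list of characters that are
    # not regex metacharacters; return the first one absent from the pattern.
    for c in candidates:
        if c not in pattern:
            return c
    return None  # every candidate occurs in the pattern

# '/' already occurs in this pattern, so the next free candidate is chosen.
chosen = pick_delimiter(r"/ \( \) /")
print(chosen)
```

Since the chosen character never occurs in the pattern, the body needs no delimiter escaping at all and the metacharacter escapes reach the regex compiler untouched.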
Upvotes: 4
Reputation: 31
Can you make your example a bit more precise?
If the original string is '\(', then
/[\\][(]/
will match it.
Upvotes: 2
Reputation: 241988
I do not know about PHP, but you can use \Q in Perl:
"()" =~ m(\Q\(\)\E) and print "YES\n"
Using one-member character classes should work in both Perl and PHP:
"()" =~ m([(][)]) and print "YES\n"
Upvotes: 2