Reputation: 33
I have a regular expression like this:
s/<(?:[^>'"]|(['"]).?\1)*>//gs
and I don't know exactly what it means.
Upvotes: 0
Views: 90
Reputation: 13725
This tool can explain the details: http://rick.measham.id.au/paste/explain.pl?regex=%3C%28%3F%3A[^%3E%27%22]|%28[%27%22]%29.%3F\1%29*%3E
NODE EXPLANATION
--------------------------------------------------------------------------------
< '<'
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
[^>'"] any character except: '>', ''', '"'
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
['"] any character of: ''', '"'
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
.? any character except \n (optional
(matching the most amount possible))
--------------------------------------------------------------------------------
\1 what was matched by capture \1
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
> '>'
So it tries to remove HTML tags as ysth also mentions.
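To see the pieces in action, here is a minimal sketch of the same pattern in Python rather than Perl (an assumption made for illustration; the question itself uses a Perl `s///gs` substitution). `re.DOTALL` stands in for the `/s` flag, `re.sub` replaces every match like `/g`, and the helper name `strip_tags` is hypothetical.

```python
import re

# Python rendering of the Perl pattern, annotated piece by piece.
TAG = re.compile(r"""
    <               # a literal '<'
    (?:             # non-capturing group, repeated 0 or more times
        [^>'"]      #   any character except '>' and the two quote characters
      |             #   OR
        (['"])      #   a single or double quote, captured as group 1
        .?          #   at most one character of any kind
        \1          #   the same quote character that group 1 captured
    )*
    >               # a literal '>'
""", re.VERBOSE | re.DOTALL)


def strip_tags(text):
    """Remove everything the pattern matches (hypothetical helper)."""
    return TAG.sub("", text)


print(strip_tags("<p>Hello <b>world</b></p>"))  # Hello world
print(strip_tags('<a b="1">x</a>'))             # x
```

Both inputs are deliberately simple: the only quoted attribute value is a single character long, which is all that `.?` allows between the quotes.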
Upvotes: 0
Reputation: 98388
The regex looks intended to remove HTML tags from input.
It matches text beginning with < and ending with >, containing non->/non-quotes or quoted strings (which may contain >). But it appears to have an error:
The .? says that quotes may contain 0 or 1 character; it was probably intended to be .*? (0 or more characters). And to prevent backtracking from doing things like making the . match a quote in some odd cases, it needs to change the (?: ... ) grouping to be possessive (> instead of :).
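The effect of `.?` versus `.*?` can be checked with a small comparison. This is a sketch in Python rather than Perl (an assumption for illustration), with `re.DOTALL` standing in for `/s`; the names `ORIGINAL` and `PROPOSED` and the sample input are made up for the example. It covers only the `.?` to `.*?` change, not the change to the grouping.

```python
import re

# The pattern from the question: at most one character between the quotes.
ORIGINAL = re.compile(r"""<(?:[^>'"]|(['"]).?\1)*>""", re.DOTALL)

# The suggested fix: any number of characters, as few as possible.
PROPOSED = re.compile(r"""<(?:[^>'"]|(['"]).*?\1)*>""", re.DOTALL)

html = '<a title="x>y">link</a>'

# The quoted value "x>y" is longer than one character, so the original
# pattern cannot match the opening tag and only removes </a>.
print(ORIGINAL.sub("", html))  # <a title="x>y">link

# With .*? the quoted string is consumed whole, including the '>' inside it.
print(PROPOSED.sub("", html))  # link
```

The same thing happens with ordinary attributes such as `href="page.html"`: any quoted value longer than one character stops the original pattern from matching the opening tag.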
Upvotes: 1