netbrain

Reputation: 9304

What does this regex match? PHP regex from the MediaWiki source code

This is a regex from MediaWiki, an open-source wiki solution.

/\[((http\:\/\/|https\:\/\/|ftp\:\/\/|irc\:\/\/|ircs\:\/\/|gopher\:\/\/|telnet\:\/\/|nntp\:\/\/|worldwind\:\/\/|mailto\:|news\:|svn\:\/\/|git\:\/\/|mms\:\/\/|\/\/)[^][<>"\x00-\x20\x7F\p{Zs}]+)\p{Zs}*([^\]\x00-\x08\x0a-\x1F]*?)\]/Su

To me it seems like it matches URIs, but I can't get it to match anything. I'm also having trouble understanding the last part of the regex, namely:

[^][<>"\x00-\x20\x7F\p{Zs}]+)\p{Zs}*([^\]\x00-\x08\x0a-\x1F]*?)\]

What the heck does this do?

Any help on decoding this is greatly appreciated.

Upvotes: 2

Views: 268

Answers (2)

stema

Reputation: 92986

[^][<>"\x00-\x20\x7F\p{Zs}]
is a negated character class that matches any single character except: ], [, <, >, ", the ASCII range \x00-\x20 (the control characters and the space), the ASCII character \x7F (DEL), and any Unicode space separator (\p{Zs} is a Unicode character property that matches any character in the "Space Separator" category, for example the no-break space).

\p{Zs}* matches zero or more space-separator characters.

[^\]\x00-\x08\x0a-\x1F]
is a negated character class that matches any single character except ] and the ASCII ranges \x00-\x08 and \x0a-\x1F, that is, every control character except the tab (\x09).
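
To check the description above, here is a minimal sketch that tests each character class against a single character. The constant and function names (URL_CHAR, LABEL_CHAR, isUrlChar, isLabelChar) are made up for illustration and do not come from MediaWiki; the classes themselves are copied from the question and only anchored with ^ and \z.

```php
<?php
// The two negated classes from the question, anchored so that each pattern
// accepts exactly one character. The names are hypothetical, not MediaWiki's.
const URL_CHAR   = '/^[^][<>"\x00-\x20\x7F\p{Zs}]\z/u';
const LABEL_CHAR = '/^[^\]\x00-\x08\x0a-\x1F]\z/u';

// True if $ch may appear in the URL part of the link.
function isUrlChar(string $ch): bool
{
    return preg_match(URL_CHAR, $ch) === 1;
}

// True if $ch may appear in the label part of the link.
function isLabelChar(string $ch): bool
{
    return preg_match(LABEL_CHAR, $ch) === 1;
}

var_dump(isUrlChar('a'));          // bool(true): ordinary character
var_dump(isUrlChar(' '));          // bool(false): \x20 is excluded
var_dump(isUrlChar("\u{00A0}"));   // bool(false): no-break space is in \p{Zs}
var_dump(isLabelChar(' '));        // bool(true): spaces are allowed in the label
var_dump(isLabelChar("\t"));       // bool(true): \x09 is not in either excluded range
var_dump(isLabelChar(']'));        // bool(false): ] ends the link
```

So the first class describes "a character that can be part of the URL", while the second is much more permissive and only rules out the closing bracket and most control characters.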

Upvotes: 3

splash

Reputation: 13327

This regex matches external links written in square brackets, such as:

[http://www.stackoverflow.com]
[https://www.stackoverflow.com StackOverflow]
[ftp://ftp.mozilla.org Mozilla]
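
As a rough sketch of how to try this out, the pattern from the question can be passed to preg_match as is. The constant EXT_LINK_PATTERN and the helper parseExternalLink are hypothetical names chosen for this example, not MediaWiki's own code. Group 1 is the full URL, group 2 the protocol, and group 3 the optional label.

```php
<?php
// Pattern copied verbatim from the question. The single-quoted string keeps
// every backslash intact so PCRE sees the same escapes.
const EXT_LINK_PATTERN = '/\[((http\:\/\/|https\:\/\/|ftp\:\/\/|irc\:\/\/|ircs\:\/\/|gopher\:\/\/|telnet\:\/\/|nntp\:\/\/|worldwind\:\/\/|mailto\:|news\:|svn\:\/\/|git\:\/\/|mms\:\/\/|\/\/)[^][<>"\x00-\x20\x7F\p{Zs}]+)\p{Zs}*([^\]\x00-\x08\x0a-\x1F]*?)\]/Su';

// Hypothetical helper: returns the parts of the first bracketed external
// link found in $wikitext, or null if there is none.
function parseExternalLink(string $wikitext): ?array
{
    if (preg_match(EXT_LINK_PATTERN, $wikitext, $m) !== 1) {
        return null;
    }
    return [
        'url'      => $m[1],        // protocol plus everything up to the first space or ]
        'protocol' => $m[2],        // one of the alternatives, e.g. "http://"
        'label'    => $m[3] ?? '',  // text after the URL, empty if absent
    ];
}

$samples = [
    '[http://www.stackoverflow.com]',
    '[https://www.stackoverflow.com StackOverflow]',
    '[ftp://ftp.mozilla.org Mozilla]',
    'http://www.stackoverflow.com',   // no brackets, so no match
];

foreach ($samples as $sample) {
    $link = parseExternalLink($sample);
    echo $sample, ' => ', $link === null ? 'no match' : json_encode($link), PHP_EOL;
}
```

The last sample hints at one possible reason for "can't get it to match anything": the pattern requires the literal [ and ] around the link, so a bare URL on its own does not match.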

Upvotes: 4
