Reputation: 3099
I'm looking for an elegant regular expression to remove bracketed content that looks like a file name, and to strip the brackets from everything else.
[Nibh justo] elit Nulla [link.pdf] auctor ipsum molestie (link.pdf)
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus
In [Curabitur] et
The result should be:
Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod
non tempus In Curabitur et
I trust there must be a short way to do that. ("File name" here simply means the bracketed text contains a dot. No check for sentences is necessary.)
Thanks for the help.
Upvotes: 3
Views: 532
Reputation: 11181
Try this:
(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])
$result = preg_replace('/(?:[[(]\w+\.\w+[\])])|(?:[[(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\])])/m', '', $subject);
Explanation:
(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])
Options: ^ and $ match at line breaks

Match either the regular expression below (attempting the next alternative only if this one fails) «(?:[\[\(]\w+\.\w+[\]\)])»
   Match the regular expression below «(?:[\[\(]\w+\.\w+[\]\)])»
      Match a single character present in the list below «[\[\(]»
         A [ character «\[»
         A ( character «\(»
      Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match the character “.” literally «\.»
      Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match a single character present in the list below «[\]\)]»
         A ] character «\]»
         A ) character «\)»
Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(?:[\[\(](?=[0-9A-Za-z]))»
   Match the regular expression below «(?:[\[\(](?=[0-9A-Za-z]))»
      Match a single character present in the list below «[\[\(]»
         A [ character «\[»
         A ( character «\(»
      Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=[0-9A-Za-z])»
         Match a single character present in the list below «[0-9A-Za-z]»
            A character in the range between “0” and “9” «0-9»
            A character in the range between “A” and “Z” «A-Z»
            A character in the range between “a” and “z” «a-z»
Or match regular expression number 3 below (the entire match attempt fails if this one fails to match) «(?:(?<=[0-9A-Za-z])[\]\)])»
   Match the regular expression below «(?:(?<=[0-9A-Za-z])[\]\)])»
      Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=[0-9A-Za-z])»
         Match a single character present in the list below «[0-9A-Za-z]»
            A character in the range between “0” and “9” «0-9»
            A character in the range between “A” and “Z” «A-Z»
            A character in the range between “a” and “z” «a-z»
      Match a single character present in the list below «[\]\)]»
         A ] character «\]»
         A ) character «\)»
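The breakdown above can also be carried inside the pattern itself. Here is a minimal sketch, assuming PCRE's `x` (extended) flag is acceptable: whitespace in the pattern is ignored and `#` starts a comment, so each alternative can be annotated in place. The sample `$subject` is made up for illustration. The `(?:...)` wrappers and the `m` flag are left out because the pattern has no `^` or `$` and the alternation is already at top level.

```php
<?php
// Same three alternatives as the one-line pattern, in extended (x) mode.
// Note: the comments inside the pattern must not contain the delimiter (/)
// or a single quote, since both would end the pattern or the PHP string.
$pattern = '/
    [\[(] \w+ \. \w+ [\])]      # whole token such as [link.pdf] or (link.pdf)
  | [\[(] (?=[0-9A-Za-z])       # opening bracket directly before a letter or digit
  | (?<=[0-9A-Za-z]) [\])]      # closing bracket directly after a letter or digit
/x';

// Illustrative input, not taken from the question.
$subject = 'In [Curabitur] et [link.doc](link.doc)';
$result  = preg_replace($pattern, '', $subject);

echo $result, "\n";
```

The first alternative has to come first: at a `[` the engine tries it before falling back to removing the bracket alone, which is how `[link.doc]` disappears entirely while `[Curabitur]` only loses its brackets.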
When the above regex is applied to:
[Nibh justo] elit Nulla [link.pdf] auctor ipsum molestie (link.pdf)
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus
In [Curabitur] et
it produces the required result:
Nibh justo elit Nulla auctor ipsum molestie
Condimentum euismod non tempus
In Curabitur et
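Run literally, the replacement leaves doubled spaces where a file name was removed and a trailing space at the end of the first line. A minimal sketch of one way to tidy that up, assuming it is fine to collapse runs of spaces and tabs and to trim line ends (the two clean-up patterns are an addition, not part of the answer's regex):

```php
<?php
$subject = "[Nibh justo] elit Nulla [link.pdf] auctor ipsum molestie (link.pdf)\n"
         . "Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus\n"
         . "In [Curabitur] et";

// The answer's pattern, unchanged.
$pattern = '/(?:[[(]\w+\.\w+[\])])|(?:[[(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\])])/m';
$result  = preg_replace($pattern, '', $subject);

// Assumed clean-up step: squeeze repeated spaces/tabs into one space ...
$result = preg_replace('/[ \t]{2,}/', ' ', $result);
// ... and drop spaces/tabs left at the end of each line (m makes $ match per line).
$result = preg_replace('/[ \t]+$/m', '', $result);

echo $result, "\n";
```

Line breaks are kept as they are; only horizontal whitespace is touched.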
Upvotes: 2
Reputation: 1296
Something like this?
$str = '[Nibh justo] elit Nulla [link.pdf] auctor ipsum molestie (link.pdf)
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus In
[Curabitur] and other [./beta/link.pfd]';
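// Remove bracketed or parenthesized tokens that look like a file name,
// optionally with a path: [link.pdf], (link.xls), [./beta/link.pfd]
$str = preg_replace('`(\(|\[)[\w/\.-]+\.[a-z]+(\)|\])`i', '', $str);
// Strip the remaining square brackets but keep the text between them
$str = str_replace(array('[', ']'), '', $str);
echo $str;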
The result is:
Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod
non tempus In Curabitur and other
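The two calls can also be folded into a single `preg_replace`, which accepts arrays of patterns and replacements and applies them in order. A rough sketch under that assumption; the function name `strip_file_brackets` is hypothetical, and the third pattern (collapsing all whitespace, including line breaks, into single spaces) is an addition that turns the text into one flowing line:

```php
<?php
// Hypothetical helper: removes bracketed file names, strips leftover
// square brackets, then normalizes whitespace to single spaces.
function strip_file_brackets(string $text): string
{
    $cleaned = preg_replace(
        array(
            '`[(\[][\w/.-]+\.[a-z]+[)\]]`i', // [file.ext] or (file.ext), path allowed
            '`[\[\]]`',                      // any square bracket still present
            '`\s+`',                         // runs of whitespace, newlines included
        ),
        array('', '', ' '),
        $text
    );

    return trim($cleaned);
}

$str = "[Nibh justo] elit Nulla [link.pdf] auctor ipsum molestie (link.pdf)\n"
     . "Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus In\n"
     . "[Curabitur] and other [./beta/link.pfd]";

echo strip_file_brackets($str), "\n";
```

If the original line breaks should survive, the third pattern can be swapped for one that only matches spaces and tabs.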
Upvotes: 3