Reputation: 5550
I want to end up with a one-line string of HTML stored in a PHP variable. The string comes from an HTML page that normally has newlines and whitespace between tags. There can be one or more newlines and/or whitespace characters, as in this example:
<h1>tag1</h>
<p>Between h ad p we have \s and \n</p>
After running preg_replace with a regex, I would like to get this:
<h1>tag1</h><p>Between h ad p we have \s and \n</p>
I have tried this regex, but it is not working.
$str=<<<EOF
<h1>tag1</h>
<p>Between h ad p we have \s and \n</p>
EOF;
$string = trim(preg_replace('/(>\s+<)|(>\n+<)/', ' ', $str));
Here you can find the entire code http://www.phpliveregex.com/p/7Pn
Upvotes: 2
Views: 3098
Reputation: 26667
There are two problems with
preg_replace('/(>\s+<)|(>\n+<)/', ' ', $str)
First, \s already includes \n, so there is no need for the second alternative.
Second, (>\s+<) consumes both angle brackets, > and <, so replacing the match with a space removes the brackets along with the whitespace.
The output is
<h1>tag1</h p>Between h ad p we have \s and \n</p>
which is not what you want.
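To make the failure concrete, here is a minimal sketch that runs the question's pattern against the sample input. It assumes a nowdoc (the quoted 'EOF') so that the \s and \n inside the text stay literal characters; the variable name $broken is only for illustration.

```php
<?php
// Nowdoc (quoted 'EOF') keeps the \s and \n inside the text literal.
$str = <<<'EOF'
<h1>tag1</h>
<p>Between h ad p we have \s and \n</p>
EOF;

// The pattern matches ">", the whitespace, and "<" together,
// so all three characters are replaced by the single space.
$broken = trim(preg_replace('/(>\s+<)|(>\n+<)/', ' ', $str));

echo $broken, PHP_EOL;
// <h1>tag1</h p>Between h ad p we have \s and \n</p>
```

The closing > of the first tag and the opening < of the second are gone, which is why the result is no longer valid markup.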
How to correct
Use the regex (>\s+<) with the replacement string ><
This gives the output
<h1>tag1</h><p>Between h ad p we have \s and \n</p>
For an example, see http://regex101.com/r/dI1cP2/2
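As a quick check of this fix, the sketch below puts the brackets back through the replacement string. The extra blank lines and indentation in the sample are an assumption added here to show that \s+ covers any mix of spaces and newlines; the capture group is dropped because nothing refers to it.

```php
<?php
// Sample with several newlines and spaces between the two tags.
$str = <<<'EOF'
<h1>tag1</h>


   <p>Between h ad p we have \s and \n</p>
EOF;

// Match ">", the whitespace run, and "<", then write "><" back
// so only the whitespace is effectively removed.
$fixed = preg_replace('/>\s+</', '><', $str);

echo $fixed, PHP_EOL;
// <h1>tag1</h><p>Between h ad p we have \s and \n</p>
```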
You can also use lookarounds to solve the issue. The regex would be
(?<=>)\s+(?=<)
and the replacement string would be the empty string.
Explanation
(?<=>) asserts that \s is preceded by >
\s+ matches one or more whitespace characters
(?=<) asserts that \s is followed by <
Unlike the earlier regex, the lookarounds do not consume the angle brackets.
See http://regex101.com/r/dI1cP2/3 for an example.
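The lookaround version could be wrapped in a small helper, sketched below. The function name stripWhitespaceBetweenTags is hypothetical, and the code assumes PHP 7 or later for the type hints and the ?? operator.

```php
<?php
// Hypothetical helper: removes whitespace that sits directly between
// a ">" and a "<". The lookarounds only assert the brackets, so the
// brackets themselves are never part of the match.
function stripWhitespaceBetweenTags(string $html): string
{
    // preg_replace returns null on failure; fall back to the input.
    return preg_replace('/(?<=>)\s+(?=<)/', '', $html) ?? $html;
}

$joined = stripWhitespaceBetweenTags("<h1>tag1</h>\n \n<p>Between h and p</p>");
echo $joined, PHP_EOL;
// <h1>tag1</h><p>Between h and p</p>

// Whitespace inside text content is not touched.
$inner = stripWhitespaceBetweenTags('<p>keep   inner   spaces</p>');
echo $inner, PHP_EOL;
// <p>keep   inner   spaces</p>
```

Note that this also removes whitespace between tags inside elements such as pre, where it can be significant, so it is best suited to markup where inter-tag whitespace carries no meaning.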
Upvotes: 5
Reputation: 67968
(?<=<\/h>)\s+
Try this and replace with an empty string. See the demo:
http://regex101.com/r/jI8lV7/1
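A short sketch of how this pattern behaves, using made-up sample strings: the lookbehind is tied to the literal text </h>, so it handles the question's sample but leaves whitespace after any other closing tag alone.

```php
<?php
$pattern = '/(?<=<\/h>)\s+/';

// Whitespace directly after "</h>" is removed.
$a = preg_replace($pattern, '', "<h1>tag1</h>\n<p>text</p>");
echo $a, PHP_EOL;
// <h1>tag1</h><p>text</p>

// Whitespace after other tags is left in place, because the
// lookbehind never matches there.
$b = preg_replace($pattern, '', "<p>one</p>\n<p>two</p>");
var_dump($b === "<p>one</p>\n<p>two</p>");
// bool(true)
```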
Upvotes: 0
Reputation: 9782
You can try with this:
echo preg_replace("/(?=\>\s+\n|\n)+(\s+)/", "", $str);
Upvotes: 1