Reputation: 15176
First part of question: p tag
I have a string that contains text with unnecessary line breaks caused by p tags, for example:
<p>hi everyone,</p>
<p> </p>
<p> </p>
<p> </p>
<p>Here comes the content I wanted to write...</p>
I would like to filter these empty p tags and merge them into one:
<p>hi everyone,</p>
<p> </p>
<p>Here comes the content I wanted to write...</p>
How can this be done?
Thank you!
Second part of question: br tag
Sometimes the string also contains br tags that cause extra line breaks, for example:
that is all I wanted to write.<br />
<br />
<br />
<br />
<br />
<br />
bye
This should become:
that is all I wanted to write.<br />
<br />
bye
Upvotes: 1
Views: 2311
Reputation: 9302
Try str_replace. Note that it removes every exact occurrence of the listed strings rather than merging them into one:
$content = str_replace(array("<p> </p>\n", " <br />\n"), array('', ''), $content);
To collapse a run of empty p tags into a single one with a regex:
$content = preg_replace('/((<p\s*\/?>\s*) (<\/p\s*\/?>\s*))+/im', "<p> </p>\n", $content);
And for br tags:
$content = preg_replace('/( (<br\s*\/?>\s*)|(<br\s*\/?>\s*))+/im', "<br />\n", $content);
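As a quick check, the two preg_replace calls above can be run against the sample strings from the question. This is only a sketch: the function name collapse_blank_markup is made up for illustration, and the inputs assume the empty paragraphs contain a plain space.

```php
<?php
// Hypothetical helper (not part of the original answer): applies the two
// patterns from the answer one after the other.
function collapse_blank_markup(string $html): string
{
    // A run of "<p> </p>" blocks becomes a single empty paragraph.
    $html = preg_replace('/((<p\s*\/?>\s*) (<\/p\s*\/?>\s*))+/im', "<p> </p>\n", $html);
    // A run of <br>, <br/> or <br /> tags becomes a single <br />.
    return preg_replace('/( (<br\s*\/?>\s*)|(<br\s*\/?>\s*))+/im', "<br />\n", $html);
}

// Sample input taken from the first part of the question.
$paragraphs = "<p>hi everyone,</p>\n<p> </p>\n<p> </p>\n<p> </p>\n"
    . "<p>Here comes the content I wanted to write...</p>";
echo collapse_blank_markup($paragraphs), "\n";

// Sample input modelled on the second part of the question.
$breaks = "that is all I wanted to write.<br />\n<br />\n<br />\n<br />\nbye";
echo collapse_blank_markup($breaks), "\n";
```

With these inputs the three empty paragraphs shrink to one. The br run, including the first tag, shrinks to a single `<br />`, so the extra blank line shown in the question's desired output is not preserved; the replacement string would need adjusting if that line should survive.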
EDIT Here's why your regex works (hopefully so you can understand it a bit :) ):
/((\\n\s*))+/im
^  ^^^ ^^  ^^^^
|  \|/ ||  ||\|
|   |  ||  || -- Flags
|   |  ||  |-- Regex end character
|   |  ||  -- One or more of the preceding expression
|   |  |-- Zero or more of the preceding character(s)
|   |  -- Whitespace character
|   -- Newline character (escaped)
-- Regex start character
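Every pattern passed to the preg_* functions must be wrapped in delimiters, normally the same character at the start and the end. In this case, I've used the forward slash character.
The ( character opens a group, which is closed by the matching ). Grouping lets a quantifier such as + apply to the whole sub-expression.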
The newline character is \n. Inside a PHP string literal, \\ stands for a single backslash, so \\n reaches the regex engine as \n, which matches a newline.
The whitespace character class is \s. It matches a single whitespace character (space, tab, newline and so on). The * character means to search for zero or more of the preceding expression, in this case zero or more whitespace characters: \s*.
The + symbol searches for ONE or more of the preceding expression. In this case, the preceding expression is (\\n\s*), so as long as that expression is found once or more, the preg_replace function will find something.
The flags I've used, i and m, mean case *I*nsensitive (not really needed for a newline expression) and *M*ultiline. Multiline makes ^ and $ match at the start and end of every line instead of only at the start and end of the whole string; this pattern has no anchors, so that flag is not really needed here either.
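To see the explanation in action, here is a small sketch that runs the pattern against a made-up string. The input text and the "\n" replacement are assumptions for illustration, since the original call is not shown here. It also compares the pattern with a variant that drops the doubled backslash and the flags.

```php
<?php
// Assumed input and replacement, for illustration only.
$text = "first line\n\n   \n\nsecond line";

// In a PHP string literal \\ becomes one backslash, so PCRE sees \n.
$escaped = preg_replace('/((\\n\s*))+/im', "\n", $text);

// Same idea with a single backslash and without the i and m flags.
$plain = preg_replace('/(\n\s*)+/', "\n", $text);

// Both calls collapse the run of blank lines into one newline.
var_dump($escaped === $plain);
var_dump($escaped === "first line\nsecond line");
```

Both var_dump calls print bool(true): the run of newlines and whitespace between the two lines is replaced by a single newline either way.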
Upvotes: 3