matthewpavkov

Reputation: 2928

What is the correct regex (for PHP preg_replace) to remove empty paragraph ( <p> ) tags?

I'm working in WordPress and need to remove images and empty paragraphs. So far, I've been able to remove images without a problem, but I then need to remove the empty paragraph tags that are left behind. I'm using PHP's preg_replace to apply the regexes.

So, as an example, I have the string:

<p style="text-align:center;"><img src="http://www.blah.com/image.jpg" alt="Blah Image" /></p><p>Some text</p>

I run this regex on it:

/<img.*?(>)/

And I end up with this string:

<p style="text-align:center;"></p><p>Some text</p>

I then need to be able to remove the empty paragraph. I tried this, but it removes all paragraphs and the contents of the paragraphs:

/<p[^>]*><\/p[^>]*>/

Any help or suggestions are greatly appreciated!

Upvotes: 0

Views: 2261

Answers (2)

Dagg Nabbit

Reputation: 76756

/<p[^>]*><\/p[^>]*>/ (the regex you gave) should work fine. If it's giving you trouble, you could try doubling the backslash that escapes the /, like this: /<p[^>]*><\\/p[^>]*>/

PHP is funny about quoting and escape characters. For example "\n" is not equal to '\n'. The first is a line break, the second is a literal backslash followed by an 'n'. The PHP manual entry on string literals is probably worth a quick look.
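To make that concrete, here is a minimal sketch that runs the pattern from the question against the sample string. It assumes the pattern is written in a single-quoted PHP string, so the backslash reaches the regex engine unchanged. The variable names are only illustrative.

```php
<?php
// Sample input from the question, after the <img> tag has already been stripped.
$html = '<p style="text-align:center;"></p><p>Some text</p>';

// Single-quoted pattern: PHP passes the backslash through to the regex engine as-is.
$pattern = '/<p[^>]*><\/p[^>]*>/';

// Only a <p ...> immediately followed by </p> matches, so the second paragraph survives.
$result = preg_replace($pattern, '', $html);
echo $result, "\n";

// The quoting difference mentioned above: "\n" is one character, '\n' is two.
var_dump(strlen("\n"));
var_dump(strlen('\n'));
```

If the pattern behaves differently in your code, it is worth checking how the string holding it is quoted before changing the regex itself.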

Upvotes: 0

webbiedave

Reputation: 48887

The correct regex is no regex. Use an HTML/DOM Parser instead. They're simple to use. Regex is for regular languages (which HTML is not).
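As a rough sketch of what that could look like, the snippet below uses PHP's built-in DOMDocument and DOMXPath to drop every <img> and then every <p> that is left without child elements or non-whitespace text. The function name is hypothetical (it is not part of WordPress or PHP), and the sketch assumes the input is a small, ASCII-only HTML fragment; charset handling is left out for brevity.

```php
<?php
// Hypothetical helper name, chosen for illustration only.
function strip_images_and_empty_paragraphs(string $html): string
{
    // Collect parser warnings internally instead of emitting them for fragment input.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // loadHTML wraps the fragment in <html><body>...</body></html>.
    // Assumption: ASCII input; non-ASCII text would need an explicit charset hint.
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Step 1: remove every <img>. Copy the node list first so removal is safe.
    foreach (iterator_to_array($xpath->query('//img')) as $img) {
        $img->parentNode->removeChild($img);
    }

    // Step 2: remove every <p> with no child elements and no non-whitespace text.
    $emptyParagraphs = $xpath->query('//p[not(*) and not(normalize-space())]');
    foreach (iterator_to_array($emptyParagraphs) as $p) {
        $p->parentNode->removeChild($p);
    }

    // Serialize only the children of <body> to get the fragment back.
    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

$input = '<p style="text-align:center;"><img src="http://www.blah.com/image.jpg" alt="Blah Image" /></p><p>Some text</p>';
$out = strip_images_and_empty_paragraphs($input);
echo $out, "\n";
```

Because the images are removed before the paragraphs are checked, a paragraph that contained only an image is treated as empty, which is the case in the question.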

Upvotes: 3
