Fizzix

Reputation: 24385

How to remove text from a string that is not surrounded by HTML tags?

So basically I have a large string (a few paragraphs long).

I need to remove all text from this string that is not surrounded by any HTML tags.

For example, this string:

<h1>This is the title</h1>This is a bit of text with no HTML around it<p>This is within a paragraph tag</p>

Should be converted to:

<h1>This is the title</h1><p>This is within a paragraph tag</p>

I believe this is best done with regex, although I am not very familiar with its syntax.

All help is greatly appreciated.


This is what I ended up using:

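<?php
$string = '<h1>This is the title</h1>This is a bit of text with no HTML around it<p>This is within a paragraph tag</p>';
// Match a closing tag, any tag-free text after it, and the next tag.
$pattern = '/(<\/[^>]+>)[^<]*(<[^>]+>)/';
// Keep only the two captured tags, dropping the text between them.
$replacement = '$1$2';
echo preg_replace($pattern, $replacement, $string);
?>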

Upvotes: 1

Views: 757

Answers (1)

alpha bravo

Reputation: 7948

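You could use this regex: (<\/[^>]+>)[^<]*(<[^>]+>) and replace with $1$2. It matches a closing tag, the tag-free text that follows it, and the next tag, then keeps only the two tags.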
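
One edge case worth noting: because the pattern consumes the tag after the loose text, that tag cannot begin the next match. In input such as </p>one</div>two<p>, the second run of text would be left in place. The following is a minimal sketch of a variant that avoids this, not the method from the answer above. It assumes PCRE's \K (which drops the closing tag from the reported match) and a lookahead for the next tag. The function name strip_untagged_text is hypothetical.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function strip_untagged_text(string $html): string
{
    // Match a closing tag, then \K drops it from the reported match so only
    // the loose text is replaced. The lookahead checks for the next tag
    // without consuming it, so that tag can start the following match.
    $pattern = '/<\/[^>]+>\K[^<]+(?=<[^>]+>)/';
    $result = preg_replace($pattern, '', $html);

    // preg_replace returns null on error; fall back to the unchanged input.
    return $result ?? $html;
}

echo strip_untagged_text('<h1>This is the title</h1>This is a bit of text with no HTML around it<p>This is within a paragraph tag</p>');
// prints: <h1>This is the title</h1><p>This is within a paragraph tag</p>
```

Like the original pattern, this sketch leaves text before the first tag or after the last tag untouched, and it treats any text between a closing tag and the next tag as untagged, even when an outer element encloses it. For arbitrary HTML, a DOM parser is likely to be more robust than a regex.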

Upvotes: 3
