whitesiroi

Reputation: 2843

How to remove all tags together with the text inside them in PHP


I read other SO answers, but they didn't work as expected. I tried /<[^>]*>/ and other regular expressions but couldn't make them work, and strip_tags only removes the tags themselves while leaving the text inside them.

Here is the example I have: http://www.regexr.com/3dmif

How do I also delete tags that are nested inside other tags? For example:

<a>test</a> hello mate <p> test2 <a> test3 </a></p>

The output should be: hello mate
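
To make the problem concrete, here is a minimal sketch of what strip_tags and the /<[^>]*>/ pattern do to the example string. Both remove only the markup and keep the text that was inside the tags, which is why neither gives the desired result. The variable names are illustrative only.

```php
<?php
$html = "<a>test</a> hello mate <p> test2 <a> test3 </a></p>";

// strip_tags() removes the tags themselves but keeps their inner text.
$stripped = strip_tags($html);

// The pattern /<[^>]*>/ matches only the tags, so the result is the same.
$regexed = preg_replace('/<[^>]*>/', '', $html);

var_dump($stripped === $regexed); // bool(true)
echo $stripped, "\n";             // test hello mate  test2  test3
```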

Upvotes: 2

Views: 118

Answers (2)

Mario

Reputation: 3379

Getting this result with a regular expression is hard, because the pattern would need to understand how HTML elements nest, and regular expressions cannot do that in general. A regex is therefore a poor fit here.

A simpler solution is to parse the HTML and keep only the text nodes that are direct children of the body.

This snippet solves the example you gave, but you may have to extend or change it depending on your needs.

<?php 
// Create a DOM document and load the HTML string
$dom = new DOMDocument;
$dom->loadHTML("<a>test</a> hello mate <p> test2 <a> test3 </a></p>");

// loadHTML wraps the fragment in <html><body>,
// so start from the body element
$body = $dom->getElementsByTagName('body')->item(0);

// Accumulates the collected text
$text = '';

// Iterate over the direct children of the body
// and keep only the text nodes:
foreach($body->childNodes as $node)
{
    if ($node->nodeName === '#text')
    {
        $text .= $node->nodeValue;
    }
}

var_dump($text); // hello mate
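
The same idea can be wrapped in a reusable function. This is only a sketch under a few assumptions: the name topLevelText is hypothetical, the input is a non-empty HTML fragment, and surrounding whitespace is unwanted (hence the trim). It checks nodeType against XML_TEXT_NODE instead of comparing nodeName, and collects parser warnings internally so malformed markup does not print notices.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the text
// that sits directly inside <body>, skipping every element and its contents.
function topLevelText(string $html): string
{
    // Collect libxml warnings internally instead of emitting them,
    // since real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $text = '';
    foreach ($body->childNodes as $node) {
        // Keep only the text nodes that are direct children of the body.
        if ($node->nodeType === XML_TEXT_NODE) {
            $text .= $node->nodeValue;
        }
    }

    return trim($text);
}

echo topLevelText("<a>test</a> hello mate <p> test2 <a> test3 </a></p>"), "\n";
```

Note that loadHTML assumes ISO-8859-1 unless the markup declares another encoding, so non-ASCII input may need extra handling.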

Cheers.

Edit:

As @splash58 pointed out, you can also use XPath to access the text nodes directly.

<?php 
// Create a DOM document and load the HTML string
$dom = new DOMDocument;
$dom->loadHTML("<a>test</a> hello mate <p> test2 <a> test3 </a></p>");
$xpath = new DOMXPath($dom);

$text = '';

foreach ($xpath->query("/html/body/text()") as $node) 
{
    $text .= $node->nodeValue;
}

var_dump($text); // hello mate

Upvotes: 3

vishal

Reputation: 59

This snippet handles the example you gave; I hope it helps. Note that the pattern only matches tags without attributes.

<?php

$title = "<a>test</a> hello mate <p> test2 <a> test3 </a></p>";

$result = preg_replace("(<([a-z]+)>.*?</\\1>)is","",$title);
echo $result;   // hello mate

?>
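
As a hedged illustration of where this pattern stops working, the sketch below runs it on two inputs that are not from the question: one with an attribute on the opening tag, and one where a tag is nested inside a tag of the same name. The variable names are illustrative.

```php
<?php
$pattern = "(<([a-z]+)>.*?</\\1>)is";

// An attribute keeps the opening tag from matching, so the link survives
// while the attribute-free <p> element is still removed.
$withAttribute = preg_replace($pattern, "", '<a href="#">test</a> hello mate <p>test2</p>');
echo $withAttribute, "\n";

// With same-name nesting, the lazy match ends at the first closing tag,
// leaving the tail of the outer element behind.
$nestedSameTag = preg_replace($pattern, "", '<div>a<div>b</div>c</div> hello mate');
echo $nestedSameTag, "\n";
```

For input like this, the DOM-based approach is the safer choice.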

Upvotes: 1
