Araz Jafaripur

Reputation: 906

Replace img tag with the title attribute

I have an HTML string with the following content:

<p>your name :
<img title="##name##" src="name.jpg"/></p>
<p>your lastname:
<img title="##lastname##" src="lastname.jpg"/></p>
<p>your email :
<img title="##email##" src="email.jpg"/></p>
<p>submit
<img title="submit" src="submit.jpg"/></p>

Now I want to extract every title attribute whose value is wrapped in a pair of ## markers, remove the corresponding <img> tag, and put the extracted title in its place. Images whose title is not wrapped in ## should be left untouched.

The result should look like this:

<p>your name :
##name##</p>
<p>your lastname:
##lastname##</p>
<p>your email :
##email##</p>
<p>submit
<img title="submit" src="submit.jpg"/></p>

What's the best way to do this?

Upvotes: 0

Views: 282

Answers (4)

Manu

Reputation: 931

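Try this (the [^>] classes keep each match inside a single <img> tag, even when several images share a line):

$content = preg_replace('/<img[^>]*?(##[^#>]+##)[^>]*>/', '$1', $content);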

Upvotes: 1

Adesh Pandey

Reputation: 769

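You could try this. The first line replaces each matching <img> tag with its ##...## title; the second line strips the ## markers, so leave it out if you want to keep them as in the expected result:

$content = preg_replace('/<img[^>]*?(##[^#>]+##)[^>]*>/', '${1}', $content);
$content = str_replace('##', '', $content);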

Upvotes: 1

Amal Murali

Reputation: 76646

Use an HTML parser to achieve this task. Here's a solution using the built-in DOMDocument class:

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);


$tags = $dom->getElementsByTagName('img');
$length = $tags->length;

for ($i=$length-1; $i>=0; $i--) {
    $tag = $tags->item($i);
    $title = $tag->getAttribute('title');

    // check that the whole title has the format '##...##'
    if (preg_match('/^##\w+##$/', $title)) {
        $textNode = $dom->createTextNode($title);
        $tag->parentNode->replaceChild($textNode, $tag);
    }
}

$html = preg_replace(
    '~<(?:!DOCTYPE|/?(?:html|head|body))[^>]*>\s*~i', '', 
    $dom->saveHTML()
);
echo $html;

Output:

<p>your name :
##name##</p>
<p>your lastname:
##lastname##</p>
<p>your email :
##email##</p>
<p>submit
<img title="submit" src="submit.jpg"></p>

Upvotes: 1

winkbrace

Reputation: 2711

First, you want to select every region that starts with "<img", then contains "##", then one or more characters, then "##", and ends with ">".

Then in that extracted block you want to find the part that starts with "##", then 1 or more characters, then ends with "##".

By writing it out like this, I hope you can come up with the regex that does this.
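
For reference, the two steps described above can be sketched roughly as follows. This is only one possible reading of the description, not necessarily the regex the answer had in mind; the function name replaceImgWithTitle is made up for illustration, and the [^>] and [^#>] character classes are an added assumption to keep each match inside a single tag.

```php
<?php
// Hypothetical helper (name chosen for this example only).
function replaceImgWithTitle(string $html): string
{
    // Step 1: select every region that starts with "<img", contains
    // "##", one or more characters, "##", and ends with ">".
    $result = preg_replace_callback(
        '/<img\b[^>]*##[^#>]+##[^>]*>/i',
        function (array $match): string {
            // Step 2: inside the selected tag, find the "##...##" part.
            if (preg_match('/##[^#>]+##/', $match[0], $inner)) {
                return $inner[0];
            }
            return $match[0]; // nothing found: keep the tag as it was
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = "<p>your name :\n<img title=\"##name##\" src=\"name.jpg\"/></p>\n"
      . "<p>submit\n<img title=\"submit\" src=\"submit.jpg\"/></p>";

echo replaceImgWithTitle($html);
```

With this input, the first <img> is replaced by ##name## and the submit image is left alone, because its tag contains no ## pair.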

Upvotes: 0
