Bv202

Reputation: 4044

Replace iframe with link with regex

I currently have this string:

"<p><iframe allowfullscreen="" class="media-element file-default" data-fid="2219" data-media-element="1" frameborder="0" height="360" src="https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed" width="640"></iframe></p>"

I'd like to remove the whole iframe element (<iframe>...</iframe>) and replace it with an <a> link to the url in the src attribute:

<p><a href="https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed">Link to youtube</a></p>

Currently, I have this regex:

$res = preg_replace('/src="(.+?)"/', '/<a href="$1">Link to youtube</a>/', $str);

With this regex, I'm able to replace the src attribute with an <a> element. However, I'd like to replace the whole iframe element.

What is the easiest way to achieve this?

Upvotes: 2

Views: 5816

Answers (3)

Kaspar Lee

Reputation: 5596

Use this RegEx:

<iframe\s+.*?\s+src=(".*?").*?<\/iframe>

And this replacement:

<a href=$1>Link to youtube</a>

Which gives you the following preg_replace():

$res = preg_replace('/<iframe\s+.*?\s+src=(".*?").*?<\/iframe>/', '<a href=$1>Link to youtube</a>', $str);



The RegEx also matches everything before and after the src attribute, so the whole iframe element is replaced, not just the attribute.

How it works:

<iframe          # Opening <iframe
\s+              # Whitespace
.*?              # Optional Data (Lazy so as not to capture the src)
\s+              # Whitespace
src=             # src Attribute
(".*?")          # src Data (i.e. "https://www.example.org")
.*?              # Optional Data (Lazy so as not to capture the closing </iframe>)
<\/iframe>       # Closing </iframe>

Thanks to @AlexBor for pointing out that the following is slightly more efficient. I would suggest using this RegEx instead:

<iframe\s+.*?\s+src=("[^"]+").*?<\/iframe>

Replaced src=(".*?") (lazy) with src=("[^"]+") (greedy)
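
For reference, here is a minimal, self-contained sketch of the suggested pattern applied to the string from the question. The variable names $pattern and $replacement are only illustrative. Note that the replacement string is plain text and is not wrapped in regex delimiters.

```php
<?php
// Sample input taken from the question.
$str = '<p><iframe allowfullscreen="" class="media-element file-default" data-fid="2219" data-media-element="1" frameborder="0" height="360" src="https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed" width="640"></iframe></p>';

// The capture group includes the surrounding double quotes,
// so href=$1 in the replacement is already quoted.
$pattern = '/<iframe\s+.*?\s+src=("[^"]+").*?<\/iframe>/';
$replacement = '<a href=$1>Link to youtube</a>';

$res = preg_replace($pattern, $replacement, $str);
echo $res, PHP_EOL;
```

Two caveats with this pattern: it expects at least one other attribute before src (it will not match <iframe src="...">), and without the s modifier the dot does not match newlines, so an iframe spread over several lines would be left untouched.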

Upvotes: 8

mickmackusa

Reputation: 47900

Using a DOM parser like DOMDocument is not going to let you down. Unlike regex, it is HTML-aware. I'll pass the LIBXML_HTML_NOIMPLIED and LIBXML_HTML_NODEFDTD flags to my loadHTML() call so that no doctype or implied <html>/<body> wrapper tags are generated, iterate over all <iframe> elements, create a new <a> element for each one, set its href and link text, then replace the <iframe> with the new <a>.

Code:

$html = <<<HTML
<p><iframe allowfullscreen="" class="media-element file-default" data-fid="2219" data-media-element="1" frameborder="0" height="360" src="https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed" width="640"></iframe></p>
HTML;

$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
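// Copy the live node list to an array first; replacing nodes while iterating a live list can skip elements.
foreach (iterator_to_array($dom->getElementsByTagName('iframe')) as $iframe) {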
    $a = $dom->createElement('a');
    $a->setAttribute('href', $iframe->getAttribute('src'));
    $a->nodeValue = "Link to youtube";
    $iframe->parentNode->replaceChild($a, $iframe);
}
echo $dom->saveHTML();

Output:

<p><a href="https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed">Link to youtube</a></p>

Upvotes: 1

nass

Reputation: 347

The easiest way would be to extract the src attribute with preg_match() and then use it to create an <a> element.

Example:

$string = "<p><iframe allowfullscreen=\"\" class=\"media-element file-default\" data-fid=\"2219\" data-media-element=\"1\" frameborder=\"0\" height=\"360\" src=\"https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed\" width=\"640\"></iframe></p>\n";

if( preg_match( '#src="([^"]*)"#', $string, $matches ) === 1 ){
    $string = '<a href="' . $matches[ 1 ] . '">Link to youtube</a>';
    echo $string;
}

// outputs <a href="https://www.youtube.com/embed/sNEJOm4hSaw?feature=oembed">Link to youtube</a>

Upvotes: 0
