Reputation: 4666
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>-->
<div class="popular-video-image">
<a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>">
<img src="/images/topvideo/1.jpg" alt=""/>
</a>
<span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span>
</div>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
foreach ($dom->getElementsByTagName('a') as $node)
{
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$dom->formatOutput = true;
echo $dom->saveXml($dom->documentElement);
Output:
<html>
<body>
<div class="popular-video-image">
<a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;">
<img src="/images/topvideo/1.jpg" alt=""/></a>
<span class="popular-video-artist ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Like a G6</a></span>
</div>
</body>
</html>
I do not want the html and body tags to be added. I also do not want the <lang> tag in the title attribute to be replaced with &lt;lang&gt;. The &#13; at the end of each line is also unnecessary.
I want to get back the same content that I passed in, only with the links modified.
Sorry for my bad English!
Upvotes: 0
Views: 1781
Reputation: 9408
You are seeing &#13; at the end of each line because your HTML has Windows-style line endings (CR+LF). To get rid of them, convert them to Unix-style line endings (LF) before you feed the HTML into DOMDocument:
$content = preg_replace('/\r\n/', "\n", $content);
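The pattern above only matches CR+LF pairs. As a slightly broader sketch, assuming the input might also contain lone CR characters, the same normalization can be done with str_replace. The helper name normalizeLineEndings is made up for illustration and is not part of the original answer.

```php
<?php
// Hypothetical helper (name is an assumption): converts CR+LF pairs and
// any remaining lone CR characters to LF.
function normalizeLineEndings(string $text): string
{
    // The pairs are replaced first, so a CR+LF never becomes two LFs.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

$sample = "<div>\r\n<a href=\"video/x/\">One</a>\r\n</div>";
$clean  = normalizeLineEndings($sample);

var_dump(strpos($clean, "\r")); // bool(false): no CR is left
```

Run this on $content before calling loadHTML, in place of the preg_replace line.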
Upvotes: 4
Reputation: 4666
<?php
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>-->
<div class="popular-video-image">
<a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>">
<img src="/images/topvideo/1.jpg" alt=""/>
</a>
<span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span>
<span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span>
</div>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($content);
foreach ($dom->getElementsByTagName('a') as $node)
{
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}
$dom->formatOutput = true;
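// Strip the DOCTYPE and the html/body wrappers, then turn the escaped
// &lt; and &gt; back into angle brackets. Note that this un-escapes every
// &lt; and &gt; in the output, not only those in the title attributes.
echo preg_replace('#^<!DOCTYPE.+?>#', '', str_replace(
    array('<html>', '</html>', '<body>', '</body>', "\n\n", '&lt;', '&gt;'),
    array('', '', '', '', '', '<', '>'),
    $dom->saveHTML()
));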
Upvotes: 0
Reputation: 7433
saveXml takes an optional parameter to allow you to specify the node to output.
$dom->saveXml($dom->documentElement->firstChild->firstChild);
This will remove the html and body tags from the output.
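Note that firstChild->firstChild selects only the first node inside body. If the fragment can have several top-level nodes, one possible variation is to serialize each child of body in turn. This is a rough sketch under that assumption; the helper name prefixLinks and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper (not from the original posts): prefixes every link's
// href with $base and returns the markup without the <html>/<body> wrapper.
function prefixLinks(string $html, string $base): string
{
    $dom = new DOMDocument;

    // Collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $node) {
        $node->setAttribute('href', $base . $node->getAttribute('href'));
    }

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body> separately so no wrapper is emitted.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveXML($child);
    }
    return $out;
}

echo prefixLinks('<div><a href="video/x/">One</a></div><p>Two</p>', 'http://mysite.ru/');
```

Because saveXML is still used here, characters such as < inside attribute values are still escaped in the result.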
Upvotes: 3
Reputation: 437326
I guess that the <html> and <body> tags get placed in because you are using loadHTML. Try using loadXML instead.
As for <lang>, its angle brackets have to be escaped as &lt; and &gt; because otherwise the resulting XML would not be valid. If this is causing you problems, you should change your approach a little and work with it, not against it.
Upvotes: 0