Reputation: 11
I have this code. It gets the info I want, but I also get the link text along with the data. For example:
require_once('simple_html_dom.php');
set_time_limit(0);
$url = 'www.domain.com';
$html = file_get_html($url);
// read the first div
foreach ($html->find('#content') as $element) {
    // read the paragraphs inside it
    foreach ($element->find('p') as $phone) {
        echo $phone;
    }
}
Mobile Pixel 2 - google << this is the link text
But I need to remove that link. The problem is that this is what I scrape:
<p>the info that i really need is here<p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p>
I read this: Simple HTML Dom: How to remove elements? But I can't find the answer there.
Update: if I use this:
foreach ($element->find('p[class="text-right"]') as $link)
it selects the links, but I can't remove them from the scraped data.
Upvotes: 1
Views: 341
Reputation: 4837
Or here is a native version using DOMDocument:
PHP-CODE
$sHtml = '<p>the info that i really need is here<p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p>';
$sHtml = '<div id="wrapper">' . $sHtml . '</div>';
echo "org:\n";
echo $sHtml;
echo "\n\n";
$doc = new DOMDocument();
$doc->loadHtml($sHtml);
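// copy the live node list to an array first, otherwise removing nodes while iterating can skip elements
foreach( iterator_to_array( $doc->getElementsByTagName( 'a' ) ) as $element ) {
$element->parentNode->removeChild( $element );
}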
echo "res:\n";
echo $doc->saveHTML($doc->getElementById('wrapper'));
Output
org:
<div id="wrapper"><p>the info that i really need is here<p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p></div>
res:
<div id="wrapper">
<p>the info that i really need is here</p>
<p>
</p>
<p class="text-right"></p>
</div>
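Note that the result above still contains an empty <p class="text-right">. If the goal is to drop the whole link paragraph and keep only the text, a variation with DOMXPath might look like the sketch below. It assumes the unwanted paragraphs can always be recognised by the text-right class (as in the question's sample), and removeLinkParagraphs() is a hypothetical helper name, not part of any library.

```php
<?php
// Sketch only: remove every <p class="text-right"> and return the remaining text.
// removeLinkParagraphs() is a hypothetical helper; the sample markup is from the question.

function removeLinkParagraphs(string $fragment): string
{
    $doc = new DOMDocument();
    // Wrap the fragment so it can be located again after parsing.
    // The @ suppresses libxml warnings about imperfect markup.
    @$doc->loadHTML('<div id="wrapper">' . $fragment . '</div>');

    $xpath = new DOMXPath($doc);
    // Match <p> elements whose class attribute contains the token "text-right".
    $query = '//p[contains(concat(" ", normalize-space(@class), " "), " text-right ")]';

    // Copy the matches to an array so removal cannot disturb the iteration.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Return only the text left inside the wrapper, without surrounding whitespace.
    return trim($doc->getElementById('wrapper')->textContent);
}

$sHtml = '<p>the info that i really need is here<p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p>';

echo removeLinkParagraphs($sHtml), "\n";
```

With the sample markup this should leave only the text of the first paragraph. If the link paragraphs use a different class on the real page, the XPath expression needs to be adjusted accordingly.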
Upvotes: 0
Reputation: 1029
You can use file_get_contents with str_get_html and strip the unwanted paragraphs from the content with str_replace:
include 'simple_html_dom.php';
$content=file_get_contents($url);
$html = str_get_html($content);
// i read the first div
foreach($html->find('#content') as $element){
// i read the second
foreach ($element->find('p[class="text-right"]') as $phone){
$content=str_replace($phone,'',$content);
}
}
print $content;
die;
Upvotes: 1