candy crush

Reputation: 45

How to remove a class using DOM in PHP?

I want to remove the class "refs", which contains the references. The page I am getting the content from (http://www.sacred-destinations.com/mexico/palenque) looks like this:

<div class="col-sm-6 col-md-7" id="essay">
    <section class="refs">
    </section>
</div><!-- end #essay -->

I can't work out how to remove this 'refs' class, since it is set on a "section" element. Here is what I have done so far:

<?php
$url="http://www.sacred-destinations.com/mexico/palenque";
 $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    $html = curl_exec($ch);
    curl_close($ch);
    $newDom = new domDocument;
    libxml_use_internal_errors(true);
    $newDom->loadHTML($html);
    libxml_use_internal_errors(false);
    $newDom->preserveWhiteSpace = false;
    $newDom->validateOnParse = true;
    $sections = $newDom->saveHTML($newDom->getElementById('essay'));
$text=$sections->find('<section class="refs">');
$result=removeClass($text);
echo $result;
?>

Upvotes: 1

Views: 1819

Answers (1)

ThW

Reputation: 19492

DOMDocument has no find() method; use DOMXPath::evaluate() with an XPath expression instead.

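// $html holds the page source fetched with cURL in the question.
$dom = new DOMDocument();
// Collect libxml warnings (e.g. for HTML5 tags like <section>) instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors(false);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);

// Select <section> children of div#essay whose class list contains the token "refs".
$expression =
  '//div[
     @id="essay"
   ]
   /section[
     contains(
       concat(" ", normalize-space(@class), " "), " refs "
     )
   ]';

foreach ($xpath->evaluate($expression) as $section) {
  // Drops the whole class attribute from each matched element.
  $section->removeAttribute('class');
}
echo $dom->saveHTML();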

Class attributes can contain multiple values, like classOne classTwo. normalize-space() strips leading and trailing whitespace and collapses each run of whitespace inside the string to a single space. concat() then adds a space to the start and the end. Searching for " refs " in the result avoids matching the class name as part of another class name.
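To see the difference this makes, here is a small sketch that compares a plain contains(@class, "refs") test with the padded token test from the expression above. The sample markup and the class names refs-list and myrefs are made up for illustration; they are not taken from the page in the question.

```php
<?php
// Hypothetical sample markup: only the second section carries the exact token "refs".
$html = '<div id="essay">'
      . '<section class="refs-list">a</section>'
      . '<section class="  note   refs ">b</section>'
      . '<section class="myrefs">c</section>'
      . '</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_use_internal_errors(false);
$xpath = new DOMXPath($dom);

// Naive substring test: matches any class attribute that contains the letters "refs".
$naive = $xpath->evaluate('//div[@id="essay"]/section[contains(@class, "refs")]');

// Token test: pad the normalized attribute with spaces and look for " refs ".
$exact = $xpath->evaluate(
    '//div[@id="essay"]/section[contains(concat(" ", normalize-space(@class), " "), " refs ")]'
);

$naiveCount = $naive->length;
$exactCount = $exact->length;
$exactText  = $exact->item(0)->textContent;

echo "naive: $naiveCount, exact: $exactCount ($exactText)\n";
```

The naive test also picks up refs-list and myrefs, while the padded test selects only the element whose class list really contains refs, even with irregular whitespace in the attribute.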

In the example the whole class attribute will be removed. To modify it you can read it with DOMElement::getAttribute() and use string functions to change it.
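As a rough sketch of that approach, the helper below removes a single class name and leaves any other classes in place. removeClassName() is a hypothetical function written for this answer, not part of the DOM extension, and the sample markup with the extra class wide is invented. The type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: drop one class name from an element and keep the rest.
function removeClassName(DOMElement $element, string $className): void
{
    // Split the attribute on any whitespace, discarding empty pieces.
    $classes = preg_split('/\s+/', $element->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    $remaining = array_values(array_diff($classes, [$className]));

    if ($remaining === []) {
        // Nothing left: remove the attribute instead of leaving class="".
        $element->removeAttribute('class');
    } else {
        $element->setAttribute('class', implode(' ', $remaining));
    }
}

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML('<div id="essay"><section class="refs wide">text</section></div>');
libxml_use_internal_errors(false);

$section = $dom->getElementsByTagName('section')->item(0);
removeClassName($section, 'refs');

// Serialize only the modified element.
echo $dom->saveHTML($section), "\n";
```

Splitting into tokens rather than calling str_replace() on the raw attribute avoids damaging longer class names that merely contain refs. If the aim is to drop the whole references section rather than its class, $section->parentNode->removeChild($section) removes the element itself.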

There are several DOM-based libraries that can make HTML manipulation easier.

Upvotes: 2
