baumi_

Reputation: 7

How to find an h3 tag with a certain value

I have an HTML file with the following structure:

<h3>Heading 1</h3>
  <table>
   <!-- contains a <thead> and <tbody>, which also contain several columns/lines -->
  </table>
<h3>Heading 2</h3>
  <table>
   <!-- contains a <thead> and <tbody>, which also contain several columns/lines -->
  </table>

I want to get JUST the first table with all its content, so I load the HTML file:

<?php 
  $dom = new DOMDocument();
  libxml_use_internal_errors(true);
  $dom->loadHTML(file_get_contents('http://www.example.com'));
  libxml_clear_errors();
?>

All tables have the same classes and none of them has an ID. That's why the only way I could think of was to grab the h3 tag with the value "Heading 1" and take the table after it. I already found this one, which works well for me, but since other tables and captions could be added later, that solution is fragile.
How can I grab the h3 tag WITH the value "Heading 1", and how can I then select the table that follows it?

EDIT#1: I don't have access to the HTML File, so I can't edit it.
EDIT#2: My Solution (thanks to Martin Henriksen) for now is:

<?php
    $doc = new DOMDocument('1.0');
    libxml_use_internal_errors(true);
    $doc->loadHTML(file_get_contents('http://example.com'));
    libxml_clear_errors();
    foreach ($doc->getElementsByTagName('h3') as $element) {
      // Only process the heading whose text matches
      if ($element->nodeValue == 'exampleString') {
        // First nextSibling is the whitespace text node between </h3> and <table>
        $table = $element->nextSibling->nextSibling;
        $innerHTML = '';
        foreach ($table->childNodes as $child) {
          $innerHTML .= $child->ownerDocument->saveXML($child);
        }
        echo $innerHTML;
        file_put_contents("test.xml", $innerHTML);
      }
    }
  ?>

Upvotes: 0

Views: 1741

Answers (2)

Fateh Mic'son

Reputation: 21

You can find any tag in HTML using the simple_html_dom.php class. You can download the file from https://sourceforge.net/projects/simplehtmldom/?source=typ_redirect

Then:

<?php
include_once('simple_html_dom.php');

$htm  = "**YOUR HTML CODE**";
$html = str_get_html($htm);
// find() expects a CSS-style selector, so pass "h3" rather than "<h3>"
$h3_tag = $html->find("h3", 0)->innertext;
echo "HTML code in h3 tag: ";
print_r($h3_tag);
?>

Upvotes: 1

mrhn

Reputation: 18926

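You can fetch all the DOMElements with the tag h3 and check what value each one holds by accessing nodeValue. Once you have found the right h3 tag, you can select the next node in the DOM tree with nextSibling. Note that nextSibling can return a whitespace text node rather than the table itself, so you may need to step past it.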

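foreach ($dom->getElementsByTagName('h3') as $element) {
    if (trim($element->nodeValue) === 'Heading 1') {
        // nextSibling may be a whitespace text node, so advance to the next element
        $table = $element->nextSibling;
        while ($table !== null && $table->nodeType !== XML_ELEMENT_NODE) {
            $table = $table->nextSibling;
        }
        break;
    }
}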
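
The same lookup can probably also be done in one step with DOMXPath, which avoids walking siblings by hand. The following is only a minimal sketch: the inline sample markup and the variable names ($markup, $query) are assumptions that mirror the structure from the question, not the real page.

```php
<?php
// Sample markup mirroring the question's structure (assumed, not the real page).
$markup = <<<MARKUP
<h3>Heading 1</h3>
  <table><tbody><tr><td>A</td></tr></tbody></table>
<h3>Heading 2</h3>
  <table><tbody><tr><td>B</td></tr></tbody></table>
MARKUP;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate imperfect real-world HTML
$dom->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// First <table> sibling following the <h3> whose trimmed text is "Heading 1".
$query = '//h3[normalize-space(.) = "Heading 1"]/following-sibling::table[1]';
$table = $xpath->query($query)->item(0);

if ($table !== null) {
    // saveHTML($node) serializes the table including its own opening and closing tag.
    echo $dom->saveHTML($table);
}
```

Because following-sibling::table only considers element nodes, whitespace text nodes between the heading and the table do not matter here. If another element were ever inserted between the h3 and its table, this query would still return the first table that follows the heading.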

Upvotes: 0
