user5084957

Getting the title of a post

I am trying to get the title of a post using simple_html_dom. The HTML structure is shown below; the part I am trying to get is the link text, "This is our title".

<div id="content">
  <div id="section">
    <div id="sectionleft">
      <p>
        Latest News
      </p>
      <ul class="cont news">
        <li>
          <div style="padding: 1px;">
            <a href="http://www.example.com">
              <img src="http://www.example.com/our-image.png" width="128" height="96" alt="">
            </a>
          </div>
          <a href="http://www.example.com" class="name">
            This is our title 
            </a>
          <i class="info">added: Dec 16, 2015</i>
        </li>
      </ul>
    </div>
  </div>
</div>

Currently I have this:

$page = (isset($_GET['p'])&&$_GET['p']!=0) ? (int) $_GET['p'] : '';

$html = file_get_html('http://www.example.com/'.$page);

foreach($html->find('div#section ul.cont li div a') as $element)
{
    print '<br><br>';
    echo $url = 'http://www.example.com/'.$element->href;

    $html2 = file_get_html($url);

    print '<br>';

    $image = $html2->find('meta[property=og:image]',0);
    print $image = $image->content;

    print '<br>';

    $title = $html2->find('#sectionleft ul.cont news li a.name',0);
    print $title = $title->plaintext;

    print '<br>';
}

The issue is on this line: $title = $html2->find('#sectionleft ul.cont news li a.name',0); I assume I am using the wrong selector, but I am not sure what I am doing wrong.

Upvotes: 1

Views: 52

Answers (2)

Rounin

Reputation: 29453

Forgive me if this seems a little hacky, but you can always have PHP emit a quick bit of JavaScript:

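<?php

// Emit a script that reads the title in the browser and, on the first
// request only, reloads the page with the title in the query string.
echo '<script>';
echo 'var postTitle = document.querySelector("ul.cont.news a.name").textContent.trim();';
if (!isset($_GET['posttitle'])) {
    // Encode the title so spaces and special characters survive the URL.
    echo 'window.location.href = window.location.href + "?posttitle=" + encodeURIComponent(postTitle);';
}
echo '</script>';

// Null on the first request, before the reload has happened.
$postTitle = $_GET['posttitle'] ?? null;

?>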

Upvotes: 0

gen_Eric

Reputation: 227200

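ul.cont news means "find <news> elements that are descendants of ul.cont". The space is the descendant combinator, so news is read as a tag name rather than as a second class.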

You actually want:

#sectionleft ul.cont.news li a.name

EDIT: For some reason, it seems simple_html_dom doesn't like ul.cont.news even though it's a valid CSS selector.

You can try

#sectionleft ul[class="cont news"] li a.name

which should work as long as the class attribute is exactly "cont news", with the classes in that order.
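
If the class order cannot be relied on, one possible alternative is to skip simple_html_dom for this lookup and use PHP's built-in DOMDocument and DOMXPath instead. This is a swapped-in technique, not something the question uses. The sketch below is only an illustration: the helper names hasClass and findPostTitle are hypothetical, and the sample markup is trimmed from the question with the class order deliberately reversed to "news cont".

```php
<?php
// Hypothetical sample markup, trimmed from the question's HTML.
// The class order is reversed on purpose to show it does not matter.
$markup = <<<'HTML'
<div id="sectionleft">
  <ul class="news cont">
    <li>
      <a href="http://www.example.com" class="name">
        This is our title
      </a>
      <i class="info">added: Dec 16, 2015</i>
    </li>
  </ul>
</div>
HTML;

// Build an XPath test that matches one class token wherever it appears
// in the class attribute.
function hasClass(string $class): string
{
    return sprintf('contains(concat(" ", normalize-space(@class), " "), " %s ")', $class);
}

// Return the trimmed text of the first a.name inside
// #sectionleft ul.cont.news, or null when nothing matches.
function findPostTitle(string $markup): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//*[@id="sectionleft"]//ul[%s and %s]//li//a[%s]',
        hasClass('cont'),
        hasClass('news'),
        hasClass('name')
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

echo findPostTitle($markup), PHP_EOL;
```

In the question's loop, the string returned by file_get_contents($url) could be passed to findPostTitle in place of the sample markup, assuming the remote page has the same structure.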

Upvotes: 3
