Reputation: 1784
I'm trying to use simple_html_dom with PHP to parse a webpage that contains this tag:
<div class=" row result" id="p_a8a968e2788dad48" data-jk="a8a968e2788dad48" itemscope itemtype="http://schema.org/JobPosting" data-tn-component="organicJob">
Here data-tn-component="organicJob" is the attribute I want to select on, but I can't seem to write the selector in a way that simple_html_dom recognizes.
I've tried a few things along these lines:
<?PHP
include 'simple_html_dom.php';
$f="http://www.indeed.com/jobs?q=Electrician&l=maine";
$html->load_file($f);
foreach($html->find('div[data-tn-component="organicJob"]') as $div)
{
echo $div->innertext ;
}
?>
but the parser doesn't find any of the results, even though I know they are there. I'm probably not specifying what I want to find correctly. I've been looking at the API, but I still don't understand how to format the find string. What am I doing wrong?
Upvotes: 1
Views: 183
Reputation: 4202
Your selector is correct, but I see other problems in your code:
1) You are missing .php in your include:
include 'simple_html_dom';
It should be
include '/absolute_path/simple_html_dom.php';
2) To load content from a URL, use the file_get_html function instead of $html->load_file($f);
That call is wrong because $html has not been created yet, so PHP doesn't know that $html is a simple_html_dom object.
$html = file_get_html('http://www.google.com/');
// then only call
$html->find( ...
3) At the link you provided, http://www.indeed.com/jobs?q=Electrician+Helper&l=maine, there is no element with a data-tn-component attribute.
So the final code should be:
include '/absolute_path/simple_html_dom.php';
// file_get_html() fetches the page and returns a simple_html_dom object
$html = file_get_html('http://www.indeed.com/jobs?q=Electrician&l=maine');
foreach($html->find('div[data-tn-component="organicJob"]') as $div)
{
echo $div->innertext;
}
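If you want to check point 3 without depending on the live page, one option is to run the same attribute match against markup you have saved or pasted in. The sketch below is only an illustration: it uses PHP's built-in DOMDocument and DOMXPath rather than simple_html_dom, so it runs without the library. The function name countOrganicJobs and the second div in the sample string are made up for the example; the first div is the tag from the question.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): counts <div> elements
// whose data-tn-component attribute equals "organicJob" in an HTML string.
function countOrganicJobs(string $markup): int
{
    // Real-world pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($markup);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath counterpart of the CSS selector div[data-tn-component="organicJob"].
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[@data-tn-component="organicJob"]');

    return $nodes->length;
}

// Sample markup: the first div is the tag quoted in the question,
// the second is an invented div that should not match.
$sample = '<div class=" row result" id="p_a8a968e2788dad48" data-jk="a8a968e2788dad48" '
        . 'itemscope itemtype="http://schema.org/JobPosting" data-tn-component="organicJob">job</div>'
        . '<div class="row">other</div>';

echo countOrganicJobs($sample), "\n";
```

If this kind of check reports zero matches on the HTML your script actually downloads, the problem is the page content rather than the selector.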
Upvotes: 1