I thought I had this working some time ago, but I am finding that it does not always work properly, and I am trying to work out why.
$resolver = new URLResolver();
$resolve = $resolver->resolveURL($site);
$resolved = $resolve->getURL();
$parsing = file_get_contents_curl($resolved);
$doc = new DOMDocument();
@$doc->loadHTML($parsing);
$para = $doc->getElementsByTagName('p');
$firstparagraph = $para->item(0)->nodeValue;
echo $firstparagraph;
I would expect the above to return the content of the first <p> element. Usually this works, but not always.
Sometimes I instead get output such as:
string(5335) "HTTP/1.1 200 OK Content-Type: text/html; charset=UTF-8 Transfer-Encoding: chunked Connection: keep-alive Keep-Alive: timeout=15 Date: Sat, 26 Jan 2019 18:37:18 GMT Server: Apa..........
This specific output was returned for https://gener8ads.com/referral/?ref=test
When I get the above output, I find that if I change to item(1), it properly returns $firstparagraph.
I am wondering why this is happening, and whether I can write a proper check for when it does happen so that the correct first paragraph is returned. I realize that in this instance I could just check the output for HTTP/ and, if it is present, move on to item(1), but I don't know that this will solve the problem in every case.
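For reference, here is a minimal sketch of the kind of check I have in mind. It assumes the string returned by the fetch sometimes begins with one or more raw HTTP header blocks ahead of the HTML, which I have not confirmed is the real cause. The helper names stripLeadingHttpHeaders and firstParagraphText are hypothetical and are not part of my existing code.

```php
<?php
// Hypothetical helper: drops any raw HTTP header blocks that precede
// the HTML in a fetched response string (assumption: headers and body
// are separated by a blank line, as in a normal HTTP response).
function stripLeadingHttpHeaders(string $raw): string
{
    // Redirects can stack several header blocks, so loop until none remain.
    while (strncmp($raw, 'HTTP/', 5) === 0) {
        $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
        if (count($parts) < 2) {
            break; // no blank line found; leave the rest untouched
        }
        $raw = $parts[1];
    }
    return $raw;
}

// Hypothetical helper: returns the text of the first <p>, or null if none.
function firstParagraphText(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of silencing with @.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $first = $doc->getElementsByTagName('p')->item(0);
    return $first === null ? null : trim($first->nodeValue);
}

// Made-up sample response, not the real output from the site above.
$sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n"
        . "<html><body><p>Hello world</p></body></html>";

echo firstParagraphText(stripLeadingHttpHeaders($sample)), PHP_EOL; // prints: Hello world
```

In my code this would mean passing $parsing through stripLeadingHttpHeaders() before loadHTML(), rather than switching between item(0) and item(1). It only treats the symptom, though, and it would also strip a body that itself happens to start with the text HTTP/.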
So the question is: what is causing it to return this instead of the first <p>?