user1396530

Reputation: 19

PHP DOMDocument is not working well

I am trying to fetch the email addresses from a website using the following code, but it produces output like this:

%3e%61%2f%3c%6d%6f%63%2e%6c%69%61%6d%67%40%6f%63%73%6e%61%69%70%6d%61%20%3e%22%6d%6f%63%2e%6c%69%61%6d%67%40%6f%63%73%6e%61%69%70%6d%61%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%65%76%69%6c%40%34%34%34%73%71%20%3e%22%6d%6f%63%2e%65%76%69%6c%40%34%34%34%73%71%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6e%69%2e%70%6f%68%73%6c%61%74%69%67%69%64%6e%65%67%79%78%6f%40%6f%66%6e%69%20%3e%22%6e%69%2e%70%6f%68%73%6c%61%74%69%67%69%64%6e%65%67%79%78%6f%40%6f%66%6e%69%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%73%6e%61%72%6f%6f%6a%6e%61%6d%40%6f%66%6e%69%20%3e%22%6d%6f%63%2e%73%6e%61%72%6f%6f%6a%6e%61%6d%40%6f%66%6e%69%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%6c%69%61%6d%67%40%6c%65%74%6f%68%61%79%69%72%70%20%3e%22%6d%6f%63%2e%6c%69%61%6d%67%40%6c%65%74%6f%68%61%79%69%72%70%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%6c%69%61%6d%67%40%61%72%70%61%6e%6e%75%70%68%63%72%75%68%63%79%65%6e%6e%61%79%69%76%20%3e%22%6d%6f%63%2e%6c%69%61%6d%67%40%61%72%70%61%6e%6e%75%70%68%63%72%75%68%63%79%65%6e%6e%61%79%69%76%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%6c%69%61%6d%67%40%74%6e%65%6d%65%67%6e%61%6d%65%72%6f%63%6e%61%76%61%72%74%20%3e%22%6d%6f%63%2e%6c%69%61%6d%67%40%74%6e%65%6d%65%67%6e%61%6d%65%72%6f%63%6e%61%76%61%72%74%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%6b%75%65%67%6e%61%68%63%78%65%74%6e%65%6d%79%6f%6c%70%6d%65%65%68%74%40%68%73%65%72%75%73%20%3e%22%6d%6f%63%2e%6b%75%65%67%6e%61%68%63%78%65%74%6e%65%6d%79%6f%6c%70%6d%65%65%68%74%40%68%73%65%72%75%73%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c %3e%61%2f%3c%6d%6f%63%2e%6c%69%61%6d%67%40%68%70%6e%6f%6d%62%61%6a%75%73%20%3e%22%6d%6f%63%2e%6c%69%61%6d%67%40%68%70%6e%6f%6d%62%61%6a%75%73%3a%6f%74%6c%69%61%6d%22%3d%66%65%72%68%20%61%3c 

The code I am using is

<?php    
$html = new DOMDocument();
@$html->loadHtmlFile('http://www.quickerala.com/listings?searchString=&districtId=1&go=Go');
$xpath = new DOMXPath( $html );
$nodelist = $xpath->query( "//span[@class='listEmail']" );
foreach ($nodelist as $n){
    echo $n->nodeValue."\n";
}
?>

Upvotes: 1

Views: 69

Answers (1)

Ja͢ck

Reputation: 173522

Nope, that output is actually correct: DOMDocument is giving you what the page source contains. You have to urldecode it manually, I think.

Correction: you also have to strrev it.

It's probably spam protection; I managed to lift this nugget from their code:

$(".listEmail").each(function(index){
    // Percent-decode the span's content, then reverse it character by character
    var strEm = unescape($(this).html());
    var reversedStr = strEm.split("").reverse().join("");
    $(this).html(reversedStr);
});

So:

$value = strrev(urldecode($n->nodeValue));

That still returns a piece of HTML (an <a href="mailto:..."> link), so you have to parse that as well :)
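Putting the two steps together, here is a minimal sketch of how the decoded fragment could be parsed in turn. The function name decodeListEmail and the address user@example.com are made up for illustration, and the sample input is built by reversing and percent-encoding every byte, which is only my assumption about how the site produces its values:

```php
<?php
// Hypothetical helper (not from the site or any library): undo the
// obfuscation and pull the address out of the resulting <a> element.
function decodeListEmail(string $raw): ?string
{
    // Percent-decode, then reverse: the PHP counterpart of the site's
    // unescape() + split/reverse/join.
    $fragment = strrev(urldecode(trim($raw)));
    if ($fragment === '') {
        return null; // nothing to parse
    }

    // The decoded value is itself HTML, so parse it with a second DOMDocument.
    $doc = new DOMDocument();
    if (!@$doc->loadHTML($fragment)) {
        return null;
    }

    $links = $doc->getElementsByTagName('a');
    if ($links->length === 0) {
        return null; // no link in the fragment
    }

    // The link text carries a leading space in the question's output, so trim it.
    return trim($links->item(0)->textContent);
}

// Build a sample input (assumed scheme): reverse the markup, then
// percent-encode every byte.
$html = '<a href="mailto:user@example.com"> user@example.com</a>';
$sample = '';
foreach (str_split(strrev($html)) as $ch) {
    $sample .= '%' . bin2hex($ch);
}

echo decodeListEmail($sample), "\n"; // prints user@example.com
```

In the question's loop you would then call decodeListEmail($n->nodeValue) instead of echoing the node value directly. Reading the href attribute and stripping the mailto: prefix would work as well; the link text is just the shorter route.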

Upvotes: 2
