Adrian

Reputation: 2291

How to search a page's source code for a string with PHP?

I have tried

<?php
  $url = $_POST['attributename'];
  $needtofind = "did not match any documents.  </p>";
  $site = file_get_contents("https://www.google.com/#q=site:$url");
  if(strpos($site, $needtofind) == false) {
    echo 'indexed';
  } else {
    echo 'not indexed';
  } 
  ob_end_clean();
?>

HTML

<div class="center-page">
  <form method="POST">
    <textarea id="float" name="attributename" value=""></textarea><br/>
    <input type="submit" value="Go" />
  </form>
</div>

Both snippets are on the same page; I only separated them here for clarity.

The main problem is that by default it prints indexed on the screen. Whatever URL I type, it still says indexed. For example, if I type jhbsadhbahsd545.com in the textarea, it returns indexed when it should return not indexed. What have I done wrong?

Upvotes: 1

Views: 193

Answers (2)

Grey Perez

Reputation: 20438

You cannot scrape content from Google that way; they actually prohibit it. You'll need to use their API to do what you need.

https://developers.google.com/custom-search/json-api/v1/overview
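A rough sketch of what that could look like, assuming the JSON API's `customsearch/v1` endpoint with its `key`, `cx` and `q` parameters and a `searchInformation.totalResults` field in the response. The function names `buildApiUrl` and `hasResults` are made up for illustration, and the key and engine ID are placeholders you would have to supply yourself:

```php
<?php
// Hypothetical helper: builds the request URL for the Custom Search JSON API.
// $apiKey and $engineId are placeholders for your own credentials.
function buildApiUrl(string $apiKey, string $engineId, string $domain): string
{
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query([
        'key' => $apiKey,
        'cx'  => $engineId,
        'q'   => 'site:' . $domain,
    ]);
}

// Hypothetical helper: decides from the raw JSON body whether anything matched.
// Assumes totalResults arrives as a numeric string; a missing field counts as 0.
function hasResults(string $json): bool
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        return false; // not valid JSON, treat as "no results"
    }
    return (int) ($data['searchInformation']['totalResults'] ?? 0) > 0;
}

// Usage (needs real credentials and network access):
// $json = file_get_contents(buildApiUrl('YOUR_KEY', 'YOUR_CX', 'example.com'));
// echo hasResults($json) ? 'indexed' : 'not indexed';
```

Keeping the parsing separate from the HTTP call makes it easy to try against a saved response before spending API quota.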

Upvotes: 1

bagonyi

Reputation: 3328

strpos can return 0, which is a falsy value, so compare against false with the strict operator ===:

strpos($site, $needtofind) === false
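To illustrate the difference, here is a small standalone sketch. The helper name `containsText` is only for this example and is not part of the question's code:

```php
<?php
// Hypothetical helper wrapping the strict check.
function containsText(string $haystack, string $needle): bool
{
    // strpos() returns an int offset (possibly 0) or false when the needle is absent.
    return strpos($haystack, $needle) !== false;
}

// A match at offset 0 loosely equals false, so == reports "not found" by mistake.
var_dump(strpos('abc', 'a') == false);  // bool(true)
// The strict comparison distinguishes int 0 from bool false.
var_dump(strpos('abc', 'a') === false); // bool(false)

var_dump(containsText('abc', 'a'));     // bool(true)
var_dump(containsText('abc', 'z'));     // bool(false)
```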

However, I believe this still won't work: Google does not include the string you are looking for in its initial response. It is loaded lazily with JavaScript after the page has loaded.

Open Chrome and go to view-source:https://www.google.com/#q=site:hopefullythisisadomainthatdoesnotexists.com to see what Google actually returns and why the string is always missing.


Also change the URL you are making the request to from:

https://www.google.com/#q=site:$url

to:

https://www.google.com/search?q=site:$url
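The reason this matters is that everything after # is a fragment, which is never sent to the server, so the original request only fetched the plain Google homepage. A minimal sketch of building the second form safely from user input follows; `buildSiteQueryUrl` is a made-up name, and the trimming is an assumption about what a textarea might submit:

```php
<?php
// Hypothetical helper: builds the /search URL for a site: query.
function buildSiteQueryUrl(string $domain): string
{
    // trim() drops stray whitespace and newlines from the textarea;
    // urlencode() escapes the colon and any other reserved characters.
    return 'https://www.google.com/search?q=' . urlencode('site:' . trim($domain));
}

echo buildSiteQueryUrl(" example.com \n"), PHP_EOL;
// https://www.google.com/search?q=site%3Aexample.com
```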

Upvotes: 2
