Reputation: 117

How to stop Google from crawling nonexistent pages

While developing my site, I made a typo in one place. For example, all my pages are at dir1/dir2/page.htm/par1-par2, but the typo produced dir1/dir2/page/par1-par2 (note: without .htm).

It was in production for only one day, but Google keeps crawling those links. How can I stop Google from doing that?

By the way, it's not one page but hundreds or thousands of pages.

Upvotes: 0

Answers (3)

GTSouza

Reputation: 365

Try using robots.txt to deny crawlers access to this page (URL).

http://www.robotstxt.org/robotstxt.html

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

Test your robots.txt here: http://www.frobee.com/robots-txt-check/

Patterns must begin with / because robots.txt patterns always match absolute URL paths.
* matches zero or more of any character.
$ at the end of a pattern matches the end of the URL; elsewhere $ matches itself.
* at the end of a pattern is redundant, because robots.txt patterns always match any URL that begins with the pattern.
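
The matching rules above can be sketched in Python. This is only an illustration of the four rules as stated here, not Google's actual matcher; the function names robots_pattern_to_regex and is_blocked are made up for this example, and the paths are the placeholder paths from the question.

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex (illustrative only)."""
    if not pattern.startswith("/"):
        raise ValueError("pattern must begin with '/'")
    # A trailing '$' anchors the match at the end of the URL path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # '*' matches zero or more characters; every other character,
    # including a '$' that is not last, is matched literally.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_blocked(pattern: str, path: str) -> bool:
    # re.match only anchors at the start, so an unanchored pattern
    # behaves as a prefix match, which is why a trailing '*' is redundant.
    return robots_pattern_to_regex(pattern).match(path) is not None
```

With these rules, a single prefix pattern such as /dir1/dir2/page/ covers every mistyped URL under it, while the real /dir1/dir2/page.htm/... URLs stay crawlable.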

Upvotes: 2

Lawrence Cherone

Reputation: 46602

If the URL still resolves (perhaps because you're using mod_rewrite) and renders a custom "page not found" page without sending an HTTP 410 Gone header, e.g. header("HTTP/1.0 410 Gone");, then Google won't know the page has been removed and will index it just the same.

You need to send the proper headers, remove the page, or stop rendering your own 404 so that the request hits your server's 404. Google will then remove the page from the index, although the removal won't happen overnight.

You could also add the URL to a robots.txt file, although this is not guaranteed to remove the page from the index. You could contact Google, as others have said, but a response or removal is not guaranteed either.
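
As a rough sketch of the "send the proper header" idea, here is a minimal WSGI app in Python rather than PHP. It assumes the mistyped URLs all start with /dir1/dir2/page/ (the placeholder paths from the question); the names MISTYPED, app and status_for are hypothetical and only for illustration.

```python
import re

# Assumption: mistyped URLs look like /dir1/dir2/page/...,
# while real pages live under /dir1/dir2/page.htm/...
MISTYPED = re.compile(r"^/dir1/dir2/page/")

def app(environ, start_response):
    """Minimal WSGI app: answer 410 Gone for the mistyped URLs, 200 otherwise."""
    path = environ.get("PATH_INFO", "")
    if MISTYPED.match(path):
        status, body = "410 Gone", b"Gone"
    else:
        status, body = "200 OK", b"OK"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]

def status_for(path):
    """Call the app directly and return the status line it sends."""
    seen = []
    app({"PATH_INFO": path}, lambda status, headers: seen.append(status))
    return seen[0]
```

For a quick local check, the app can be served with make_server from the standard wsgiref.simple_server module. The point is only that the status line itself says 410, not a 200 carrying a "not found" page.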

User-agent: *
Disallow: /dir1/dir2/page/par1-par2
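
Since the question mentions thousands of affected URLs, a broader prefix rule may be more practical than listing each one. The sketch below checks such a rule with Python's standard urllib.robotparser; the rule Disallow: /dir1/dir2/page/ and the helper name allowed are assumptions for illustration, not part of the original answer, and this parser only does plain prefix matching.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: block everything under the mistyped prefix.
rules = """\
User-agent: *
Disallow: /dir1/dir2/page/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

def allowed(path):
    """Return True if a crawler identifying as Googlebot may fetch the path."""
    return parser.can_fetch("Googlebot", path)
```

Here allowed("/dir1/dir2/page/par1-par2") is False, while the real URL /dir1/dir2/page.htm/par1-par2 remains allowed.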

Good luck.

Upvotes: 1

Mike Fulton

Reputation: 918

Google has a form where you can ask it to remove a page from its index.

Check out the info at this link:

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=164734

Upvotes: -1
