Sj03rs

Reputation: 937

Someone is requesting robots.txt on our site

A few weeks ago, we noticed someone requesting the robots.txt URL on our site: http://www.ourdomain.com/robots.txt
I did some research, and it seems that robots.txt sets the permissions for search engines? I'm not certain of that...
The reason I'm asking is that he tried to access that file again today...
The thing is that we do not have this file on our website... So why is someone trying to access it? Is it dangerous? Should we be worried?
We traced the IP address and it is located in Texas, whereas a few weeks ago it was in Venezuela... Is he using a VPN? Is this a bot?

Can someone explain what this file does and why he is trying to access it?

Upvotes: 0

Views: 107

Answers (1)

unor

Reputation: 96607

In a robots.txt (a simple text file) you can specify which URLs of your site should not be crawled by bots (like search engine crawlers).

The location of this file is fixed so that bots always know where to find the rules: the file named robots.txt has to be placed in the document root of your host. For example, when your site is http://example.com/blog, the robots.txt must be accessible from http://example.com/robots.txt.
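As a rough illustration of that fixed location, here is a minimal Python sketch that derives the robots.txt URL from any page URL on the same host. The helper name `robots_txt_url` is made up for this example; it only uses the standard `urllib.parse` module.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(site_url: str) -> str:
    """Return the robots.txt URL for the host that serves site_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    and replaces the path with /robots.txt, dropping query and fragment.
    """
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://example.com/blog"))
```

Whatever path the page has, the result always points at the document root of the same host, which is where bots look.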

Polite bots will always check this file before trying to access your pages; impolite bots will ignore it.

If you don’t provide a robots.txt, every request for it returns a 404, and polite bots assume that they are allowed to crawl everything. To get rid of the 404s, use this robots.txt (which says the same: all bots are allowed to crawl everything):

User-agent: *
Disallow:
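To see how a polite bot might interpret these rules, here is a small sketch using Python's standard `urllib.robotparser`. The function `can_crawl`, the bot name `ExampleBot`, and the second rule set with `/private/` are assumptions made up for this example; real crawlers may implement the check differently.

```python
from urllib.robotparser import RobotFileParser

# The allow-everything rules from the answer above.
ALLOW_ALL = """User-agent: *
Disallow:
"""

# A made-up rule set for comparison: block one directory for all bots.
BLOCK_PRIVATE = """User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt content and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl(ALLOW_ALL, "ExampleBot", "http://example.com/blog"))
print(can_crawl(BLOCK_PRIVATE, "ExampleBot", "http://example.com/private/x"))
```

An empty `Disallow:` line blocks nothing, so the first check allows the URL, while the second rule set refuses anything under `/private/`. Impolite bots simply skip this check.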

Upvotes: 1
