Googlebot

Reputation: 15683

How to make a private URL?

I want to create a private URL such as:

http://domain.com/content.php?secret_token=XXXXX

Then, only visitors who have the exact URL (e.g. received by email) can see the page. We check the $_GET['secret_token'] before displaying the content.

My problem is that if search bots find the URL by any chance, they will simply index it and the URL will become public. Is there a practical method to avoid bot visits and subsequent indexing?

Possible But Unfavorable Methods:

  1. Login system (e.g. by php session): But I do not want to offer user login.

  2. Password-protected folder: The problem is as above.

  3. Using robots.txt: Many search engine bots do not respect it.

Upvotes: 12

Views: 15898

Answers (6)

Tim

Reputation: 4099

Leaving the link unpublished will be ok in most circumstances...

...However, I will warn you that the prevalence of browser toolbars (Google and Yahoo come to mind) changes the game. One company I worked for had pages from their intranet indexed in Google. You could search for the page, and a few results came up, but you couldn't access them unless you were inside our firewall or VPN'd in.

We figured the only way those links got propagated to Google had to be through the toolbar. (If anyone else has a better explanation, I'd love to hear it...) I've been out of that company a while now, so I don't know if they ever figured out definitively what happened there.

I know, strange but true...

Upvotes: 1

Try generating a 5-6 character alphanumeric password and sending it along with the email, so even though robots may spider the URL, they need the password to access the page. (Just an extra safety measure.)

Upvotes: 3

CrazyDart

Reputation: 3801

What you are talking about is security through obscurity. It's never a good idea. If you must, I would offer these thoughts:

  • Make the link expire
  • Lock the link to the C or D class of IPs that it was accessed from the first time
  • Have the page challenge the user with something like a logic question before forwarding to the real page with a time-sensitive token (a two-step process), and if the challenge fails, send a 404 back so the crawler stops.
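
To illustrate the "make the link expire" idea, here is a minimal sketch in Python (not the asker's PHP). It assumes the expiry time is embedded in the token and signed with an HMAC; `SECRET_KEY`, `make_token`, and `is_valid` are hypothetical names chosen for this example, not part of any existing code.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret; in practice load it from configuration.
SECRET_KEY = b"change-me"


def make_token(page_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return '<expiry>.<signature>' for a link valid for ttl_seconds."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    message = f"{page_id}:{expires}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{expires}.{signature}"


def is_valid(page_id: str, token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    current = time.time() if now is None else now
    expires_text, _, signature = token.partition(".")
    if not (expires_text.isascii() and expires_text.isdigit()):
        return False
    message = f"{page_id}:{expires_text}".encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    return current < int(expires_text)
```

Because the expiry is part of the signed message, a visitor cannot extend the lifetime by editing the number in the URL; the signature would no longer match.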

Upvotes: 7

zzzzBov

Reputation: 179206

You only need to tell the search engines not to index /content.php, and search engines that honor robots.txt won't index any pages that start with /content.php.
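
As a rough illustration, the rule could look like the two-line robots.txt embedded below. The Python snippet (not the asker's PHP) uses the standard library's `urllib.robotparser` to check that the rule is a prefix match, so it also covers URLs carrying a `secret_token` query string. The `index.html` URL is only an assumed example of an unrelated page.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content a site would serve at /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /content.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Disallow rules match by prefix, so the query string is covered as well.
private_allowed = parser.can_fetch(
    "*", "http://domain.com/content.php?secret_token=XXXXX"
)
other_allowed = parser.can_fetch("*", "http://domain.com/index.html")
```

This only affects crawlers that choose to honor robots.txt, which is the limitation the question already points out.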

Upvotes: 1

Lg102

Reputation: 4908

As long as you don't link to it, no spider will pick it up. And, since you don't want any password protection, the link is going to work for everyone. Consider disabling the secret key after it is used.
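
A minimal sketch of disabling the key after use, written in Python rather than PHP. It assumes tokens are kept in an in-memory set purely for illustration (a real site would more likely use a database table); `issue_token` and `redeem_token` are hypothetical helper names.

```python
import secrets

# In-memory store for illustration only; it is lost when the process restarts.
_unused_tokens = set()


def issue_token() -> str:
    """Create a random, hard-to-guess token and remember it as unused."""
    token = secrets.token_urlsafe(32)
    _unused_tokens.add(token)
    return token


def redeem_token(token: str) -> bool:
    """Return True the first time a known token is presented, False afterwards."""
    if token in _unused_tokens:
        _unused_tokens.discard(token)
        return True
    return False
```

One trade-off: the recipient cannot reopen the page from the same email, and anything that fetches the link first (a link-preview service, for instance) would consume it.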

Upvotes: 1

Eugen Rieck

Reputation: 65314

  • If there is no link to it (and the folder has no index view), the robot won't find it
  • You could return a 404 if the token is wrong: this way, a robot (and anyone else who doesn't have the token) will think there is no such page
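
The second point might be sketched as follows, in Python rather than PHP and without any web framework. `EXPECTED_TOKEN` and `handle_request` are hypothetical names, and the query parameters are assumed to arrive as a plain dictionary (the counterpart of `$_GET`).

```python
import hmac

# Hypothetical secret token; in practice this would be generated per recipient.
EXPECTED_TOKEN = "s3cr3t-example-token"


def handle_request(query: dict) -> tuple:
    """Return (status_code, body); a missing or wrong token looks like a missing page."""
    supplied = query.get("secret_token", "")
    # Constant-time comparison so response timing does not reveal partial matches.
    if hmac.compare_digest(supplied.encode(), EXPECTED_TOKEN.encode()):
        return 200, "private content"
    return 404, "Not Found"
```

Answering with 404 rather than 403 means a request without the token is indistinguishable from a request for a page that does not exist.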

Upvotes: 1
