Reputation: 3034
So I have a web app where both users and non-users can submit links to content. My web app is a site where users receive donations for their content, so I make sure that users verify they own a website before receiving donations.
So when users submit a link from their own site, they get a form with a select field of all the sites they have registered and verified. If users are submitting links for other people's sites, they get a form with just a text field for an :unclaimed_domain (which is not in the db but an attr_accessor that is used as part of a find_or_create method).
...I hope that's enough background info.
I'm trying to implement a feature that keeps users from submitting links that turn out to be 404 errors.
This is my current, unsatisfactory validation (it never passes):
require 'net/http'
require 'nokogiri'
require 'open-uri'

def should_not_have_a_404_link
  if unclaimed_domain != nil || unclaimed_domain != "" # attr_accessor field
    domain = unclaimed_domain
  else
    domain = Website.find_by_website_id(self.website_id).domain # select field
  end
  response = Net::HTTP.start(domain, 80) { |http| http.head("/self.link") }
  if response.code == "404"
    errors.add(:link, "404 Error")
  end
end
Another method I tried was this:
def should_not_have_a_404_link
  if unclaimed_domain != nil || unclaimed_domain != "" # attr_accessor field
    domain = unclaimed_domain
  else
    domain = Website.find_by_website_id(self.website_id).domain # select field
  end
  begin
    Nokogiri::HTML(open("http://#{domain}/self.link"))
  rescue
    errors.add(:link, "404 error")
  end
end
This works when a brand new site is submitted, but doesn't work for an existing site.
I also have a callback called :make_full_link, which could be part of the problem:
before_create :make_full_link
before_update :make_full_link

def make_full_link
  website = Website.find(website_id)
  unless self.link.index("http://#{website.domain}")
    old_link = self.link
    self.link = "http://#{website.domain}/#{old_link}"
  end
end
Any ideas?
Upvotes: 2
Views: 490
Reputation: 104050
I imagine I could lock up your Rails processes (all the dynamic content on your site) by hosting my own super-slow HTTP server that sends one or two characters every few minutes and submitting three or four or ten requests. It'd tie up all your queue processors, and no more dynamic content could be served.
I suggest adding a timeout to your requests (if you really want to perform online validity checking) or just moving the validity checking to an offline task run out of cron or on-demand or something.
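The timeout suggestion can be sketched with Net::HTTP's own open_timeout and read_timeout settings (the method name, the 5-second values, and the rescue behavior here are illustrative, not from the question):

```ruby
require 'net/http'

# Sketch: a HEAD request guarded by connect/read timeouts, so a deliberately
# slow server can't tie up a Rails worker indefinitely.
# The method name and 5-second timeouts are illustrative choices.
def head_status(domain, path)
  http = Net::HTTP.new(domain, 80)
  http.open_timeout = 5 # max seconds to establish the TCP connection
  http.read_timeout = 5 # max seconds to wait for each read
  http.start { |conn| conn.head(path).code }
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, Errno::ECONNREFUSED
  nil # unreachable or too-slow host; decide how your validation treats this
end
```

A nil return can then be reported as "could not reach host" rather than being conflated with a 404.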
Further, you're just checking that you're not getting a 200 response code, but e.g. 301, 302 and 307 are fine; they just require more processing on the part of clients.
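One way to treat both 2xx and 3xx responses as acceptable is to branch on Net::HTTP's response class hierarchy instead of comparing code strings — a sketch, with an illustrative helper name:

```ruby
require 'net/http'

# Sketch: Net::HTTP wraps responses in subclasses such as Net::HTTPSuccess
# (2xx) and Net::HTTPRedirection (3xx), so both families can be accepted
# with one check instead of string-comparing individual codes.
def acceptable_response?(response)
  response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
end
```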
Rather than adding "404 error" to your errors, I suggest adding the actual response code. That'd be way more useful if, e.g., access controls have been put in place to allow per-netblock access and your host is receiving 403 responses, but you're reporting to the user that it's getting 404 responses. These are very different things and should be reported correctly. (Have you ever seen the useless IE error message "sorry, couldn't connect to the remote host, maybe it is offline, maybe it is firewalled, maybe the DNS doesn't exist, maybe you aren't connected to the internet any more"? I hate it; it gives no details on how to fix the situation.)
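Reporting the real code could look like this small helper (the name, message wording, and the choice to accept everything in 200-399 are illustrative):

```ruby
# Sketch: build a validation message carrying the actual HTTP status code,
# instead of a blanket "404 error". Codes 200-399 are treated as fine here.
# The helper name and message wording are illustrative.
def link_error_message(code)
  return nil if code.to_i.between?(200, 399)
  "link returned HTTP #{code}"
end
```

In the validation, that becomes something like `errors.add(:link, link_error_message(response.code))` whenever the message is non-nil.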
Upvotes: 2