kush

Reputation: 16928

How to check in the model/controller if form input contains HTML (spam)? Regex?

My app is getting about 1000 spam entries a day. The content is not linked anywhere, so the bot's effort is completely useless.

But it's messing with our metrics and generating tons of encoding errors (the bot is submitting Chinese characters).

The fields are simple text boxes and input fields.

What I'd like to do is ban any user who enters HTML into a field and submits it.

I can handle the banning aspect (log them out, put a boolean in the users table) easily.

But I'm not sure how to check whether the params contain HTML, or where the cleanest place to check for this is (before filter? model validation?).

Upvotes: 1

Views: 404

Answers (2)

Tilo

Reputation: 33732

You can always use a regexp to filter HTML out of the input, e.g. delete anything between < and >.

input_string.gsub(/<.*?>/m, '')   # lazy match so each tag is removed on its own; multi-line mode lets a tag span line breaks

Or, if you only want to detect whether there was HTML in the input, check whether this matches:

input_string =~ /<.*>/m
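As a rough sketch of both operations side by side (the helper names `contains_html?` and `strip_html` are made up for illustration, not part of Rails), something like the following might do. It uses a lazy quantifier so that text sitting between two separate tags is not swallowed. Note that a pattern this simple will also flag harmless input such as "1 < 2 and 3 > 2".

```ruby
# Lazy quantifier (.*?) so each tag is matched on its own;
# the /m flag lets "." match newlines, so tags spanning lines are caught too.
TAG_PATTERN = /<.*?>/m

# Hypothetical helper: true when the string contains something tag-like.
def contains_html?(input_string)
  TAG_PATTERN.match?(input_string)
end

# Hypothetical helper: returns a copy with tag-like spans removed.
def strip_html(input_string)
  input_string.gsub(TAG_PATTERN, '')
end

puts contains_html?("hello <b>world</b>")    # true
puts contains_html?("plain text")            # false
puts strip_html("hello <b>world</b>")        # hello world
puts strip_html("a <a\nhref='x'>link</a>")   # a link
```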

You could put this in the controller, so it cleans up the input right after it is posted, or you could put it in a validation, so the record fails on save. It's probably better in the controller.
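For the controller route, here is a minimal plain-Ruby sketch of the check itself. The name `fields_with_html` and the flat hash of submitted values are assumptions for illustration; real Rails params are often nested and would need extra handling.

```ruby
HTML_TAG = /<.*?>/m

# Hypothetical helper: returns the names of submitted fields whose values look like HTML.
# `params` is assumed to be a flat Hash of field name => submitted string.
def fields_with_html(params)
  params.select { |_key, value| value.is_a?(String) && HTML_TAG.match?(value) }.keys
end

submitted = {
  "title" => "Hi there",
  "body"  => "<a href='http://spam.example'>buy now</a>"
}

offending = fields_with_html(submitted)
puts offending.inspect   # ["body"]
```

A before filter (`before_filter` in older Rails, `before_action` in Rails 4 and later) could call a helper like this and flag the user whenever the returned list is not empty.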

But this will only get you so far: the bots may still keep posting forms, which uses up resources on your end.

That's why I'd also recommend you use Google's reCAPTCHA, which is really easy to add to Rails.

With the CAPTCHA in place, it becomes much harder for anything other than a human to post to your site.

http://www.google.com/recaptcha

You can look at some example code for how to integrate ReCaptcha into a Rails project here:

https://github.com/tilo/mail_form_example_with_recaptcha

Upvotes: 1

Dave Newton

Reputation: 160201

If there's no reason for a < or &lt; just check for angle brackets. If there is a legitimate reason, it'll be a bit more irritating, but probably still limited to checking for angle brackets and any of whatever tags you're trying to avoid.
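A minimal sketch of the plain angle-bracket check, assuming the fields never legitimately contain them. The helper name `angle_brackets?` is hypothetical, and also matching `&gt;` alongside `&lt;` is an assumption added here.

```ruby
# Matches a literal angle bracket or its entity-encoded form (&lt; / &gt;), case-insensitively.
ANGLE_BRACKETS = /[<>]|&lt;|&gt;/i

# Hypothetical helper: true when the text contains any angle bracket, raw or entity-encoded.
def angle_brackets?(text)
  ANGLE_BRACKETS.match?(text)
end

puts angle_brackets?("just a comment")                    # false
puts angle_brackets?("<script>alert(1)</script>")         # true
puts angle_brackets?("&lt;a href=x&gt;click&lt;/a&gt;")   # true
```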

Upvotes: 0
