gtd

Reputation: 17246

What is a good open source package for building flexible spam detection on a large Rails site?

My site is getting larger and it's starting to attract a lot of spam through various channels. The site has many different types of UGC (profiles, forums, blog comments, status updates, private messages, etc.). I have various mitigation efforts underway, which I hope to deploy in a blitzkrieg fashion to convince the spammers that we're not a worthwhile target. I have high confidence in what I'm doing functionality-wise, but one missing piece is killing all the old spam at once.

Here's what I have:

My requirements:

  1. I want it to perform reasonably well given the volume of data (so I'm wary of a pure Ruby solution).
  2. I should be able to train separate classifications for different types of content (e.g. 419 scams vs. botnet link spam).
  3. I would like to be able to add manual factors based on our own detective work (pattern matching, IP reuse, etc.).
  4. Ultimately I want to construct a nice interface to be used from Ruby. If this requires getting my hands dirty in C or whatever, I can handle it, but I'll avoid it if I can.
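
To make requirements 2 through 4 concrete, here is a rough sketch of the kind of Ruby interface I have in mind: several trainable categories plus hand-written rules that nudge the score. Everything in it (`SpamScorer`, `train`, `add_rule`, `classify`, the `:ip_reuse_count` key) is a hypothetical name made up for illustration, not an existing package, and the naive Bayes scoring is pure Ruby only to keep the example short. A real backend would presumably sit behind the same interface.

```ruby
# Hypothetical interface sketch: multi-category naive Bayes plus manual rules.
# None of these names come from an existing gem.
class SpamScorer
  def initialize
    @word_counts = Hash.new { |h, k| h[k] = Hash.new(0) } # category => word => count
    @totals = Hash.new(0) # category => total tokens seen in training
    @docs = Hash.new(0)   # category => number of training documents
    @rules = []           # manual factors: [category, log-weight, predicate]
  end

  # Requirement 2: train any number of named categories.
  def train(category, text)
    tokenize(text).each do |word|
      @word_counts[category][word] += 1
      @totals[category] += 1
    end
    @docs[category] += 1
  end

  # Requirement 3: a manual factor adds `weight` to the log score of
  # `category` whenever the block returns true for (text, meta).
  def add_rule(category, weight, &predicate)
    @rules << [category, weight, predicate]
  end

  # Returns the highest-scoring category for the given text and metadata.
  def classify(text, meta = {})
    scores(text, meta).max_by { |_category, score| score }.first
  end

  # Log-probability style score per trained category (Laplace smoothing).
  def scores(text, meta = {})
    vocab_size = @word_counts.values.flat_map(&:keys).uniq.size
    total_docs = @docs.values.sum.to_f
    tokens = tokenize(text)
    @docs.keys.to_h do |category|
      score = Math.log(@docs[category] / total_docs)
      tokens.each do |word|
        score += Math.log((@word_counts[category][word] + 1.0) /
                          (@totals[category] + vocab_size))
      end
      @rules.each do |rule_category, weight, predicate|
        score += weight if rule_category == category && predicate.call(text, meta)
      end
      [category, score]
    end
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end
end

scorer = SpamScorer.new
scorer.train(:ham, "thanks for the great post about rails testing")
scorer.train(:scam_419, "dear friend i am a prince and need your bank account to transfer funds")
scorer.train(:link_spam, "cheap pills buy now click here cheap watches")
# Manual factor: heavy IP reuse pushes content toward :link_spam.
scorer.add_rule(:link_spam, 5.0) { |_text, meta| meta[:ip_reuse_count].to_i > 10 }
puts scorer.classify("buy cheap pills now")
```

The rule weights are added in log space, so a weight of 5.0 is a strong push and would need tuning against real data.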

I realize this is a long and vague question, but what I'm looking for primarily is just a list of good packages, and secondarily any thoughts on how to approach this from someone who has built a similar system.

Upvotes: 6

Views: 525

Answers (1)

Mori

Reputation: 27789

We looked for an acceptable open source solution and didn't find one.

If you come to the same conclusion and decide to consider proprietary anti-spam, check out Akismet, a paid collaborative spam filtering service. We've had decent performance from it across a dozen medium-sized sites. It integrates with Rails through Rack and rackismet.
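
For cleaning up old content outside the request cycle, you can also call Akismet's comment-check REST endpoint directly instead of going through the middleware. The following is a minimal sketch under a few assumptions: the helper names `akismet_params` and `akismet_spam?` are made up for this example, the key-as-subdomain URL form is used, and error handling and rate limiting are left out. It does not show rackismet's own API.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds the form fields for a comment-check call.
# Akismet requires `blog` and `user_ip`; the other fields improve accuracy.
def akismet_params(site_url:, user_ip:, user_agent:, content:, comment_type: "comment")
  {
    "blog" => site_url,
    "user_ip" => user_ip,
    "user_agent" => user_agent,
    "comment_type" => comment_type,
    "comment_content" => content
  }
end

# Hypothetical helper: posts the fields and interprets the response.
# The endpoint answers with the plain-text body "true" (spam) or "false".
def akismet_spam?(api_key, params)
  uri = URI("https://#{api_key}.rest.akismet.com/1.1/comment-check")
  response = Net::HTTP.post_form(uri, params)
  response.body.strip == "true"
end
```

A backfill job could then loop over stored records, build the fields from each record's saved IP and user agent, and flag anything for which `akismet_spam?` returns true.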

Upvotes: 5
