Shabbir Hussain

Reputation: 2720

Can a Bayesian network detect spam without a spam training set?

Hi, I have a conceptual question about a system I'm developing to classify emails. I have a large set (>100k) of messages that are known not to be spam, and a large set of unclassified messages. Is it possible to use some method (perhaps Bayesian) to detect spam without having a data set of spam? Or do I absolutely need labeled spam examples?

Upvotes: 1

Views: 175

Answers (1)

Josef Borkovec

Reputation: 1079

Yes, you can do that, although the results will most likely be worse than with a supervised method trained on both classes. The general problem is often referred to as anomaly detection (also called novelty detection or one-class classification). The idea is to build a model of the data you do have, here the non-spam messages, and for each new instance decide whether it plausibly comes from that model or not. There are many methods for this, and choosing the right one is difficult. You can start studying here.
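To make the idea concrete, here is a minimal sketch in Python using only the standard library. It is not a Bayesian network; it swaps in a much simpler smoothed unigram word model fitted on non-spam messages only, and flags a message as anomalous when its average per-word log-probability falls below a cutoff. The names `HamModel`, `fit_threshold`, and `is_anomalous`, the tokenizer, and the 5% quantile are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class HamModel:
    """Smoothed unigram model fitted on non-spam messages only (illustrative)."""

    def __init__(self, ham_messages, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing strength
        self.counts = Counter()
        for msg in ham_messages:
            self.counts.update(tokenize(msg))
        self.total = sum(self.counts.values())
        # The extra slot reserves probability mass for words never seen in training.
        self.vocab_size = len(self.counts) + 1

    def score(self, text):
        """Average per-token log-probability; lower means less like the training mail."""
        tokens = tokenize(text)
        if not tokens:
            return 0.0  # assumption: an empty message is never flagged
        denom = self.total + self.alpha * self.vocab_size
        log_prob = sum(
            math.log((self.counts[t] + self.alpha) / denom) for t in tokens
        )
        return log_prob / len(tokens)


def fit_threshold(model, ham_messages, quantile=0.05):
    """Pick the score below which roughly `quantile` of known ham would fall."""
    scores = sorted(model.score(m) for m in ham_messages)
    return scores[int(quantile * (len(scores) - 1))]


def is_anomalous(model, threshold, text):
    """True if the message scores below the cutoff, i.e. looks unlike known ham."""
    return model.score(text) < threshold
```

In practice the cutoff should be chosen on held-out non-spam mail rather than on the messages used to fit the model, and the quantile directly sets how much legitimate mail you are willing to flag. A flagged message is only "unlike the mail seen so far", which is not the same thing as spam, so this is a starting point for triage rather than a replacement for a supervised classifier.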

Upvotes: 1
